>when we get a prompt working reliably on one model, we often have trouble porting it to another LLM
I saw a study where a prompt massively boosted one model's performance on a task while significantly reducing another popular model's performance on the same task.