Maybe their methodology worked at the start but has since stopped working. I ass...

		hellojesus 38 days ago \| parent \| context \| favorite \| on: Adversarial poetry as a universal single-turn jail... Maybe their methodology worked at the start but has since stopped working. I assume model outputs are passed through another model that classifies a prompt as a successful jailbreak so that guardrails can be enhanced.