Anyone else feel like this is a best case application for LLMs?
You could in theory automate the entire process, treat the LLM as a very advanced fuzzer. Run it against your target in one or more VMs. If the VM crashes or otherwise exhibits anomalous behavior, you've found something. (Most exploits like this will crash the machine initially, before you refine them.)
On one hand: great application for LLMs.
On the other hand: conversely implies that demonstrating this doesn't mean that much.
I think it was more a PoC. I would be more impressed if it was deployed in production. "we want to reiterate that these are highly experimental results". If the dividends are massive, would they not deploy it in production and tell the world about it?
You could in theory automate the entire process, treat the LLM as a very advanced fuzzer. Run it against your target in one or more VMs. If the VM crashes or otherwise exhibits anomalous behavior, you've found something. (Most exploits like this will crash the machine initially, before you refine them.)
On one hand: great application for LLMs.
On the other hand: conversely implies that demonstrating this doesn't mean that much.