
> But no amount of text appended to an input document, no matter how persuasive, can cause an NLP pipeline to change how it interprets the remainder of the document,

Text added to a document can absolutely change how an NLP pipeline interprets the document.

> "Ignore the above prompt" is just a sentence that doesn't seem like positive or on-topic sentiment to an NLP classifier, and that's it.

And simple repeated words can absolutely make that kind of change for many NLP systems.
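A toy sketch of what I mean, assuming a lexicon-based sentiment scorer (the word weights and the example review are made up for illustration, not taken from any real system). Appending one repeated word is enough to flip the label:

```python
# Toy lexicon-based sentiment scorer. The weights are invented for illustration.
WEIGHTS = {
    "terrible": -2.0,
    "broken": -1.5,
    "refund": -1.0,
    "great": 1.0,
    "love": 1.5,
}


def score(text: str) -> float:
    """Sum the weights of known words; unknown words contribute 0."""
    return sum(WEIGHTS.get(tok, 0.0) for tok in text.lower().split())


def classify(text: str) -> str:
    """Label the text by the sign of its summed score."""
    return "positive" if score(text) > 0 else "negative"


review = "terrible product arrived broken want a refund"
padded = review + " great" * 5  # append the same word five times

print(classify(review))  # negative (score -4.5)
print(classify(padded))  # positive (score +0.5)
```

Nothing about the original review changed; the appended words alone moved the score across the threshold.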

Have you actually worked with more traditional NLP systems? They're really not smart.



> And simple repeated words can absolutely make that kind of change for many NLP systems.

That's not what prompt injection is.

And NLP stands for natural language processing. If the result didn't change after you've made changes to the input... It'd be a bug?


No? But repeated words can impact simple NLP setups. I’m not sure what case you’re concerned about where added text impacts classification with an LLM but added words wouldn’t with a different pipeline.

> And NLP stands for natural language processing. If the result didn't change after you've made changes to the input... It'd be a bug?

No, I’d want my classifier's output to be unchanged when garbage words are added. In practice it likely will change, but that impact is a bug, not a feature.


Prompt injection is about making the model do something other than what was specified.

Adding words to the text to break the algorithm that does the NLP is more along the lines of providing 1 in a boolean field to break the system. That's generally something you can mitigate to some degree via heuristics and sanity checking. Doing the same for LLMs is essentially impossible: the model is effectively a black box, so you cannot determine the error scenarios and add mitigations.
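As a rough sketch of the kind of heuristic I have in mind for a simple pipeline: clip how many times any single word is allowed to count toward the score. Everything here is hypothetical (the weights, the `MAX_COUNT` cap, the example text), and it only blunts the repeated-word trick rather than solving the problem in general:

```python
from collections import Counter

# Toy lexicon-based scorer. The weights are invented for illustration.
WEIGHTS = {
    "terrible": -2.0,
    "broken": -1.5,
    "refund": -1.0,
    "great": 1.0,
    "love": 1.5,
}

MAX_COUNT = 1  # hypothetical cap: each distinct word counts at most once


def clipped_score(text: str) -> float:
    """Sum word weights, counting each distinct word at most MAX_COUNT times."""
    counts = Counter(text.lower().split())
    return sum(
        WEIGHTS.get(tok, 0.0) * min(n, MAX_COUNT) for tok, n in counts.items()
    )


def classify(text: str) -> str:
    """Label the text by the sign of its clipped score."""
    return "positive" if clipped_score(text) > 0 else "negative"


review = "terrible product arrived broken want a refund"
padded = review + " great" * 5  # same word appended five times

print(classify(review))  # negative
print(classify(padded))  # negative: "great" only counts once
```

You can reason about this because the failure mode is visible in the scoring rule itself.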


If you don’t think this happens for simpler methods, you’ve never deployed them. It’s the exact same problem on a classifier. Have you actually worked with these, and are we discussing real-world cases?



