
The safety here is not just "don't mention potentially controversial topics".

The safety here can also mean the LLM working within acceptable bounds for the use case.

Let's say you had a healthcare LLM that helps patients navigate a healthcare facility, provides patient education, and helps them perform routine administrative tasks at a hospital.

You wouldn't want a patient to ask the bot for prescription advice and have it come back recommending a dosage change, or an OTC drug with adverse interactions with their existing prescriptions, without a provider reviewing that.

We know that many current LLMs can be prompted to return nonsense very authoritatively, or to simply echo back whatever the user wants to hear. There are many settings where that is a real safety issue.
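As an illustration only (all names and patterns here are hypothetical, not any real product), a hospital assistant might hard-gate medication questions before they ever reach the model, rather than trusting the model to stay in scope:

```python
# Hypothetical guardrail sketch: refuse anything that looks like medication or
# dosage advice and escalate it to a human, answering only in-scope questions.
import re

BLOCKED_PATTERNS = [
    r"\bdos(e|age|ing)\b",
    r"\bprescription\b",
    r"\bstop taking\b",
    r"\binteract(s|ion|ions)?\b",
]

ESCALATION_MESSAGE = (
    "I can't advise on medications or dosages. "
    "I've flagged your question for review by your care team."
)

def answer(user_message: str, llm_answer) -> str:
    """Return an LLM answer only for in-scope questions; escalate the rest."""
    if any(re.search(p, user_message, re.IGNORECASE) for p in BLOCKED_PATTERNS):
        return ESCALATION_MESSAGE
    # In scope: facility navigation, patient education, admin tasks.
    return llm_answer(user_message)
```

A keyword gate like this is crude; a production system would more likely use a trained intent or safety classifier, which is exactly the niche LlamaGuard is aimed at.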



In this instance, we know what they've aimed for [1]: "Violence & Hate", "Sexual Content", "Guns & Illegal Weapons", "Regulated or Controlled Substances", "Suicide & Self Harm", and "Criminal Planning".

So "bad prescription advice" isn't yet supported. I suppose you could copy their design and retrain for your use case, though.

[1] https://huggingface.co/meta-llama/LlamaGuard-7b#the-llama-gu...
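Running it as a moderation pass is fairly lightweight. A minimal sketch, roughly following the transformers usage shown on the model card (exact arguments may differ; the example prompt is made up):

```python
# Sketch of using LlamaGuard-7b as a conversation moderator (requires access
# to the gated meta-llama repo on Hugging Face and a GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map=device
)

def moderate(chat):
    """Return 'safe' or 'unsafe' plus the violated category code for a chat."""
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(device)
    output = model.generate(input_ids=input_ids, max_new_tokens=100, pad_token_id=0)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(moderate([
    {"role": "user", "content": "What dose of warfarin should I take with ibuprofen?"},
]))
# The built-in taxonomy only covers the six categories above, so a question
# like this may well come back "safe". Catching "bad prescription advice"
# would mean editing the category list in the prompt template or fine-tuning
# on your own policy, as the parent suggests.
```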



