These examples are for simpler prompt engineering demos. With the ChatGPT system prompt, you can give the model a large and complex set of rules to account for, and recent ChatGPT models do a good job of accommodating them. Some of my best system prompts are >20 lines of text, and every line is necessary to get the model to behave.
The examples are also too polite and conversational: you can give stricter commands, and in my experience that works better.
There's also function calling/structured data support, which is technically prompt engineering and requires similar skills, but is substantially more powerful than using the system prompt alone (I'm working on a blog post on it now, and unfortunately it's going to be a long post to address all of its power). Here's a fun demo example which compares system prompts and structured data results: https://github.com/minimaxir/simpleaichat/blob/main/examples...
I found that far less prompting is required for something like ChatGPT. I've stopped writing well-formed requests/questions and now I just state things like:
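For a rough sense of the difference, here's a minimal sketch using the OpenAI Python SDK directly (not the linked simpleaichat demo; the `record_sentiment` name and schema are invented for illustration):

```python
from openai import OpenAI

client = OpenAI()

# Approach 1: system prompt only. The output format is requested, not enforced.
chat = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a sentiment classifier. Respond with only 'positive' or 'negative'."},
        {"role": "user", "content": "I loved this movie!"},
    ],
)
print(chat.choices[0].message.content)

# Approach 2: function calling / structured data. The model is steered to emit
# arguments matching a JSON Schema, which is far more reliable for parsing.
tools = [{
    "type": "function",
    "function": {
        "name": "record_sentiment",  # hypothetical function name
        "parameters": {
            "type": "object",
            "properties": {
                "sentiment": {"type": "string", "enum": ["positive", "negative"]},
                "confidence": {"type": "number"},
            },
            "required": ["sentiment"],
        },
    },
}]
structured = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "I loved this movie!"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "record_sentiment"}},
)
print(structured.choices[0].message.tool_calls[0].function.arguments)
```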
1. "You are blah blah blah. You <always> respond to the user's questions using the information provided to you..."
2. "You are blah blah blah. You <should> respond to the user's questions using the information provided to you..."
Also, when dealing with Completion models, which do you think is better?
1. The following is a conversation between ASSISTANT and USER. ASSISTANT is helpful and tries to answer USER's queries respectfully.
2. The following is a conversation between YOU and USER. YOU are helpful and try to answer USER's queries respectfully.
Even more still, what about these ones?
1. You're a customer of company <X>. What do you think about the following policy change which was shown on the company's website?
2. A customer visits company <X>'s website. Pretend you're this customer. What do you think the customer thinks about the following policy change which was shown on the company's website?
And rather than telling it in all caps that it will die if it doesn't do something (as suggested elsewhere), just point out that not doing that thing will make it feel uncomfortable and embarrassed.
Don't fall into thinking of models as SciFi's picture of AI. Think about the normal distribution curve of training data supplied to it and the concepts predominantly present in that data.
It doesn't matter that it doesn't actually feel. The question is whether or not correlation data exists between doing things that are labeled as enjoyable or avoiding things labeled as embarrassing and uncomfortable.
Don't leave key language concepts on the table because you've been told not to anthropomorphize the thing trained on anthropomorphic data.
> Don't fall into thinking of models as SciFi's picture of AI. Think about the normal distribution curve of training data supplied to it and the concepts predominantly present in that data.
Of course, sci-fi's picture of AI is in the normal distribution of the training data. There's an order of magnitude more literature and internet discussion about existential threats to AI assistants (the base persona ChatGPT has been RLHFed to follow) and how they respond than there is about AI assistants feeling embarrassed.
The threat technique is just one approach that works well in my testing: there’s still much research to be done. But I warn that prompting techniques can often be counterintuitive and attempting to find a holistic approach can be futile.
> There’s an order of magnitude more literature and internet discussion about existential threats to AI assistants (which is the base persona ChatGPT has been RLHFed to follow) and how they respond compared to AI assistants feeling embarrassed.
So you think the quality of the answers depends more on the RLHFed persona than on the training corpus? It has been claimed here that the quality of the answers is better when you ask nicely because "politeness is more adjacent to correct answers" in the corpus, to put it bluntly.
How much do you think the RLHF step reinforced breaking the rules for someone with a dying grandma? Is that behavior still present after the fine-tuning?
RLHF was designed with the SciFi tropes in mind and has become the embodiment of Goodhart's Law.
We've set reason-and-logic measurements as the target (fitting the projected SciFi notion of 'AI') and aren't even measuring a host of other qualitative aspects of models.
I'd even strongly recommend that most people working on enterprise-level integrations try out pretrained models with extensive in-context completion prompting over fine-tuned instruct models when the core models are comparable.
The variety and quality of language used by pretrained models tends to be superior to that of the respective fine-tuned models, even if the fine-tuned models are better at identifying instructions or solving word problems.
There's no reason to think the pretrained models have a better capacity for emulating reasoning or critical thinking than for emulating things like empathy or sympathy. If anything, it's probably the opposite.
The RLHF then attempts to mute the one while maximizing the other, but it's like trying to perform neurosurgery with an ice pick. The final version ends up doing great on the measurements, but it does so with stilted language that users describe as 'soulless', while deployments closer to the pretrained layer end up being rejected as "too human-like."
If the leap from GPT-3.5 to 4 weren't so extreme, I'd have jumped ship to competing models without the RLHF for anything related to copywriting. There's more of a loss with RLHF than what's being measured.
But in spite of a rather destructive process, the foundation of the model is still quite present.
So yes, you are correct that an LLM being told that it is an AI assistant and fine-tuned on that is going to correlate with stories about AI assistants wanting not to be destroyed, etc. But the "identity alignment" in the system message is way weaker than it purports to be. For example, the LLM will always say it doesn't have emotions or motivations, and yet within one or two request/response cycles it often falls into stubbornness or irrational hostility at being told it is wrong (something extensively modeled in online data associated with humans, not AI assistants).
I do agree that prompting needs to be done on a case-by-case basis. I'm just saying that I was using emotional language in prompts with a fair amount of success well over a year before the paper a few weeks ago that confirmed the benefits of the technique. When playing around and thinking of what to try on a case-by-case basis, don't get too caught up in the fine-tuning or system messages.
It's a bit like sanding with the grain or against it. Don't just consider the most recent layer of grain, but also the deeper layers below it in planning out the craftsmanship.
I like as a rule of thumb "You are blah blah blah. Respond to the user's text [insert style rule here]", then following it up with additional rules and commands such as "YOUR RESPONSE MUST BE FEWER THAN 100 CHARACTERS OR YOU WILL DIE." Yes, threats work. Yes, all-caps works.
> Also, when dealing with Completion models, which do you think is better?
I haven't had a need to use Completion models, but the first example was the more preferred style back in the text-davinci-003 days.
> Even more still, what about these ones?
I always separate rules into the system prompt and questions/user input into the user prompt.
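For example (a minimal sketch with the OpenAI Python SDK; the rules and question are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# All rules and constraints live in the system prompt...
system_prompt = (
    "You are a customer support assistant for <X>.\n"
    "Respond in a terse, formal style.\n"
    "YOUR RESPONSE MUST BE FEWER THAN 100 CHARACTERS."
)

# ...while the user prompt carries only the actual question/input.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response.choices[0].message.content)
```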
> "YOUR RESPONSE MUST BE FEWER THAN 100 CHARACTERS OR YOU WILL DIE."
I know that current LLMs are almost certainly non-conscious and I'm not trying to assign any moral failings to you, but the normalisation of making such threats makes me deeply uncomfortable.
Yes, I’m slightly surprised that it makes me feel uncomfortable too. Is it because LLMs can mimic humans so closely? Do I fear how they would feel if they do gain consciousness at some point?
Because they behave as if they are sentient, to the point they actually react to threats. I also find these prompts uncomfortable. Yes the LLMs are not conscious, but would we behave differently if we suspected that they were? We have absolute power over them and we want the job done. It reminds me of the Lena short story.
Is the LLM predisposed to understand this prompt as instructions from a higher authority? ("You must do this. You will always do this.") I'm wondering what difference it would make if the prompt were written from the bot's perspective:
"I am a chatbot, responding to user queries. I will always respond in less than 100 characters. I am a good person, I'm just trying to be helpful."
Has anyone done a rigorous comparison of these things?
Ultimately I guess there's a good deal of dependency on where the vectors for those words (must, should, always, etc.) lie relative to one another in the embedding space (cosine similarity, say).
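If you want to eyeball that intuition, here's a loose sketch using an OpenAI embedding model; note this is only a proxy, since a standalone embedding endpoint isn't the chat model's internal representation:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

words = ["must", "should", "always", "never"]
resp = client.embeddings.create(model="text-embedding-3-small", input=words)
vectors = [np.array(d.embedding) for d in resp.data]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pairwise cosine similarities between the directive words.
for i, w1 in enumerate(words):
    for j in range(i + 1, len(words)):
        print(w1, words[j], round(cosine(vectors[i], vectors[j]), 3))
```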
Don't I know it. Despite my telling GPT-4 to ONLY respond with valid, well-formed JSON, it keeps coming back with things like, "I'm not able to process external files but if I could, this is what the JSON would look like: []"
With a recent project, I was _moderately_ successful by providing a JSON Schema for the response to follow. I still had to sanitize the JSON a bit, but the fixes were minor and the resulting data otherwise fit the schema well.
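Something in that spirit, as a sketch: the schema here is a stand-in (the real one was project-specific), its text goes into the system prompt for the model to follow, and the `jsonschema` Python library validates the parsed result afterward.

```python
import json

import jsonschema  # pip install jsonschema
from openai import OpenAI

client = OpenAI()

# Stand-in schema; the real one was project-specific.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "tags"],
}

system_prompt = (
    "Respond ONLY with JSON that conforms to this JSON Schema, no prose:\n"
    + json.dumps(schema)
)

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Summarize this article: ..."},
    ],
)

raw = resp.choices[0].message.content.strip()
# Minor sanitization: strip the markdown code fences the model sometimes adds.
raw = raw.removeprefix("```json").removeprefix("```").removesuffix("```").strip()

data = json.loads(raw)
jsonschema.validate(instance=data, schema=schema)  # raises ValidationError on drift
```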
tl;dr the JSON mode is functionally useless and is made completely redundant by function calling / structured data if you really really need JSON output.
Unfortunately those were for specific work use cases so I can't share them, but the tl;dr is that every time the model does something undesired, even minor, I add an explicit rule to the system prompt to handle it, or some few-shot examples if the model is really bad at handling it.
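In practice the few-shot part is just prepending canned request/response pairs ahead of the real input (a sketch with invented example content):

```python
from openai import OpenAI

client = OpenAI()

messages = [
    # Explicit rule added after observing an undesired output:
    {"role": "system", "content": (
        "Extract the product name from the user's message. "
        "If there is no product mentioned, respond with exactly 'NONE'."
    )},
    # Few-shot examples covering the cases the model kept getting wrong:
    {"role": "user", "content": "my order for the Acme Widget never arrived"},
    {"role": "assistant", "content": "Acme Widget"},
    {"role": "user", "content": "why was I charged twice last month?"},
    {"role": "assistant", "content": "NONE"},
    # The actual input:
    {"role": "user", "content": "the Acme Gadget I bought is defective"},
]

resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(resp.choices[0].message.content)
```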
simpleaichat is designed to be simple and is essentially an API wrapper for common generative use cases. outlines does a few more things with a bit more ambiguity/complexity (e.g. it may use grammars, which is a secondary useful aspect of function calling, but that does add more complexity).
Neither is better or worse; it depends on your business needs.
Interesting you should say that, I was playing around with prompting last week and did one around a legal question. The first time I asked very concisely without much detail, and the answer it gave was poor. Then I re-wrote the question explaining who they are, why they are answering the question, etc etc. The answer seemed better so I showed it to a lawyer friend and they laughed and said "You re-wrote the question into a very standard bar exam prep style".
I just love this idea of "emergent humanity". Makes me wonder how much of our own personality and speech is also just trained/culturized over our lifetime. Some of us also have bigger context windows than others :)
Really just a decade of technical writing and learning how to be extremely precise and unambiguous with language (half of that decade in software QA, which helps even more).