Consider a complex LLM pipeline with multiple steps. Each LLM call has an associated prompt that shapes the style and context of the response. Conventionally, these prompts are treated like hyperparameters that have to be tuned by hand to get the desired behavior from the LLM.
This work introduces a way to treat these prompts as trainable parameters instead, updating them through automatic differentiation of a supervised training loss.
To me it feels a bit like deep dream or style transfer, which use autograd to optimize the model's inputs (rather than its parameters) to achieve some goal (such as mixing the style and content of two input images).
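The core trick can be sketched in miniature. This is a hypothetical toy, not the paper's method: the "model" is just a fixed linear map standing in for a frozen LLM, and the "prompt" is a continuous input vector updated by gradient descent on a supervised loss, exactly as deep dream and style transfer optimize inputs rather than weights.

```python
# Toy sketch (hypothetical): a frozen "model" y = W @ p, where the
# prompt p is the trainable parameter and W is never updated.
W = [[2.0, -1.0],
     [0.5,  3.0]]

target = [1.0, 2.0]   # supervised target output
prompt = [0.0, 0.0]   # trainable "prompt" vector, initialised at zero
lr = 0.05

def forward(p):
    # Frozen linear model: returns W @ p.
    return [sum(w * x for w, x in zip(row, p)) for row in W]

for _ in range(500):
    y = forward(prompt)
    # dL/dy for the squared-error loss L = 0.5 * ||y - target||^2
    err = [yi - ti for yi, ti in zip(y, target)]
    # Backprop through the frozen map: dL/dp = W^T @ err
    grad = [sum(W[i][j] * err[i] for i in range(2)) for j in range(2)]
    # Gradient step on the prompt, not the model weights
    prompt = [p - lr * g for p, g in zip(prompt, grad)]

loss = sum((yi - ti) ** 2 for yi, ti in zip(forward(prompt), target))
print(loss)  # shrinks toward zero as the prompt is optimized
```

In a real pipeline the prompt lives in a discrete token space, so the actual work lies in making that update step well-defined; the sketch only shows the continuous analogue of "loss gradient flows into the input".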