These things are trained on internet content, right? What will they be trained on years from now? I suspect much of it will end up being their own or other models' output that they retrain on, or else they keep training on the internet as it was before ChatGPT and the information in their datasets grows stale.
They can learn from the outcomes of their actions, even if they can only act in a Python REPL or a game, because that would be easy to scale. But interfacing LLMs with external systems and people would be an even better source of feedback. In other words, they could create their own experiences and learn from them.
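A minimal sketch of the REPL-feedback idea: run a model's proposed code, turn the outcome into a reward signal, and keep the (code, reward, feedback) record for later training. The candidate snippets here are hard-coded stand-ins for real model output, and the reward scheme is purely an assumption for illustration.

```python
import traceback

def run_in_repl(code: str) -> tuple[bool, str]:
    """Execute candidate code in a fresh namespace; return (success, feedback)."""
    env: dict = {}
    try:
        exec(code, env)
        return True, "ok"
    except Exception:
        # The traceback itself is useful feedback text for a model to learn from.
        return False, traceback.format_exc()

def collect_experiences(candidates: list[str]) -> list[tuple[str, float, str]]:
    """Score each candidate: +1.0 if it ran cleanly, -1.0 if it raised."""
    experiences = []
    for code in candidates:
        ok, fb = run_in_repl(code)
        experiences.append((code, 1.0 if ok else -1.0, fb))
    return experiences

# Example: one snippet that works, one that raises a NameError.
exps = collect_experiences(["x = 1 + 1", "y = undefined_name"])
```

The appeal is that this loop needs no human in it, which is what makes it scalable compared to feedback from real users.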
Presumably there will be manual review, and submissions doing blatant SEO will get rejected.
They already seem to have thought about this: if you read their rules, you are not allowed to include explicit instructions in the model_description field about when your plugin should be invoked.