But then those models are possibly used downstream, EG for Mistral's "medium" API model (and many other startups).
I guess if its behind an API and no one discloses the training data, OpenAI can't prove anything? Even obvious GPTisms could ostensibly be from internet data.
I guess if its behind an API and no one discloses the training data, OpenAI can't prove anything? Even obvious GPTisms could ostensibly be from internet data.