- https://arstechnica.com/tech-policy/2025/02/meta-torrented-o...
- https://news.bloomberglaw.com/ip-law/openai-risks-billions-a...
Other than that, just a bit of common sense tells you all you need to know about where the data comes from: datasets that are never released, LLM outputs suspiciously close to the original copyrighted content, AI founders openly saying that paying for copyrighted content is too costly, etc.