Fair use defenses rest on the fact that a limited excerpt was used for limited distribution, among other criteria.
For example, if I'm a teacher and I make 30 copies of one page of a 300-page novel and hand them out to my students, that's a brief excerpt for a fairly limited distribution.
Now if I'm a social media influencer and I copy all 300 pages of a 300-page book and then I send it out to all 3,000 of my followers, that's not fair use!
Also, if I'm a teacher and I find a one-page infographic and make 30 copies of it, that's not fair use, because I didn't take an excerpt; I copied 100% of the original work. That's infringement now.
So if LLMs went through thousands of copyrighted works en masse and ingested every byte of each one in its entirety, no copyright judge on the planet would call that fair use.
There is no way in Hell that this is fair use!
For reference, the English Wikipedia has a policy that allows some fair-use content of copyrighted works: https://en.wikipedia.org/wiki/Wikipedia:Non-free_content_cri...