… and I have absolutely no obligation to provide *any particular response* to *a...

nradov · on June 15, 2024

Sure. So just return an HTTP 4XX response to requests you don't like. What's the problem?

marcusb · on June 15, 2024

Or, I return whatever content I want, within the bounds of the law, based on whatever parameters I decide. What's your problem with that? Again, connect to my server or don't. But don't tell me what type of response I'm obligated to provide you.

If I think a given request is from an LLM training module, I don't have any legal obligation whatsoever to return my original content. Or a 400-series response. If I want to intersperse a paragraph from Don Quixote between every second sentence, that's my call.
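To make the mechanics concrete, here is a minimal sketch of that idea in Python. Everything in it is an assumption for illustration: the marker list, the function names, and the placeholder filler text are hypothetical, and matching on User-Agent is only a guess about the requester, since the header can be set to anything.

```python
import re

# Hypothetical substrings used to guess that a request comes from an
# LLM-training crawler. This is an illustrative list, not an authoritative one.
SUSPECT_MARKERS = ("gptbot", "ccbot")

# Placeholder text; substitute whatever filler you actually want to serve.
FILLER = "[paragraph from Don Quixote goes here]"


def looks_like_llm_crawler(user_agent: str) -> bool:
    """Guess, from the User-Agent string alone, whether this is a crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in SUSPECT_MARKERS)


def intersperse(text: str, filler: str = FILLER, every: int = 2) -> str:
    """Insert `filler` after every `every`-th sentence of `text`."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    out = []
    for i, sentence in enumerate(sentences, start=1):
        out.append(sentence)
        if i % every == 0:
            out.append(filler)
    return " ".join(out)


def respond(user_agent: str, content: str) -> tuple[int, str]:
    """Return (status, body): altered content for suspected crawlers."""
    if looks_like_llm_crawler(user_agent):
        return 200, intersperse(content)
    return 200, content
```

A server that prefers the 4XX route suggested upthread would return something like `(403, "")` in the first branch instead of the altered body.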

skeledrew · on June 15, 2024

This argument of freedom seems applicable on both sides. A site owner/admin is free to return whatever response they wish based on the assumed origin of a request. An LLM user/service is free to send whatever info in the request that elicits a useful response.

marcusb · on June 15, 2024

I don’t have any problem with that.

int_19h · on June 15, 2024

But nobody is arguing for that. Instead, what the server owners want is to mandate the clients connecting to them to provide enough information to reliably reject such connections.

marcusb · on June 15, 2024

There are literally people in this thread arguing that it is "unethical" to discriminate based on user agent.

SpaghettiCthulu · on June 15, 2024

> What you cannot do is dictate how my server responds to your request.

The client is under no obligation to be truthful in its communications with a server. Spoofing a User-Agent doesn't "dictate" anything. Your server dictates how it responds all on its own when it discriminates against some User-Agents.

Too · on June 16, 2024

With enough sophistication and bad intent, at some point being untruthful to a server falls under computer intrusion laws, e.g., using a password that is not yours. I don't believe spoofing a user agent would be determinative in any such case, though.

Even redistributing secret material you found on an accidentally open S3 bucket, without spoofing UA, could be considered intrusion if it was obvious the material was intended to be secret and you acted with bad intent.