Hacker News

I'm curious to know where you draw the line for what constitutes legitimate manipulation by a person and when it becomes distribution.

I'm assuming that if I write code by hand for every part of the TCP/IP and HTTP stack I'm safe.

What if I use libraries written by other people for the TCP/IP and HTTP part?

What if I use a whole FOSS web browser?

What about a paid local web browser?

What if I run a script that I wrote on a cloud server?

What if I then allow other people to download and use that script on their own cloud servers?

What if I decide to offer that script as a service for free to friends and family, who can use my cloud server?

What if I offer it for free to the general public?

What if I start accepting money for that service, but I guarantee that only the one person who asked for the site sees the output?

Can you help me to understand where exactly I crossed the line?



Obviously not legal advice, and I doubt it's entirely settled law, but probably at this step:

> What if I decide to offer that script as a service for free to friends and family, who can use my cloud server?

You're allowed to make copies and adaptations in order to utilize the program (website), which probably covers a cloud server you yourself are controlling. You aren't allowed to do other things with those copies though, like distribute them to other people.

Payment only matters if we're getting into "fair use" arguments, and I don't think any really apply here.

I think you're probably already in trouble just offering it to family and friends, but if you take the next step and offer it to the public, that adds more issues, because the Copyright Act includes definitions like:

> To perform or display a work “publicly” means (1) to perform or display it at a place open to the public or at any place where a substantial number of persons outside of a normal circle of a family and its social acquaintances is gathered; or (2) to transmit or otherwise communicate a performance or display of the work to a place specified by clause (1) or to the public, by means of any device or process, whether the members of the public capable of receiving the performance or display receive it in the same place or in separate places and at the same time or at different times.


Why is that the line and not a paid web browser? What about a paid web browser whose primary feature is a really powerful ad blocker?


Why would a paid web browser be the line?

No one is distributing copies of anything to anyone then apart from the website that owns the content lawfully distributing a copy to the user.

Also why is a paid web browser any different than a free one?


Paid is arguably different than free because the code that is actually asking for the data is owned by a company and licensed to the user, in much the same way as a cloud server licenses usage of their servers to the user. That said, I'll note that my argument is explicitly that the line doesn't exist, so I'm not saying a paid browser is the line.

I'm unfamiliar with the legal questions, but in 2024 I have a very hard time seeing an ethical distinction between running some proprietary code on my machine to complete a task and running some proprietary code on a cloud server to complete a task. In both cases it's just me asking someone else's code to fetch data for my use.


Great, so we agree that your previous comment asking that I address "paid browsers" in particular was an unnecessary distraction.

> I have a very hard time seeing an ethical distinction between running some proprietary code on my machine to complete a task and running some proprietary code on a cloud server to complete a task

It's important to recognize that copyright is entirely artificial. Congress went "let's grant creators some monopolies on their work so that they can make money off of it", and then made up some arbitrary lines for what they did and did not have a monopoly over. There's no principled ethical distinction between what is on one side of the line and the other; it's just where Congress drew the arbitrary line in the sand. It then (arguably) becomes unethical to do things on the illegal side of the line precisely because we as a society agreed to respect the laws that put them there, so that creators can make money on a fair and level playing field.

Sometimes the lines in the sand were in fact quite problematic, like the fact that the original phrasing meant that running a computer program would almost certainly violate the law. Whenever that comes up, Congress amends the exact details of the line. In the US, in the case of computers, it carved out an exception in section 117 of the Copyright Act, which provides (in part):

> it is not an infringement for the owner of a copy of a computer program to make or authorize the making of another copy or adaptation of that computer program provided:

> (1) that such a new copy or adaptation is created as an essential step in the utilization of the computer program in conjunction with a machine and that it is used in no other manner

and provides the restriction that

> Adaptations so prepared may be transferred only with the authorization of the copyright owner.

By my very-much-not-a-lawyer reading, those are the relevant parts of the law. They allow things like local ad blockers. They disallow a third-party website that downloads content (acquiring ownership of a lawfully made copy), modifies it (valid under the first exception if that was a step in using the website), and distributes the adapted website to its users (illegal without permission).


How is using perplexity any more so making a copy than your browser is making a copy? Unless you are distributing your website on thumb drives or floppy disks, all distribution is achieved by making a copy. That's how networks work.

Your logic would also imply that viewing a website through a VPN not operated by yourself would require the VPN operator to have a redistribution license for all the content on the website, which is not the case.

How do you think Google is able to scrape whatever they like and redistribute summaries of the pages they have visited without consulting everyone who has ever made a website for a redistribution license?

That being said, copyright is not enforced or interpreted consistently. It seems that individual cases can be decided based on what people ate for lunch on the day of the case, who the litigants are, and maybe the alignment of the planets.


> How is using perplexity any more so making a copy than your browser is making a copy

Both are; the difference is that your browser doesn't transfer the copy to a new legal entity after modifying it. Rather, the browser is under the control of the end user, and the end user owns the data (not the copyright, but the actual instance of the data) the whole time.

> Your logic would also imply that viewing a website through a VPN not operated by yourself would require the VPN operator to have a redistribution license for all the content on the website which is not the case.

It doesn't, because the VPN doesn't modify it, and the law explicitly distinguishes between the two cases and allows for transferring in the case of exact copies (provided you transfer all rights). I left this part of section 117 out because it wasn't relevant, but I'll quote it here:

> Any exact copies prepared in accordance with the provisions of this section may be leased, sold, or otherwise transferred, along with the copy from which such copies were prepared, only as part of the lease, sale, or other transfer of all rights in the program. [And then the portion of the paragraph I quoted above] Adaptations so prepared may be transferred only with the authorization of the copyright owner.

> How do you think google is able to scrape whatever they like and redistribute summaries of the pages they have visited without consulting everyone who has ever made a website for a redistribution license.

A fair use argument, which I think is less likely (and I'd go so far as to say unlikely) to apply to a service like perplexity.ai, but that is ultimately a judgement call that will be made by the legal system and, like all fair use arguments, has no clear boundaries.


TECHNICAL ANALYSIS

The key, as many here have missed, is authentication and authorization. You may have authorization to log in and view movies on Netflix. Not to rebroadcast them. Even the question of a VCR for personal use was debated in the past.

Distributing your own scripts and software to process data is not the same as distributing arbitrary data those scripts encountered on the internet for which you don’t have a license.

If someone wrote an article and your reader transforms it in response to your authenticated request, that is fine as long as you, the user, hold an authorized subscription.

But if that reader then sent the article down to a remote server to be processed for distribution to unlimited numbers of people, it would be “pirating” that information.

The problem is that much of the Web is not properly guarded against this. Xanadu had ideas about micropayments 30 years ago. Take a look at what I am building using the current web: https://qbix.com/ecosystem

LEGAL ANALYSIS

Much of the content published on the Web isn’t secured with subscriptions and micropayments, which is why the whole thing becomes a legal battle as silly as “exceeding authorized access”, the kind of charge that led to the federal prosecution of someone like Aaron Swartz.

In other words, it is the question of “piracy”, which has acquired a new character only in that the AI is trained on your data and transforms it before it republishes it.

There was also a lawsuit about scraping LinkedIn, which was settled as follows: https://natlawreview.com/article/hiq-and-linkedin-reach-prop...

Legally, you can grant access to people subject to a certain license (eg Creative Commons Share Alike) and then any derived content must have its weights opened. Similar to, say, Affero GPL license for derivative software.


Why are you ignoring his main argument?


I'm not. I'm asking why this flow is "distribution":

* User types an address into Perplexity

* Perplexity fetches the page, transforms it, and renders some part of it for the user

But this flow is not:

* User types an address into Orion Browser

* Orion Browser fetches the page, transforms it, and renders some part of it for the user

Regardless of the legal question (which I'm also skeptical of), I'm especially unconvinced that there's a moral distinction between a web service that transforms copyrighted works in an ad hoc manner upon a user's specific request and renders them for that specific user vs an installed application that does exactly the same thing.


The moral case is pretty obviously that Perplexity is preventing traffic from reaching the people who made the content.


How so? TFA pretty clearly shows that traffic does reach the server, how else would it show up in the logs?

Also, the author of TFA has already gotten themselves deindexed. The behavior they're complaining about now is that if someone copies and pastes a link into Perplexity, it will go fetch the page for the user and summarize it.

This scenario presupposes that the user has a link to a specific page. I suspect that in nearly all cases that link will be copied from the address bar of an open tab. This means that most of the time the site will actually get double the traffic: one hit when the user opens it in the browser and a second when Perplexity asks for the page to summarize it.


Where exactly you crossed the line is a question for the courts. I am not a lawyer and will therefore not help with the specifics.

However, please see the Aereo case [0] for a possibly analogous case. I am allowed to have a DVR. There is no law preventing me from accessing my DVR over a network, or possibly even colocating it in a local data center. But Aereo definitely crossed a line. Also see VidAngel [1]. The fact that something is legal to do at home does not mean that I can offer it as a cloud service.

[0] https://www.vox.com/2018/11/7/18073200/aereo

[1] https://en.m.wikipedia.org/wiki/Disney_v._VidAngel


Which is offensive, and the legal structure underlying that should be changed. It makes zero sense to count renting out a machine as "distribution" when a person could legally install and use the exact same machine themselves.


>Where exactly you crossed the line is a question for the courts. I am not a lawyer and will there for not help with the specifics.

I expect you're right. Although Perplexity thinks they're well within the law[0]. Are they correct? I guess we'll see....

[0] https://www.perplexity.ai/search/Why-are-you-2wJteqZ4SUCqPjk...



