
> they can get you places really effectively!

But those who require them to get anywhere won't get very far without power.


Sure, sure, and next you'll tell me there's a way to know which side of the car the gas cap is on.

We replaced the periodic table with elements for five times the reaction.

See also rtings.com, with 4,359 products bought and tested.

https://www.rtings.com/company/how-we-make-money


> the volume of AI scrapers is making hosting untenable

Aside from that potential, it's also not true.

A Pentium Pro or PIII SSE with circa 1998-99 Apache happily delivers a billion hits a month without breaking a sweat, unless you think generating pages on every visit is better than generating pages when they change.


I think it is true that it is a real problem (EDIT: but it doesn't necessarily make "hosting untenable"), but you are correct to point out that modern pages tend to be horribly optimized (and that's the source of the problem). Even "dynamic" pages using React/Next.js etc. could be pre-rendered and/or cached and/or distributed via CDNs. A simple cache or a CDN should be enough to handle pretty much any scraping traffic unless you need to do some crazy logic on every page visit – which should almost never be the case on public-facing sites. As an example, my personal site is technically written in React, but it's fully pre-rendered and doesn't even serve JS – it can handle huge amounts of bot/scraping traffic via its CDN.
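For the curious, the fully pre-rendered setup can be a one-line framework config. A minimal sketch, assuming a recent Next.js (the `output` option is real; everything else about the deployment is illustrative):

    // next.config.ts -- minimal sketch of a fully pre-rendered site.
    // output: "export" makes Next.js emit plain HTML/CSS at build time,
    // so the origin serves only static files and a CDN in front can
    // absorb scraper traffic without the server rendering anything.
    import type { NextConfig } from "next";

    const config: NextConfig = {
      output: "export",
    };

    export default config;

Pages then change only at build time, which is exactly the "generate pages when they change" model from the comment above.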

OK, I agree with both of you. I am an old who is aware of NGINX and C10k. However, my question is: what are the economic or technical difficulties that prevent one of these new web-scale crawlers from releasing og-pagerank-api.com? We all love to complain about modern Google SERP, but what actually prevents that original Google experience from happening, in 2026? Is it not possible?

Or, is that what orgs like Perplexity are doing, but with an LLM API? Meaning that they have their own indexes, but the original q= SERP API concept is a dead end in the market?

Tone: I am asking genuine questions here, not trying to be snarky.


What prevents it is that the web in 2026 is very different from what it was when OG PageRank became popular (because it was good). Back then, many pages linked to many other pages. Now a significant amount of content (newer content, which is often what people want) is either only in video form, or in a walled garden with no links, neither into nor out of the walls. Or locked up in an app, not out on the general/indexable/linkable web. (Yes, of course, a lot of the original web is still there. But it's now a minority at best.)

Also, of course, the amount of spam-for-SEO (pre-slop slop?) as a proportion of what's out there has also grown over time.

IOW: Google has "gotten worse" because the web has gotten worse. Garbage in, garbage out.


Thanks for the reply. I mentioned tech, but forgot about time. Yeah, that makes solid sense.

> Or locked up in an app...

I believe you may have at least partially meant Discord, for which I personally have significant hate. Not really for the owners/devs, but why in the heck would any product owner want to hide the knowledge of how to use their app on a closed platform? No search engine can find it, no LLM can learn from it(?). Lost knowledge. I hate it so much. Yes, user engagement, but knowledge vs. engagement is the battle of our era, and knowledge keeps losing.

r/anything is so much better than a Discord server, especially in the age of "Software 3.0"


Please see my reply to the other child comment. That is my actual question, apologies for not being more clear.

Intern-generated code does not substitute for tech-lead thinking, testing, and cleanup/rewrite.

No, the code is generated by a tool that's "smarter than people in many ways". So which parts of "thinking, testing, and clean up/rewrite" can we trust it with?

Trust is a function of responsibility, not of smarts.

You may hire a genius developer that's better than you at everything, and you still won't trust them blindly with work you are responsible for. In fact, the smarter they are than you, the less trusting you can afford to be.


Very little, until it stops being stupid in many ways. We don't need smart, we need tools to not be stupid. An unreliable tool is more dangerous and more useless than having no tool.

The marketing is irrelevant. The AIs are not aware of what they are doing, or motivated in the ways humans are.

> The answer that fits everything (and what to do about it)

Maybe we need a real AI which creates new phrases and teaches the poor LLMs?

Looking back, we already had similar problems, when we had to ask our colleagues, students, whomever: "Did you get your proposed solution from the answers part or the questions part of a Stack Overflow article?" :-0


can't wait for ChatGPT to make me read about grandma's secret recipe and scroll through 6 ads to see the ingredients for my chicken teriyaki dinner

Some feedback about the primary use case.

Your Lix doc (LLM written but with typos?) is sort of weird, handwaving how Lix does version control over, say, Excel, to say it's about working with SQL databases:

> How does Lix work?
>
> Lix adds a version control system on top of SQL databases that let's you query virtual tables like file, file_history, etc. via plain SQL. These table's are version controlled.

Then it gets weirder:

> Why this matters:
>
> Lix doesn't reinvent databases — durability, ACID, and corruption recovery are handled by battle-tested SQL databases.

This seems like a left turn from the value prop and why the value prop matters?

A firm-wide audit trail of changes to typically opaque file types (M365 files in particular) could be tremendously valuable -- and additive -- compared to the versioning that's baked into the file bundles. The version control is already embedded by the app; what adds value is reporting on or managing that from outside the app.

As for how it works, both in the docs and in the comment I'm replying to, it's unclear how any of this interacts with the native version control embedded in M365 apps or why this tool can be trusted as effective at tracking M365 content changes.


Does the following make more sense to you with respect to SQL?

Lix uses SQL databases as its storage and query engine. In other words, you get a filesystem on top of your SQL database that is version controlled.

Or, the analogy to git: Git uses the computer's filesystem as its storage layer; Lix uses a SQL database (with the nice benefit of making everything queryable via SQL).
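To make that concrete, here is a sketch of what the "filesystem on top of SQL" idea could look like from application code. The `query` helper and the column names are hypothetical stand-ins; only the virtual table names (file, file_history) come from the docs quoted above:

    // Hypothetical stand-in for Lix's SQL interface.
    type Query = <T>(sql: string, params?: unknown[]) => Promise<T[]>;

    interface FileRow {
      path: string;
      data: Uint8Array;
    }

    // Read the current state of a file -- like fs.readFile, but via SQL.
    async function readFile(query: Query, path: string): Promise<Uint8Array | undefined> {
      const rows = await query<FileRow>(
        "SELECT path, data FROM file WHERE path = ?",
        [path],
      );
      return rows[0]?.data;
    }

    // List prior versions of the same file -- what git needs a separate
    // log command for is just another queryable table here.
    async function fileHistory(query: Query, path: string): Promise<FileRow[]> {
      return query<FileRow>(
        "SELECT path, data FROM file_history WHERE path = ?",
        [path],
      );
    }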

> Lix doesn't reinvent databases — durability, ACID, and corruption recovery are handled by battle-tested SQL databases.

>> This seems like a left turn from the value prop and why the value prop matters?

Better wording might be "Lix uses a SQL database as storage layer"?

The SQL part is crucial for two reasons. First, the hard parts like data-loss guarantees, transactions, etc. are taken care of by the database; we don't have to build custom stuff. Second, that reduces the risk that adapters can cause data loss in Lix.
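Concretely, "taken care of by the database" means an adapter can wrap a multi-statement change in one transaction, so a crash mid-write can't leave half a change behind. Same hypothetical stand-in for the SQL interface as above; table and column names are guesses:

    // Hypothetical stand-in for executing a statement via Lix's SQL layer.
    type Exec = (sql: string, params?: unknown[]) => Promise<void>;

    // Sketch: update the current file row and append to its history
    // atomically. Either both statements land, or neither does.
    async function saveFileVersion(exec: Exec, path: string, data: Uint8Array) {
      await exec("BEGIN");
      try {
        await exec("UPDATE file SET data = ? WHERE path = ?", [data, path]);
        await exec("INSERT INTO file_history (path, data) VALUES (?, ?)", [path, data]);
        await exec("COMMIT");
      } catch (e) {
        await exec("ROLLBACK"); // undo the partial change
        throw e;
      }
    }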

> As for how it works, both in the docs and in the comment I'm replying to, it's unclear how any of this interacts with the native version control embedded in M365 apps or why this tool can be trusted as effective at tracking M365 content changes.

It doesn't interact with version control in M365.

I'll update the positioning. Lix is a library to embed version control in whatever developers are building. Right now, Lix is mostly interesting for startups that build AI-first solutions. They run into the problem "how do customers verify the changes AI agents make?".

The angle of universal version control, and using docx or Excel as an example, triggers the wrong comparisons. By no means is Lix competing with Sharepoint or existing version control solutions for MS Office.


Tracking versions is easy. Controlling versions is hard. Knowing what the actual semantic deltas are in binary files (what your doc examples claimed) is hard.

Now you say you're not tackling that problem, so the docs are doubly weird.

Would also imagine less of this is a SQL-shaped problem, per se, so plenty of tech is better suited than SQL for tracking and controlling changes. The shape of the problem seems less like the theory of querying sets, more like a branching journal with a proof chain, or a hash tree.
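To illustrate the shape I mean, a minimal sketch of a branching journal with a proof chain -- each entry commits to the hash of its parent, so tampering anywhere breaks verification downstream. All names are hypothetical; this is not a claim about how Lix works:

    import { createHash } from "node:crypto";

    interface ChangeEntry {
      parent: string | null; // hash of the previous entry on this branch
      author: string;
      delta: string;         // serialized semantic delta
      hash: string;          // hash over (parent, author, delta)
    }

    function appendChange(parent: ChangeEntry | null, author: string, delta: string): ChangeEntry {
      const parentHash = parent?.hash ?? null;
      const hash = createHash("sha256")
        .update(JSON.stringify({ parent: parentHash, author, delta }))
        .digest("hex");
      return { parent: parentHash, author, delta, hash };
    }

    // Verifying history is one walk down the chain: recompute each hash
    // and check the parent pointers line up, much like git's object graph.
    function verify(chain: ChangeEntry[]): boolean {
      return chain.every((entry, i) => {
        const expectedParent = i === 0 ? null : chain[i - 1].hash;
        const recomputed = createHash("sha256")
          .update(JSON.stringify({ parent: entry.parent, author: entry.author, delta: entry.delta }))
          .digest("hex");
        return entry.parent === expectedParent && entry.hash === recomputed;
      });
    }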


> Knowing what the actual semantic deltas are in binary files (what your doc examples claimed) is hard.

Hm, that is what Lix provides?

SQL is just the interface to query the deltas.

Anyhow, lesson learned. The primary use case for Lix is embedding. Positioning Lix for binary files leads to comparisons with existing systems, and none of those support the embedded use case. Thus, don't position Lix for binary files :)


Not only LLM-written, but in the same astroturfing, engagement-soliciting bot style that's all over Reddit -- as is the single comment reply.

The Daylight Computer is a delight — the perfect tablet for blue skies and no shade.

https://daylightcomputer.com/

