
For how long would it stay $200 if you can rack up five figures of usage?

That is the reason they severely limited Claude Max subscriptions. Some users racked up $1k+ in API-equivalent cost per day.

> Defendant SerpApi, LLC (“SerpApi”) offers services that “scrape” this copyrighted content and more from Google, using deceptive means to automatically access and take it for free at an astonishing scale and then offering it to various customers for a fee. In doing so, SerpApi acquires for itself the valuable product of Google’s labors and investment in the content, and denies Google’s partners compensation for their works

This has to be satire. Is Google not the #1 entity guilty of exactly this?


No, Google doesn't use deceptive means. They identify their crawler as GoogleBot, and obey robots.txt.
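
For sites that want to check that claim, Google's documented verification method is a reverse DNS lookup on the visiting IP followed by a forward confirmation. A minimal Python sketch (the helper name is mine):

    import socket

    def is_real_googlebot(ip: str) -> bool:
        # Reverse DNS: genuine Googlebot IPs resolve to hostnames under
        # googlebot.com or google.com
        try:
            host, _, _ = socket.gethostbyaddr(ip)
        except socket.herror:
            return False
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward-confirm: the hostname must resolve back to the same IP
        try:
            return ip in socket.gethostbyname_ex(host)[2]
        except socket.gaierror:
            return False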

Google doesn't have to do that now after already having established its own monopoly... just like SerpApi wouldn't have to act deceptively if they had a monopoly on search.

Because they've forced everyone to allow them. They're the internet traffic mafia. Block them and you disappear from the internet

They abuse this power to scrape your work, summarize it and cut you out as much as possible. Pure value extraction of others' work without equal return. Now intensified with AI

But yeah, you're right. They're not deceptive


> Because they've forced everyone to allow them.

Nobody is forcing anyone. This is the same argument people made about Google search. Nobody is forced to use Google Search or Google Chrome, or even to allow Googlebot to scrape.

Thousands of people have switched over to ChatGPT, Brave/Firefox...

Your argument sounds like "I don't like Apple's practices, and I'm forced to buy iPhones." No, buddy, if you don't like Apple, don't buy their products.


> Thousands of people have switched over to ChatGPT, Brave/Firefox...

If you want people to visit your website, limiting yourself to the "thousands" of people who don't use Google isn't really an option.

> Your argument sounds like "I don't like Apple's practices, and I'm forced to buy iPhones." No, buddy, if you don't like Apple, don't buy their products.

Well, I don't like Apple's or Google's practices, but I basically [1] have to use either iOS or Android.

[1]: Yes, there are things like GrapheneOS and Librem, but those aren't really practical for most people.


> Your argument sounds like "I don't like Apple's practices, and I'm forced to buy iPhones." No, buddy, if you don't like Apple, don't buy their products.

No, not really. There are alternatives to Apple. Whereas here Google controls the gate to the majority of internet traffic

For many it's "block Google and your business dies"


What about for their LLM products? We know that OpenAI does not respect the robots.txt file.

Google uses the same crawler and robots.txt file for training data.
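
More precisely, Google also publishes a separate robots.txt token, Google-Extended, that controls whether crawled content may be used for Gemini training without affecting Search indexing. A minimal robots.txt sketch:

    User-agent: Googlebot
    Allow: /

    # Google-Extended is Google's documented opt-out token for AI training;
    # disallowing it does not remove the site from Search.
    User-agent: Google-Extended
    Disallow: /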

Check out Fizzy from 37signals. They used a similar HATEOAS approach to build it with Hotwire

https://www.fizzy.do

source: https://github.com/basecamp/fizzy

I don't think this is a better approach than React. It's just an approach. It's viable. It's fine


I really feel this myself.

If I write home-grown organic code, then I have no choice but to fully understand the problem. Using an LLM, it's very easy to be lazy, at least in the short term.

Where does that get me after 3 months? I end up working on a codebase I barely understand. My own skills have degraded. It just gets worse the longer you go

This is also coming from my experience in the best case scenario: I enjoy coding and am working on something I care about the quality of. Lots of people don't have even that


Give them real-world problems you're encountering and see which solves them best, if at all.

A full week of that should give you a pretty good idea

Maybe some models just suit particular styles of prompting that do or don't match what you're doing


What people here forget is that coding is a tiny minority of actual usage: ~5%, if I remember correctly.

Their best market might just be as a better Google with ads


Yep, the bulk of AI usage is generating marketing emails.

Here's OpenAI's data on it: https://www.nber.org/system/files/working_papers/w34255/w342...

I don't think enough marketing emails are written to constitute the "bulk" of it, but writing in general seems to be.


Did you notice much improvement going from Gemini 2.5 to 3? I didn't

I just think they're all struggling to provide real world improvements


Gemini 3 Pro is the first model from Google that I have found usable, and it's very good. It has replaced Claude for me in some cases, but Claude is still my goto for use in coding agents.

(I only access these models via API)


Using it in a specialized subfield of neuroscience, Gemini 3 with thinking is a huge leap forward in terms of knowledge and intelligence (with minimal hallucinations). I take it that the majority of people on here are software engineers. If you're evaluating it on writing boilerplate code, you probably have to squint to see differences between the (excellent) raw model performances, whereas in more niche edge cases there is more daylight between them.

What specialized use cases did you use it on, and what were the outcomes?

Can you share your experience and data for "leap forward"?


Nearly everyone else (and every measure) seems to have found 3 a big improvement over 2.5.

Oh yes, I'm noticing significant improvements across the board, but mainly the 1,000,000-token context makes a ton of difference: I can keep digging at a problem without compaction.

I think what they're actually struggling with is costs. And I think they're all quantizing models behind the scenes to manage load here and there, so they're all giving inconsistent results.
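
To make "quantizing" concrete, here's a toy numpy sketch of symmetric int8 post-training quantization (illustrative only, not anyone's actual serving setup): squeezing float32 weights into int8 roughly quarters memory and bandwidth, at the cost of small per-weight errors that can add up across billions of parameters.

    import numpy as np

    # Toy illustration of post-training quantization: squeezing float32
    # weights into int8 cuts memory/bandwidth ~4x but loses precision.
    rng = np.random.default_rng(0)
    w = rng.normal(0, 0.02, size=10_000).astype(np.float32)

    scale = np.abs(w).max() / 127           # symmetric int8 scale
    w_q = np.round(w / scale).astype(np.int8)
    w_deq = w_q.astype(np.float32) * scale  # what the server computes with

    print("max abs error:", np.abs(w - w_deq).max())  # small but nonzero drift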

I noticed a huge improvement from Sonnet 4.5 to Opus 4.5 when it became unthrottled a couple of weeks ago. I wasn't going to sign back up with Anthropic, but I did. Two weeks in, though, it's already starting to seem inconsistent. And when I go back to Sonnet, it feels like they did something to lobotomize it.

Meanwhile, I can fire up DeepSeek 3.2 or GLM 4.6 for a fraction of the cost and get results that are almost as good.


Maybe they are just more consistent, which is a bit hard to notice immediately.

I noticed a marked improvement, to the point where I made it my go-to model for questions. Coding-wise, not so much. As an intelligent model for writing up designs, investigations, and general exploration/research tasks, it's top-notch.

Yes, 2.5 just couldn't use tools right. 3.0 is way better at coding; better than Sonnet 4.5.

Gemini 3 was a massive improvement over 2.5, yes.

Seems to be getting more aerodynamic. A clear sign of AI intelligence

No one should expect LLMs to give correct answers 100% of the time. It's inherent to the tech for them to be confidently wrong.

Code needs to be checked

References need to be checked

Any facts or claims need to be checked


According to the benchmarks here, they're claiming up to 97% accuracy. That ought to be good enough to trust them, right?

Or maybe these benchmarks are all wrong


Something that is 97% accurate is wrong 3% of the time, so pointing out that it has gotten something wrong does not contradict 97% accuracy in the slightest.
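
Worth noting that per-claim accuracy compounds: across many independent claims, the chance of at least one error grows quickly. A quick back-of-the-envelope in Python:

    # Probability of at least one error among n independent claims,
    # each correct with probability 0.97
    for n in (1, 10, 50, 100):
        print(n, round(1 - 0.97 ** n, 3))
    # 1   0.03
    # 10  0.263
    # 50  0.782
    # 100 0.952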

Gemini routinely makes up stuff about BigQuery's workings. "It's poorly documented." Well, read the open-source code and reason it out.

Makes you wonder what 97% is worth. Would we accept a different service with only 97% availability, with all of the downtime falling during the lunch break?


I.e., like most restaurants and food delivery? :) Though a 3% problem rate is optimistic.

Does code work if it's 97% correct?

It's not okay if claims are totally made up 1 in 30 times.

Of course people aren't always correct either, but we're able to operate on levels of confidence. We're also able to weight others' statements as more or less likely to be correct based on what we know about them


> Does code work if it's 97% correct?

Of course it does. The vast majority of software has bugs. Yes, even critical ones like compilers and operating systems.


> Or maybe these benchmarks are all wrong

You must be new to LLM benchmarks.


"confidently" is a feature selected in the system prompt.

As a user you can influence that behavior.
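
As a sketch of what that steering looks like in practice (assuming the OpenAI Python client; the model name and prompt wording are mine, and how well this actually calibrates confidence is debatable):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    resp = client.chat.completions.create(
        model="gpt-4o",  # hypothetical model choice
        messages=[
            # System prompt nudging the model away from confident guesses
            {"role": "system", "content": (
                "When you are unsure of a fact, say so explicitly and "
                "give a rough confidence estimate instead of guessing."
            )},
            {"role": "user", "content": "How does BigQuery handle slot contention?"},
        ],
    )
    print(resp.choices[0].message.content)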


No, it isn't. It isn't intelligent; it's a statistical engine. Telling it to be confident or less confident doesn't make it apply confidence appropriately. It's all a facade.

It also crashed my browser tab
