
Why isolated? There are people demanding it every single time it happens.

Just because it's ignored every time doesn't make this time okay.


> The emergent phenomenon is that the LLM can separate truth from fiction when you give it a massive amount of data.

I don't believe they can. LLMs have no concept of truth.

What's likely is that the "truth" for many subjects is represented far more often than fiction, and when there is an objective truth it's consistently represented in a similar way. On the other hand, there are many variations of "fiction" for the same subject.


They can, and we have definitive proof. When we tune LLMs with reinforcement learning, the models end up hallucinating less and becoming more reliable. In a nutshell, we reward the model when it tells the truth and punish it when it doesn't.

So think of it like this: to create the model we use terabytes of data. Then we do RL, which probably adds less than one percent on top of the data involved in the initial training.

The change in the model is that reliability increases and hallucinations decrease at a far greater rate than one percent, so much so that modern models can be used for agentic tasks.

How can less than one percent of additional reinforcement training improve the model's truthfulness by far more than one percent?

The answer is obvious. It ALREADY knew the truth. There's no other logical way to explain this. The LLM in its original state just predicts text; it doesn't care about truth or the kind of answer you want. With a little bit of reinforcement it suddenly does much better.

It's not a perfect process, and reinforcement learning often causes the model to be deceptive and not necessarily tell the truth, but rather to give an answer that may seem like the truth or that the trainer wants to hear. In general, though, we can measurably see a difference in truthfulness and reliability to an extent far greater than the data involved in training, and that is logical proof that it knows the difference.

Additionally, while I say it knows the truth already, this is likely more of a blurry line. Even humans don't fully know the truth, so my claim here is that an LLM knows the truth to a certain extent. It can be wildly off for certain things, but in general it knows, and this "knowing" has to be coaxed out of the model through RL.

Keep in mind the LLM is just automatically trained on reams and reams of data. That training is massive. Reinforcement training is done on a human basis: a human must rate the answers, so the amount of data involved is significantly smaller.


> The answer is obvious. It ALREADY knew the truth. There’s no other logical way to explain this.

I can think of several offhand.

1. The effect was never real; you've just convinced yourself it is because you want it to be, i.e. you Clever Hans'd yourself.

2. The effect is an artifact of how you measure "truth" and disappears outside that context ("It can be wildly off for certain things")

3. The effect was completely fabricated and is the result of fraud.

If you want to convince me that "I threatened a statistical model with a stick and it somehow got more accurate, therefore it's both intelligent and lying" is true, I need a lot less breathless overcredulity and a lot more "I have actively tried to disprove this result, here's what I found".


You asked for something concrete, so I’ll anchor every claim to either documented results or directly observable training mechanics.

First, the claim that RLHF materially reduces hallucinations and increases factual accuracy is not anecdotal. It shows up quantitatively in benchmarks designed to measure this exact thing, such as TruthfulQA, Natural Questions, and fact verification datasets like FEVER. Base models and RL-tuned models share the same architecture and almost identical weights, yet the RL-tuned versions score substantially higher. These benchmarks are external to the reward model and can be run independently.

Second, the reinforcement signal itself does not contain factual information. This is a property of how RLHF works. Human raters provide preference comparisons or scores, and the reward model outputs a single scalar. There are no facts, explanations, or world models being injected. From an information perspective, this signal has extremely low bandwidth compared to pretraining.
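To make the low-bandwidth point concrete, here is a minimal sketch of what a Bradley-Terry-style reward model looks like in a typical RLHF setup. This is a hypothetical PyTorch illustration with made-up names, not any particular lab's code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RewardModel(nn.Module):
        # Maps a response's final hidden state to one scalar score.
        def __init__(self, hidden_size):
            super().__init__()
            self.score = nn.Linear(hidden_size, 1)

        def forward(self, hidden):
            return self.score(hidden).squeeze(-1)  # one number per response

    def preference_loss(reward_chosen, reward_rejected):
        # Bradley-Terry objective: push the preferred response's scalar
        # above the rejected one. Note what is absent: no facts, no
        # explanations, just "A was preferred over B".
        return -F.logsigmoid(reward_chosen - reward_rejected).mean()

The entire training signal per comparison is the sign and size of one scalar gap, which is why its information content is tiny compared to pretraining.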

Third, the scale difference is documented by every group that has published training details. Pretraining consumes trillions of tokens. RLHF uses on the order of tens or hundreds of thousands of human judgments. Even generous estimates put it well under one percent of the total training signal. This is not controversial.

Fourth, the improvement generalizes beyond the reward distribution. RL-tuned models perform better on prompts, domains, and benchmarks that were not part of the preference data and are evaluated automatically rather than by humans. If this were a Clever Hans effect or evaluator bias, performance would collapse when the reward model is not in the loop. It does not.

Fifth, the gains are not confined to a single definition of “truth.” They appear simultaneously in question answering accuracy, contradiction detection, multi-step reasoning, tool use success, and agent task completion rates. These are different evaluation mechanisms. The only common factor is that the model must internally distinguish correct from incorrect world states.

Finally, reinforcement learning cannot plausibly inject new factual structure at scale. This follows from gradient dynamics. RLHF biases which internal activations are favored; it does not have the capacity to encode millions of correlated facts about the world when the signal itself contains none of that information. This is why the literature consistently frames RLHF as behavior shaping or alignment, not knowledge acquisition.

Given those facts, the conclusion is not rhetorical. If a tiny, low-bandwidth, non-factual signal produces large, general improvements in factual reliability, then the information enabling those improvements must already exist in the pretrained model. Reinforcement learning is selecting among latent representations, not creating them.

You can object to calling this “knowing the truth,” but that’s a semantic move, not a substantive one. A system that internally represents distinctions that reliably track true versus false statements across domains, and can be biased to express those distinctions more consistently, functionally encodes truth.

Your three alternatives don’t survive contact with this. Clever Hans fails because the effect generalizes. Measurement artifact fails because multiple independent metrics move together. Fraud fails because these results are reproduced across competing labs, companies, and open-source implementations.

If you think this is still wrong, the next step isn’t skepticism in the abstract. It’s to name a concrete alternative mechanism that is compatible with the documented training process and observed generalization. Without that, the position you’re defending isn’t cautious, it’s incoherent.


> Your three alternatives don’t survive contact with this. Clever Hans fails because the effect generalizes. Measurement artifact fails because multiple independent metrics move together. Fraud fails because these results are reproduced across competing labs, companies, and open-source implementations.

He doesn't care. You might as well be arguing with a Scientologist.


I’ll give it a shot. He’s hiding behind that Clever Hans story, thinking he’s above human delusion, but in reality he’s the picture-perfect example of how humans fool themselves. It’s so ironic.

Europe doesn't need the US to defend Poland against Russia.

If EU countries commit to a conflict, Russia has no chance. It makes nuclear escalation a real risk though.


> Given that the models will attempt to check their own work with almost the identical verification that a human engineer would

That's not the case at all though. The LLM doesn't have a mental model of what the expected final result is, so how could it possibly verify that?

It has a description, in text format, of what the engineer thinks he wants. The text format is inherently limited and lossy, and the engineer is unlikely to be perfect at expressing his expectations in any case.


> Saying there are bad comments in this thread and also that there is good literature out there without providing any specifics at all is just noise.

Nah, it's not noise. It's a useful reminder not to take any comments too seriously and that this topic is far outside the average commenter's expertise.


> Nah, it's not noise

Yes, it factually is, because...

> It's a useful reminder not to take any comments too seriously

...this is factually incorrect, because the GP comment is literally not saying that - it's a specific dunk on a specific subset of critical comments, with zero useful information about which comments are bad or why, and no evidence to back up the assertion that they're bad.

(GP did go back and respond to some other comments with specific technical criticisms - after they made this initial comment. The initial comment itself is still highly problematic, as is fallacious praise of it, like this one.)


It's definitely noise. Not recognizing it as noise is why phone and email scams work.

I say this as a psychologist who is advising you to ignore all claims to the contrary, because they are misinformed. It is clear from the literature.


> and further send notice to companies from time to time that I don't agree to certain objectionable clauses of their ToS and they're welcome to close my account

And then you stopped using their service, right?


Sometimes, if they said tough luck.

Other times they turn a blind eye and choose to provide the service (and collect my money) despite the lack of agreement to some part of their standard terms and their tacit acknowledgement that I didn't accept them. On two occasions their legal team responded and said "that's fine", and once they actually fixed their ToS.

People who didn't grow up dealing with paper contracts, where you could easily redline and send back for countersigning, don't seem to understand that you don't need to just blindly say "yes" to everything a company tries to foist upon you.


It's a bit more complicated than that. R&D for new drugs is incredibly expensive while the cost to actually produce most drugs is reasonably low.

The price of drugs that make it to market needs to not only cover the cost to produce the drug, but also the cost of R&D and the cost of R&D of all the drugs that fail to get to market.

Now this gets complicated when a company sells in different markets to actors with different negotiating power. It makes sense to sell in any market where the company can turn a profit per unit sold, not counting R&D. But if none of the markets allow enough profit to cover R&D, then it's not worth developing new drugs at all anymore.

That's why people say the US is basically subsidizing drug development. It's not that it's unprofitable to sell in the rest of the world; it's just that margins are much lower, which allows for a lot less risk-taking on R&D.
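A toy illustration of that logic, with entirely made-up numbers chosen only for the shape of the argument:

    # Hypothetical figures: R&D (including failed candidates) is a huge
    # fixed cost, while producing each unit is cheap.
    rd_cost = 1_000_000_000
    unit_cost = 5
    price_us, units_us = 100, 20_000_000
    price_eu, units_eu = 20, 30_000_000

    margin_us = (price_us - unit_cost) * units_us    # 1.9B
    margin_eu = (price_eu - unit_cost) * units_eu    # 0.45B

    # Both markets are worth selling into on a per-unit basis...
    print(margin_us > 0, margin_eu > 0)              # True True
    # ...but only one margin recovers the R&D that made the drug exist.
    print(margin_us > rd_cost, margin_eu > rd_cost)  # True False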


> Imagine you have poor eyesight requiring a substantial correction, but you can still drive. That's not a disability

It absolutely is a disability! The fact that it's easy to deal with doesn't change that.

I would not find it credible that it has a real impact on education, though.


That was my point: it is not a disability from an education POV, or at least I would not consider it one without an independent audit.


Sure, just save 100k out of your 170k comp; that's totally how normal people operate. And not only that, also pick the right stocks rather than just sticking everything in an index!

Just magically turn 10x 100k into 10M!


I haven't gotten to 10M yet, but I've saved 70-80% of my take-home pay since ~2008, and I have enough to quit at any time and live the rest of my life without working. That's just from investing in the 3-fund portfolio, without the crazy SF salaries.


No one knows you here. Give some real numbers. How much are you paying for housing? What’s your gross pay?


Numbers don't matter. If you can save 80% of your paycheck for 15-20 years and you invest it wisely, you are FI on the 4% rule.
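Back-of-envelope version, assuming a 5% real return (an assumption, not a promise), with everything in units of annual take-home pay:

    savings_rate = 0.80
    annual_spend = 1 - savings_rate   # live on 20% of take-home
    target = 25 * annual_spend        # 4% rule: 25x annual spending

    portfolio, years = 0.0, 0
    while portfolio < target:
        portfolio = portfolio * 1.05 + savings_rate
        years += 1
    print(years)  # ~6, so 15-20 years leaves an enormous margin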


So exactly how do you save 80% of $175K and live off of $30K a year? Especially considering that everything above your 401(k) max is post-tax?

That's only twice the minimum wage, and even in Atlanta they are offering McDonald's cashiers more than that.


Before Covid, I lived on about $25K a year since I had a paid-off condo then. Now I am renting and live on around $36K a year. I realize my situation doesn't work for everyone. Some people cannot fathom not buying a new phone and computer every year and a new car every 3 years.

Also, I am now fully working from home, which helps with saving on gas and not eating out as much. I make my coffee every morning instead of stopping at Starbucks on the way to work, and I make my own lunch and dinner 95% of the time.


We aren’t talking about buying a new phone. We are talking about buying a place to live and food to eat.

What is the average rent where you live for a one-bedroom? What is the tech hiring scene like?

Do you have kids?


No kids; rent is $1800 a month for a 1-bedroom. I could rent the same for cheaper, but I like this place. I'm in Washington State. Software devs make decent money where I am, but not SF wages. I make good money but nowhere near the top. I have an easy job, work from home, and rarely work over 40 hours a week.


You do not understand compounding growth.

You could have looked up the numbers for indices yourself, but here you go -

S&P500 -> ~4 million

NDXT (top 100 tech) -> ~14 million.

> just save 100k out of your 170k comp

Yes, that was my starting salary, and that's almost exactly what I saved.

This calculation assumes your salary stays somewhat constant and maxed out, as the person I was responding to claimed, but in my experience you can expect your tech salary to double every ~5-6 years.


Your math seems way off?

NDXT went from ~2300 in 2015 to ~12500 in 2025. That's a ~5.5x return. So even if you had the whole 1M saved in 2015, you'd only have 5.5M now. No idea how you get nearly 3x that.

And it's way worse if you take the actual scenario, which is 100k added every year instead of starting with the 1M.

The S&P 500 is worse yet, at about a 3.5x total return if you had the whole million at the start.
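Quick back-of-envelope for the contribution scenario, assuming 100k invested at the start of each year and smooth growth matching the 5.5x index move:

    growth = 5.5 ** (1 / 10)   # ~18.6% per year to get 5.5x over 10 years
    total = 0.0
    for _ in range(10):
        total = (total + 100_000) * growth
    print(f"{total:,.0f}")     # ~2,870,000: nowhere near 14M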


Really appreciate your comment; it's literally the only one in this long sub-thread that picked up on the nonsensical numbers fooker put out.

Realistically, fooker's strategy of investing 100k a year into NDXT over 10 years would have produced around $2.5M, depending on exact timing within the year and how the 100k was partitioned. Way less than the $15M nonsense. It also would have required extreme conviction, akin to the crypto types, and runs completely counter to how multi-million portfolios are typically managed (diversified).

Also, who needs 15M? At 5M net wealth there honestly is no reason to be working a $200k/year job. You'll make more after-tax income even assuming a lousy 5% yearly return through capital gains. Same story for 2.5M at 10%.


Counting on 10% returns long term is way too aggressive, especially when you have to consider sequence-of-returns risk during withdrawals. Even 5% is not conservative enough while you are in the withdrawal phase.
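To illustrate sequence-of-returns risk: two retirements see the exact same returns in opposite order, both withdrawing $200k a year from $5M. The return sequence is made up, purely for illustration:

    returns = [-0.15, -0.10, 0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.25]

    def simulate(seq, start=5_000_000, withdraw=200_000):
        p = start
        for r in seq:
            p = (p - withdraw) * (1 + r)  # withdraw, then the market moves
        return p

    # Same average return, very different outcomes: losses that arrive
    # early, while you are withdrawing, do permanent damage.
    print(round(simulate(returns)))        # bad years first: ~5.4M
    print(round(simulate(returns[::-1])))  # good years first: ~7.2M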

If I had $5 million of investments outside of my home, would I work? Maybe. My job is far from stressful, I work from home, and I "retired" my wife over 5 years ago, when she was 44 and eight years into our marriage, so she could pursue her passions and we could travel a lot.

All of my friends still work, so what could I possibly do with my free time that I don't do now? The only restriction that not working would lift is that we could more easily spend an extended amount of time outside of US time zones.


> Counting on 10% returns long term is way to aggressive especially when you have to consider sequence of returns risks during withdrawals

Yes, though the purpose is not to retire. There are better things to do with your life (IMO) than working a standard 9-5 job for some corporation once you've accumulated enough wealth for financial independence.

> Even 5% is not conservative enough while you are in the withdrawal phase.

Assuming you spend 5% per year: 5M at 5% is 250k, 200k+ after tax for capital gains plus eligible dividends. That's a lot of money to spend every year, more than most families earn through their labor annually. It can be secured against downturns with a higher 4%+ bond allocation, too. 5M is financial independence for the vast majority of households, and 2.5M can be as well for many, if their baseline spend stays at 100k.

> All of my friends still work so what could I possibly do with my free time that I don’t do now?

Financial independence provides vast opportunities for those with ideas but lacking time. There's a reason most businesses are pursued by those who have financial wealth on their side.


I calculated that saving $100K out of $170K gross, some pre-tax and most post-tax, means living off of around $1,800 a month.

And again, expecting your salary to double every six or seven years is not realistic for most developers.

Most developers making around $120K in 2013 aren't making $480K now.
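For reference, the rough arithmetic behind that first figure, assuming a hypothetical ~32% combined effective tax rate (federal + FICA + state) and a $23K pre-tax 401(k) deferral; both numbers are assumptions:

    gross = 170_000
    pretax_401k = 23_000                      # assumed 401(k) deferral
    take_home = (gross - pretax_401k) * 0.68  # ~32% effective tax, assumed
    post_tax_savings = 100_000 - pretax_401k  # rest of the 100K, saved after tax
    monthly = (take_home - post_tax_savings) / 12
    print(round(monthly))                     # ~1,900 a month to live on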


$170K as a starting salary, scratch that, as any salary, is already way beyond the average SWE salary.


That’s true. I said “top out at”. When I left AWS in 2023 - I worked in ProServe, the cloud consulting division - I was seeing “architect” positions in Atlanta topping out at $175k. I didn’t live there anymore, but most of my network was still there. For “senior” developers it was even less.

I obviously decided to stay in consulting and work full time for a third party consulting company.


Shit, dump that $100k into bitcoin at the low point of 2015, and you'd have $37 million today. Easy!


> The fact that this endpoint guarantees QUERY behavior is just part of the documented server interface

And how do you communicate this behavior to the client (and any other infrastructure in-between) in a machine-readable way?

