Hacker News | dcx's comments

Hello! I'm one of the main authors of the paper. Thanks for engaging with our work so thoughtfully – that's a very clear and valid question.

We didn't get around to addressing this within the paper itself – 80 pages is a lot, and deadlines, etc. But I have unpublished experiments showing that, in a reasonably broad setting I'm working in, verbalized probabilities restore a distribution that looks almost identical to the base distribution. It is not possible to demonstrate this on frontier models, since the publicly available versions are already mode-collapsed, and the labs don't share the base models or logprobs anyway. But I've established this to my personal satisfaction on large local models which offer base / post-trained pairs.

To share some intuition on why one might believe this is occurring: there are a bunch of tasks implicit in the pre-training corpus that encourage the model to learn this capability. Consider sentences in news and research articles like: "Scientists discover that [doing something] increases [some outcome] on [some population] by X%". It seems quite natural that the model might learn a pathway by which it can translate its base probabilities into the equivalent numeric tokens in order to "beat" the task of reducing loss on the "X%" prediction. I can even almost visualize how this works mechanically in terms of what the upper layers of an MLP would do to learn this, i.e. translating from weights into specific token slots. And this is almost certainly more parameter-efficient than constructing an entire separate emulated reality for filling in X. Although I'm not ruling out that the latter might still be happening – perhaps some future interp research might be able to validate this!

I'm actually working on a paper that covers some of the above findings in passing. But if helpful in the meantime, this also builds on related work by Tian et al. 2023, "Just Ask for Calibration" [1] and Meister et al. 2024, "Benchmarking Distributional Alignment of LLMs" [2], which gives some extra confidence here. Their findings indicate that, whether or not verbalized probabilities are rooted in the model's base probabilities, they seem to be useful for the purposes that people care about. (Oh, and you can probably set up an experiment to verify this independently with vLLM in a few Claude Code requests!)
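
For anyone who wants to try that comparison, here is a minimal sketch of the scoring step only. It assumes you have already collected two distributions over the same set of outcomes: one from a base model's token probabilities and one from a post-trained model's verbalized probabilities. The function names and the example numbers are made up for illustration; they are not taken from the paper or from any real model run.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {outcome: w / total for outcome, w in weights.items()}


def total_variation(p, q):
    """Total variation distance: 0 means identical, 1 means disjoint."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against zero mass in q."""
    return sum(
        pv * math.log(pv / max(q.get(o, 0.0), eps))
        for o, pv in p.items()
        if pv > 0
    )


# Hypothetical numbers, for illustration only.
base = normalize({"heads": 0.5, "tails": 0.5})          # base-model probabilities
verbalized = normalize({"heads": 55, "tails": 45})      # "55% heads" stated by the model
sampled = normalize({"heads": 95, "tails": 5})          # mode-collapsed sample counts

print("TV(base, verbalized):", total_variation(base, verbalized))
print("TV(base, sampled):   ", total_variation(base, sampled))
print("KL(base || verbalized):", kl_divergence(base, verbalized))
```

If verbalized probabilities really do track the base distribution, the first distance should come out much smaller than the second on real data; the toy numbers above are only chosen to show that pattern.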

Hope that was helpful – feel free to ping with follow-ups! (Although replies might be a little delayed, I happened to see this at a good time; having quite a crunchy week)

[1] https://arxiv.org/abs/2305.14975

[2] https://arxiv.org/abs/2411.05403


This is very funny to me. It took about a decade for them to receive a scientific name – because people were too busy eating them the whole time! The "note on the Bathynomus fishery" really makes the circumstances of this "discovery" quite clear.

Sadly, within the taxonomy itself the authors restrain themselves from sharing their findings on the most delicious parts and preparations of the animal. Darwin would have been disappointed [1], but at least as a species we've gotten our time down from 300 years [2].

1. https://www.npr.org/sections/thesalt/2015/08/12/430075644/di...

2. https://www.youtube.com/watch?v=zPggB4MfPnk


Formal species are focused in the regions where formal biologists work. Outside of those regions, people don't care.

This paper appears to define the new species by physical characteristics, although it also remarks that differential characteristics within the new species "are regarded as intraspecific variation for the time being".

This isn't a question that people approach with any kind of rigor.

Compare the recent paper ( https://www.cell.com/current-biology/fulltext/S0960-9822%282... ) finding that, despite obvious phenotypic differences, snail darters are "not a species but a subpopulation of the Stargazing Darter" because the authors couldn't find an obvious genetic difference.† Whatever your view of speciation is, it can't accommodate both papers.

† The claim that they must not be a "species" loosely implies that a genetic difference doesn't exist, or that what difference does exist "doesn't count". That second option is not a scientific claim. The first option is possible - it might be that, if you dropped eggs from other stargazing darters into the waters inhabited by snail darters, those eggs would develop into snail darters, with the phenotype being driven by environmental input. But I don't think that's especially likely. The phenotypic difference is more likely evidence that there is a genetic difference, and we just can't see it.


I'm quite curious, what are people doing with AI-powered browser automation at the moment?

This is such a new capability that I'm having a hard time getting a sense of interesting use cases. I'm quite sure this is more than just a shim for web services which don't expose APIs. But I also wonder whether LLMs are good enough to be trusted with more open-ended tasks at this stage.


Pretty wide variety of use cases tbh - everything from classic scraping on the very simple side to integrating with 3P services that don’t offer APIs, agentic QA testing etc

The companies we’ve seen automate agentic workflows well typically send a bunch of context to the LLM and somewhat constrain the actions that the model can take. It actually works better than you’d expect :)


If the claim that the blogosphere is dying is true, does that imply the public intellectual commons is dying too? I suspect that while the cosyweb is more pleasant for most, this retreat might hinder vital testing and cross-pollination of ideas, and make it much harder for people to polarize into being intellectually active. For example, I've never been an active participant on ribbonfarm, but Rao's writing has made me a little smarter and inclined towards certain vectors of thought. And you can see ripples of his work in later writing by others.

What a shame it would be for this culture to be lost; while there's a lot of dross in the blogosphere, I don't know if the brightest jewels will still be possible in a future system of local, private, transient clusters of thought.


I would say it's dead. Killed by a change in cultural attitude to one that sees an opposing idea as a declaration of war. Retreat into private walled gardens seems like the only option.


Speaking as someone who went through this, running experiments with your diet is absolutely worth trying. It worked for me. There is actually a specific medical practice for this: look up "FODMAP". The idea is to temporarily cut out all likely suspects for a short period, see if that fixes things, and then gradually reintroduce them to identify the culprit. A gastroenterologist recommended this to me. It didn't help with my issues at the time as gluten is not covered by this cluster, but struck me as a very sensible approach.

In my experience the medical system is unusually useless and dismissive with digestive issues. I think this is probably related to how little it can do in this area. 10-15% of the US has IBS, and this is a disease of exclusion. That literally means that the medical system acknowledges a cluster of symptoms, but has no idea what is causing them or how to cure them. I can imagine that blaming patients is easier than the alternatives for some doctors.


Sorry, I didn't want to turn my original post into an essay, but I've already done low FODMAP and various other restricted diets for diagnostic purposes, without any noticeable shift in symptoms. Only bread and sugar seem to be correlated, and not strongly. To me it's a curious symptom rather than the root of the issue.


Have you looked into Mutaflor probiotics? It’s a beneficial E. coli (seriously) that has research showing it can colonize the gut and clinically improve e.g. IBS symptoms. I’ve personally seen it solve (or significantly mitigate) issues for multiple people.


Thanks! I'll take a look at this.


Have you had the celiac blood test?


I wish the medical world would not consider a "syndrome" as a diagnosis. No, it's a symptom! Maybe that's all the information you have, but it's not an answer.

As for FODMAP--I've gone that route and gotten a few surprises. The origin of something can matter. The storage can matter.


Oh wow, I have the first half of this situation. I went through a period where my digestion was so bad that it was affecting my ability to function from day to day. I didn't get anything useful from my gastro; I even had a negative celiac antibody test. Eventually I started rigorously tracking everything I ate against my symptoms, and after a few months I was able to draw a strong correlation with gluten intake. From memory it was in the 0.7 range. The day I cut out gluten, a set of awful digestive symptoms completely left my life. They return any time I am glutened.
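
The tracking approach described above comes down to a correlation between a daily intake series and a daily symptom series. A minimal sketch, assuming one entry per day; the diary data below is invented for illustration and is not the commenter's actual log.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical diary: servings of gluten per day, symptom severity from 0 to 10.
gluten_servings = [0, 2, 1, 0, 3, 0, 2, 1, 0, 3]
symptom_score = [1, 6, 4, 2, 7, 3, 4, 5, 1, 8]

r = pearson(gluten_servings, symptom_score)
print(f"r = {r:.2f}")
```

Symptoms may lag intake by a day or more, so it can be worth shifting one series by a day or two before comparing. A correlation on its own also does not rule out other foods that tend to be eaten alongside the suspect.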

I was fortunate that over time I managed to return myself to full capacity, through reading a ton of research and running dozens of experiments like the above. But it was so damn hard. The symptoms reduced my ability to use my brain to fix myself. And if you're not a careful eater, it's not at all intuitive which foods contain gluten. This was also almost a decade ago while living in a developing country, so it wasn't even apparent that gluten might be a suspect.

I'm currently based in the US - does anyone know how one might get properly tested for chronic giardiasis, as a person who isn't themselves in microbiology? I almost certainly encountered poorly treated water in that period of my life.

Also - I can't help but suspect that a nontrivial percentage of the developing world is living below their full capacity due to something like this. Neglected tropical diseases are a horrendous category.


> I even had a negative celiac antibody test.

Note you can also check your genetic config.log to see if there was a -DALLOW_CELIAC flag in your source build.

Unfortunately your body's settings dialog is shit and does not show you whether or not that feature is set to on or off. But if you were built without that flag then you lack the code for the Celiac algorithm altogether and are good to go. (There may be other sensitivities to gluten, but at least nothing that corrupts your nutrient slurping event loop.)


I just hope that when the time comes for my next version, I’m recompiled with the -ALLOW_GLUTEN_INTAKE flag


Veering offtopic, your phrasing reminds me of the short story that made me discover Cory Doctorow:

https://www.salon.com/2002/08/28/0wnz0red/

(shit, >20 years ago!)


I'd say he wrote that just after reading Snow Crash in about 1998


For testing: as the article says, find a doctor who has experience with it and ask for an antigen test. The below-capacity thing can be very real: supposedly a different parasite in the US is responsible for the stereotype of people in the southern US being lazy. In that case it could infect you through your feet from tainted soil. https://www.pbs.org/wgbh/nova/article/how-a-worm-gave-the-so...


Thanks, I'm certainly going to try that. I was more asking if anyone has experience getting tests done properly in light of their low accuracy. From what I understand, an antigen test is still a stool test, meaning they are only 50% accurate. As a commenter on this post shared, managing the health system is challenging in this area. I just did a bit of googling, and found a couple of leads here:

> CDC recommends collecting three stool samples from patients over several days for accurate test results. Commercial testing products for diagnosing giardiasis are available in the United States. [1]

Perhaps running three tests is the standard of care, or if not one might advocate for this based on the CDC recommendation. And if dismissed, perhaps there are commercial products available at the consumer level.
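
A rough way to see why repeated sampling helps: if a single test catches a true infection with probability s, and the samples are independent (a strong assumption, since shedding can be correlated from day to day), the chance that at least one of n tests comes back positive is 1 - (1 - s)^n. The sketch below treats the "50% accurate" figure above as a per-test sensitivity of 0.5, which is an assumption made for illustration, not a CDC number.

```python
def prob_at_least_one_positive(sensitivity, n_tests):
    """Chance that at least one of n independent tests detects a true infection."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - sensitivity) ** n_tests


# Assumed per-test sensitivity of 0.5, for illustration only.
for n in (1, 2, 3):
    print(n, prob_at_least_one_positive(0.5, n))
```

Under those assumptions, three samples lift the detection chance from 0.5 to 0.875.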

[1] https://www.cdc.gov/giardia/hcp/diagnosis-testing/index.html


Where I live, microbiologists make the diagnosis by examining stool through a microscope. Nowadays, though, doctors are lazy and just prescribe antiparasitics without a diagnosis.

I was taught to suspect worms only in children and immunocompromised adults. And I never found the exception.


Odd that you never found what you weren't looking for ...


Depends on where you live. Parasites are utterly endemic in areas as close as a 30-minute drive away from me. They are commonly found in patients of all ages, including otherwise normally functioning adults.

Depending on the epidemiology, testing a population is a waste of time and money. They have a very high chance of having the disease and a very high chance of reinfection even after treatment. So what happens is those patients come in every once in a while and they straight up ask for their periodic albendazole dose. And then they go back to their homes and they drink the exact same water and eat the exact same food.


> From what I understand, an antigen test is still a stool test, meaning they are only 50% accurate.

“Accuracy” is too vague. You want to find out what the sensitivity and specificity are.

https://ebn.bmj.com/content/23/1/2

For instance, a rapid covid test might have low sensitivity but high specificity. Meaning if it’s negative, you could still have the disease. But if it’s positive, you’re almost certainly sick. I.e., the false negative rate is a lot higher than the false positive rate.
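
To make the two terms concrete, here is a small sketch that computes them from a table of test outcomes. The class name and the counts are hypothetical, chosen only to mimic a test with low sensitivity and high specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticCounts:
    """Outcome counts from comparing a test against the true disease status."""
    true_pos: int   # sick, test positive
    false_neg: int  # sick, test negative
    true_neg: int   # healthy, test negative
    false_pos: int  # healthy, test positive

    def sensitivity(self):
        """Share of sick people the test catches."""
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self):
        """Share of healthy people the test correctly clears."""
        return self.true_neg / (self.true_neg + self.false_pos)

    def positive_predictive_value(self):
        """Chance that a positive result means the person is sick."""
        return self.true_pos / (self.true_pos + self.false_pos)

    def negative_predictive_value(self):
        """Chance that a negative result means the person is healthy."""
        return self.true_neg / (self.true_neg + self.false_neg)


# Hypothetical study: 100 sick and 1000 healthy people.
counts = DiagnosticCounts(true_pos=60, false_neg=40, true_neg=990, false_pos=10)

print("sensitivity:", counts.sensitivity())
print("specificity:", counts.specificity())
print("PPV:", counts.positive_predictive_value())
print("NPV:", counts.negative_predictive_value())
```

Sensitivity and specificity are properties of the test, while the two predictive values also depend on how common the disease is in the group being tested.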


Technically a "rapid covid test" only detects the presence of certain viral genetic material. This usually means the patient is or recently was infected with SARS-CoV-2 (the virus) but it doesn't indicate anything about whether the patient has COVID-19 (the disease). Many infections are asymptomatic and thus not medically classified as a disease state.

This distinction might seem pedantic but it's important to be precise when discussing medical issues.


If you want to be precise… There are different types of “rapid COVID test”, the most popular of which detect antigens, not ‘viral genetic material’. PCR tests detect genetic material. Both tests seem to have differing levels of sensitivity to each variant of the virus.


Stool tests are questionable to begin with.


> And if dismissed, perhaps there are commercial products available at the consumer level.

You can walk into Tractor Supply with a $20 bill and walk out with a horse-sized tube of fenbendazole paste and a few bucks in change.


It's true though, southerners aren't as productive as yanks.


I have a positive antigen test for celiac disease (I had 2 elevated antigens actually, both associated with celiac disease). The gastroenterologist told me I have celiac disease. Yet I've never experienced symptoms.

I stopped eating gluten and the associated antigens went down to normal levels. I don't feel any better or worse though.

The literature says there are false positives, and I've always wondered if I might be one of them. I've searched celiac forums and I've never encountered anyone with a false-positive diagnosis. Lots of false negatives or non-celiac gluten sensitivities though.

I do have the gene required for celiac disease, but most who have this gene do not have celiac disease.


... Take it from someone with celiac. You don't always feel it. I didn't for decades.

But it caught up with me. Really badly. And when I say badly, it nearly killed me. I got hypocalcemia, and fuck me if it wasn't the most painful thing I ever endured. I've broken bones, etc. All that shit is child's play compared to every muscle in your body locking up and your body feeling like it is getting stabbed with needles all over. Thankfully once they give you calcium, it goes down. But... I was probably an hour or two away from dead, from asphyxiation.

I played with fire again, and caught it quicker the next time. But I had no concrete diagnosis. Now, I do.

Oh, as a bonus, I got the bones of an 85 year old woman with osteopenia.

Don't fuck with this shit internet stranger. Please.


Wow, I'm the same. Celiac with very few symptoms.

Hearing this makes me want to keep on track. But I would like regular blood tests to find out if my nutrition absorption is improving. That would at least motivate me to keep eating this restricted diet.

Good to hear your story as a warning.


Then push for them.

For me the motivation is maybe not having to worry about breaking a hip if I fall.


Gluten can also trigger certain thyroid conditions, which can wreak havoc everywhere in the body, including the intestines.

IIRC, the problem is that gluten is similar to thyroid tissue, and for some people, the immune system will then attack the thyroid as well as causing trouble in the intestines where the gluten was found.


I can't speak for the US but a friend in the UK was recently diagnosed with giardiasis, not chronic though. As far as I can tell it was simply a routine check because of the symptoms. He didn't give me the impression that it was a difficult diagnosis to obtain. The medics reckoned that it came from a bag of contaminated salad leaves.

Surely you just go to your GP, explain your concerns and symptoms and get tested. Here is the UK NHS page on the subject: https://www.nhs.uk/conditions/giardiasis/


Here is a thought. If you're having difficulty finding proper testing, just have a doctor prescribe the treatment for giardiasis. After finishing it, you can test whether things are now better for you.


In your experience, are doctors generally willing to do this?


If you're in the US, I'm sure you can convince a doctor to help you. Since the treatment is not life threatening or a narcotic and just a course of antibiotics, you should have no problem.


>> can't help but suspect that a nontrivial percentage of the developing world is living below their full capacity due to related disease burdens, e.g. from this or other neglected tropical diseases.

WHO literally estimated 1 billion living people were infected with hookworm at some point throughout childhood. Once that happens in areas with poor food security to begin with, your brain is likely fucked for life from stunted development due to childhood malnutrition.

Diminished capacity due to disease burden is definitely high.


I'd love to hear more about your experience with this platform. Is there any public information about this platform or research? How did you guys manage the bad behaviors that come with anonymity, like trolling?

I'm currently feeling out a little research project around the use of AI in improving group decision-making, specifically with the hope of improving our political systems. There are so few IRL case studies, it'd be great to have more intuition for this!

(I'm @dch on twitter if you'd prefer to DM)


That's very interesting - the Chinese "oi" has pretty much exactly the same usages, down to the little details! (I'm most familiar with the Cantonese 喂 wai2, though it's not my first language)

I wonder if that suggests this is one of those universal pre-linguistic words, like "huh" [1]. That feels especially believable given the utter simplicity of the mouth sound. It's almost just a yell, in the same way that "huh" is almost just an outbreath.

[1] https://www.smithsonianmag.com/science-nature/everybody-almo...


This is so insightful! What did they believe the main process was, that was causing things to break?

My guesses - this is about (a) inequality reaching a breaking point, where non-elites are starting to reject the social contract, making elite signaling dangerous, and (b) that elites at that moment don't have enough justification for their elevated status (mystification, misrecognition, etc)


Social contract. Every year I pay its passive. Can anyone tell me when the profits will be distributed?


The profits were initially distributed before you were born. Every year more profits are added to the value of society (I assume you are a part of society; if not, I do apologize) because potholes are filled, food is verified safe to eat, electricity is regulated, and generally people leave you and your stuff alone so you can do whatever you want within the bounds of the social contract.


FYI OP is answering my question on what the historian's theory was! (They were downvoted)

This route makes sense to me; I do think the deal has been getting worse and worse lately, and periodic renegotiation is required to keep things running properly.


Well, it can't be (a). The inequality in Rome had already reached a breaking point and was a completely resolved problem by then.

And by that I mean a revolutionary people's party had replaced a corrupt elected government with a divine monarchy.

(I don't think I need to describe the wonders that did for equality in the Roman state. The archaeological record indicates that when barbarians conquered Roman lands, the nutrition of the lower classes often improved. But for some reason those same lower classes that had destroyed the republic never complained about inequality in imperial times.)


I think OP is right but hasn't fully unrolled what they mean. This seems like it creates a moral hazard [1], because it reduces the penalty to airlines for losing luggage.

If airlines were prepared to accept X in costs from lost luggage before, and now get paid Y, they can now accept about X+Y in lost luggage costs. Meaning more lost luggage! After the system dynamics resettle, from a certain angle this is basically stealing: customers pay the same ticket price, but are paying some percentage more in expected lost luggage.

And on top of that, this is just considering the financial cost - what is actually happening is that more travelers are being deprived of their personal belongings. These are worth more to people than insurance payouts! It's surprising, but when you zoom out this seems downright awful.

[1]: https://en.wikipedia.org/wiki/Moral_hazard


An airline owes you $2500 for lost luggage on domestic flights. They aren’t going to make that up by selling to places like this. It’s not like they tell the department responsible for getting your luggage back to you to go through the luggage and determine whether they can make a profit on you.

Anecdotally, everything my wife and I own is in four suitcases that we take across the country 7 months a year, making one-way trips.

https://news.ycombinator.com/item?id=36306966

We have to take all of our personal belongings with us when we leave our home so our place can be rented out to cover our mortgage and expenses while we travel over half of the year.

For $2500 we could replace everything that we have in any one piece of checked luggage and have money left over. We keep anything of value in our carry on backpack.

https://www.amazon.com/dp/B0B12SPPXV?


> On flights within the U.S., airlines are responsible for lost-luggage reimbursement up to $2,500 per person; on international flights, airlines owe you a mere $9.07 per pound, with a ceiling of $640. (That rate was set by an international treaty in 1929.)

> Beyond that, airlines owe you nothing for your most valuable items. Most contracts of carriage specifically exempt from compensation things like antiques, art, books, documents, money, cameras, collectibles, electronics, or "fragile or perishable items." [1]

Also this is not a guaranteed payout, you may need to provide receipts for your things [2]. In general I would prefer to keep the contents of my bag vs receiving "up to" $2,500. We're mostly working people here. The time and money cost of travel disruptions, replacing stuff, navigating bureaucracy etc. is not low. And my stuff has emotional value.

I agree that this setup seems reasonable in a perfect world. But knowing how large, complex companies function, I feel it is unwise to create any kind of loop which rewards airlines as a function of luggage lost.

[1] https://www.frommers.com/tips/airfare/the-bottom-line-what-d...

[2] https://www.peopleclerk.com/post/airline-lost-delayed-luggag...


The stuff you take in your suitcase when you travel has “emotional value”? What do you take with you when you travel?

Since our case is quite unusual, if I travel somewhere like a typical person and carry a week’s worth of clothes, that’s 5-7 pairs of pants, 5-7 shirts, some underwear and some socks and maybe a change of shoes.

When I traveled for work, I also had some gym clothes.

That’s around $1500 - $2000 of clothes max. I fly mostly Delta and occasionally AA. Delta is not going to quibble about $2K for lost luggage. They claim around 0.66% of luggage is lost or delayed.

This is how often the average person flies in a year.

https://news.gallup.com/poll/388484/air-travel-remains-down-...

The chance of the average person experiencing luggage being lost or delayed is slim.


I mean… the grocery store can make more money by double scanning an item I purchase, or McDonald’s can skim some money by giving everyone one less pickle. At some point the service quality has to be intrinsically motivated by the company wanting to preserve their brand, right? How much could they make off stolen luggage that they’d legitimately lose more on purpose? Surely the cost of losing luggage, both monetarily and indirectly through brand devaluation, is greater than the amount of money they’d make selling the luggage?


I agree in the abstract, but people and companies operate under bounded rationality. Customers can't easily price the expectation of lost luggage. Companies are made of subunits with independent budgets, each optimizing for incentives with short time horizons.

McDonald's is actually a good example, they did exactly this. IMO one reason their brand has been devalued is death by a thousand cost cuts. Each cut is imperceptible and seems like a win. But over several decades the net effect is a disaster. (IIRC Fast Food Nation documented some ideas about this process)


The obvious solution to the moral hazard is to require that the company not profit from the disposal of the lost luggage. Unfortunately this is difficult to achieve, even imperfectly. “Give the proceeds of the lot sales to charity” sounds like a straightforward solution, but whose charity? Whoever ends up profiting now has an incentive to cause more to be lost; the further they’re disconnected from the airline industry and the specific airline the better. Anonymous donation of proceeds would be best, if it can be managed.


It is not possible. This is the same problem I have with government fines such as traffic tickets. People can't afford to spend time away from work fighting stupid tickets, so they pay to settle when the district attorney says "pay and we will let you go".

What happens when government lowers its taxes and instead gets funded with fees and fines?

I'd say airlines should not be allowed to sell any luggage. It is not theirs to sell. If they are caught selling items that don't belong to them, put the CEO and the entire board in prison for life.

