More

alecbz · 2026-02-03T00:03:50 1770077030

This happened to me once, they just brought out someone (supervisor?) who asked questions about what addresses I've lived at, other similar questions I'd probably only know the answer to.

It does take longer than regular screening (most of the time was just spent waiting for the supervisor -- I'm not sure they were spending time collecting some data first), if that causes you to miss your flight you miss your flight.

It seems plausible to me that $45 could be about a TSA employee's wage times how much longer this takes. In aggregate, this (in theory) lets them hire additional staff to make sure normal screening doesn't take longer due to existing staff being tied up in extra verifications.

alecbz · 2026-02-02T23:56:22 1770076582

It's not that they'd pay individual employees more, it's that they'd hire more workers to account for the fact that their existing workers are tied up doing extra verification.

Though they might not do that either.

ibejoeb · 2026-02-03T00:00:02 1770076802

Even that fails a sanity test. They're not doing anything more than they would have done 25 years ago when the whole damn thing started.

alecbz · 2026-02-03T00:06:38 1770077198

I wasn't flying 25 years ago but I'm not sure what you mean, or how that's relevant actually. The point is just that it takes them more time to do the "extra screening" if you don't have your ID than the standard screening if you did have your ID.

ibejoeb · 2026-02-03T00:21:57 1770078117

Sure. A couple of things to clarify:

1. They're not doing screening. The screening comes later. At this stage, they're attempting to identify someone. That has never been the job. The job is to prevent guns, knives, swollen batteries, or anything else that could be a safety threat during air travel.

2. Regardless, the reality is that they do identify travelers. Even so, the job has not changed. If you don't present sufficient identification, they will identify you through other mechanisms. The only thing the new dictate says is that they don't want this document, they want that document.

FireBeyond · 2026-02-03T04:24:42 1770092682

> That has never been the job. The job is to prevent guns, knives, swollen batteries, or anything else that could be a safety threat during air travel.

A job that by their own internal testing, they do well less than 5% of the time (some of their audits showed that 98% of fake/test guns that were sent through TSA got through checkpoints).

alecbz · 2026-02-02T17:48:33 1770054513

Trying to get LLMs to understand bugs that I myself am stuck on has had an approximately 0% success rate for me.

They're energetic "interns" that can churn out a lot of stuff fast but seem to struggle a lot with critical thinking.

jama211 · 2026-02-03T18:09:59 1770142199

I’m not going to take bitter advice from someone who either hasn’t used them in a long time, or is terribly bad at using them. Especially as it seems like you hate them so much.

I don’t particularly like them or dislike them, they’re just tools. But saying they never work for bug fixing is just ridiculous. Feels more like you just wanted an excuse to get on your soapbox.

alecbz · 2026-02-03T18:44:33 1770144273

It's not that they can't fix bugs at all, but I find that if I've already attempted to debug something and hit a wall, they're rarely able to help further.

chrisjj · 2026-02-02T20:23:09 1770063789

> seem to struggle a lot with critical thinking.

It is an illusion arising from anthropomorphisation. They aren't thinking at all. They are just parotting the output of thinking that has long gone.

alecbz · 2026-02-02T23:47:31 1770076051

This feels too strong IMO.

Just focusing on the outputs we can observe, LLMs clearly seem to be able to "think" correctly on some small problems that feel generalized from examples its been trained on (as opposed to pure regurgitation).

Objecting to this on some kind of philosophical grounds of "being able to generalize from existing patterns isn't the same as thinking" feels like a distinction without a difference. If LLMs were better at solving complex problems I would absolutely describe what they're doing as "thinking". They just aren't, in practice.

chrisjj · 2026-02-03T10:14:28 1770113668

> Just focusing on the outputs we can observe, LLMs clearly seem to be able to "think" correctly on some small problems that feel generalized from examples its been trained on (as opposed to pure regurgitation).

"Seem". "Feel". That's the anthropomorphisation at work again.

These chatbots are called Large Language Models for a reason. Language is mere text, not thought.

If their sellers could get away with calling them Large Thought Models, they would. They can't, because these chatbots do not think.

alecbz · 2026-02-03T15:42:37 1770133357

> "Seem". "Feel". That's the anthropomorphisation at work again.

Those are descriptions of my thoughts. So no, not anthropomorphisation, unless you think I'm a bot.

> These chatbots are called Large Language Models for a reason. Language is mere text, not thought. If their sellers could get away with calling them Large Thought Models, they would. They can't, because these chatbots do not think.

They use the term "thinking" all the time.

----

I'm more than willing to listen to an argument that what LLMs are doing should not be considered thought, but "it doesn't have 'thought' in the name" ain't it.

chrisjj · 2026-02-03T16:39:09 1770136749

> Those are descriptions of my thoughts. So no, not anthropomorphisation

The result of anthromorphisation. When we treat a machine as a machine, we less need to understand it by seems and feel.

> They use the term "thinking" all the time.

I find not. E.g. ChatGPT:

Short answer? Not like you do.

Longer, honest version: I don’t think in the human sense—no consciousness, no inner voice, no feelings, no awareness. I don’t wake up with ideas or sit there wondering about stuff. What I do have is the ability to recognize patterns in language and use them to generate responses that look like thinking.

jama211 · 2026-02-03T18:11:28 1770142288

This thread is the most pathetic internet argument I’ve read in quite a while. Look at yourselves, both of you.

alecbz · 2026-01-27T21:34:17 1769549657

> Even this evidence of woodworking is largely unremarkable .... this find is most notable for its preservation.

This somewhat contradicts the subheading, no?

> The finding, along with the discovery of a 500,000-year-old hammer made of bone, indicates that our human ancestors were making tools even earlier than archaeologists thought.

throwup238 · 2026-01-27T23:58:56 1769558336

That subheading is complete nonsense and I can't think of a single charitable reading of that sentence that in any way makes sense. Archaeologists have known that our ancestors have been making tools for over a million years since the Acheulean industry was conclusively dated in the 1850s. It took half a century for archaeologists to figure that out after William Smith invented stratigraphy. Scientists didn't even know what an isotope was yet.

The original paper's abstract is much more specific (ignore the Significance section, which is more editorializing):

> Here, we present the earliest handheld wooden tools, identified from secure contexts at the site of Marathousa 1, Greece, dated to ca. 430 ka (MIS12). [1]

Which is true. Before this the oldest handheld wooden tool with a secure context [2] was a thrusting spear from Germany dated ~400kYA [3]. The oldest evidence of woodworking is at least 1.5 million years old but we just don't have any surviving wooden tools from that period.

[1] https://www.pnas.org/doi/10.1073/pnas.2515479123

[2] This is a very important term of art in archaeology. It means that the artefact was excavated by a qualified team of archaeologists that painstakingly recorded every little detail of the excavation so that the dating can be validated using several different methods (carbon dating only works up to about 60k years)

[3] https://humanorigins.si.edu/evidence/behavior/getting-food/o...

alecbz · 2026-01-23T17:38:27 1769189907

Even ignoring determinism, with traditional source code you have a durable, human-readable blueprint of what the software is meant to do that other humans can understand and tweak. There's no analogy in the case of "don't read the code" LLM usage. No artifacts exist that humans can read or verify to understand what the software is supposed to be doing.

luckydata · 2026-01-23T19:26:49 1769196409

yeah there is. it's called "documentation" and "requirements". And it's not like you can't go read the code if you want to understand how it works, it's just not necessary to do so while in the process of getting to working software. I truly do not understand why so many people are hung up on this "I need to understand every single line of code in my program" bs I keep reading here, do you also disassemble every library you use and understand it? no, you just use it because it's faster that way.

notpachet · 2026-01-23T20:11:10 1769199070

> do you also disassemble every library you use and understand it?

Sometimes.

alecbz · 2026-01-23T20:24:31 1769199871

> it's called "documentation" and "requirements"

What I mean is an artifact that is the starting point for generating the software. Compiled binaries can be completely thrown away whenever because you know you have a blueprint (the source code) that can reliably reproduce it.

Documentation & requirements _could_ work this way if they served as input to the LLMs that would then go and create the source code from scratch. I don't think many people are using LLMs this way, but I think this is an interesting idea. Maybe soon we'll have a new generation of "LLM-facing programming languages" that are even higher level software blueprints that will be fed to LLMs to generate code.

TDD is also a potential answer here? You can imagine a world where humans just write test suites and LLMs fill out the code to get it to pass. I'm curious if people are using LLMs this way, but from what I can tell a lot of people use them for writing their tests as well.

> And it's not like you can't go read the code if you want to understand how it works

In-theory sure, but this is true of assembly in-theory as well. But the assembly of most modern software is de-facto unreadable, and LLM-generated source code will start going that way too the more people become okay with not reading it. (But again, the difference is that we're not necessarily replacing it with some higher-level blueprint that humans manage, we're just relying on the LLMs to be able to manage it completely)

> I truly do not understand why so many people are hung up on this "I need to understand every single line of code in my program" bs I keep reading here, do you also disassemble every library you use and understand it? no, you just use it because it's faster that way.

I think at the end of the day this is just an empirical question: are LLMs good enough to manage complex software "on their own", without a human necessarily being able to inspect, validate, or help debug it? If the answer is yes, maybe this is fine, but based on my experiences with LLMs so far I am not convinced that this is going to be true any time soon.

alecbz · 2026-01-23T17:34:34 1769189674

Wait, so you're a radiologist and you're using software you vibecoded to generate radiology reports for real patients? Is that, like, allowed?

mbesto · 2026-01-23T18:05:45 1769191545

Not saying it's right, but boy do I have stories about the code used in <insert any medical profession> healthcare applications. Not sure how "vibecoded" programming lines of code is any worse.

dullcrisp · 2026-01-23T20:46:43 1769201203

Because that code is presumably working and the vibe code is probably not?

alecbz · 2026-01-23T21:10:14 1769202614

Honestly even if this wasn't vibe-coded I'm still a bit surprised at individual radiologists being able to bring their own software to work, for things that can have such a high effect on patient outcomes.

mbesto · 2026-01-23T21:55:24 1769205324

do you have evidence that all vibe coded solutions dont work? Because thats what you're implying.

dullcrisp · 2026-01-23T23:29:12 1769210952

If I wanted to prove murder, not negligence.

azan_ · 2026-01-23T23:29:36 1769210976

Of course it’s allowed. It’s just kind of text editor but with support of speech to text and structured reports (e.g. when reporting spine if I say l3 bd it automatically inserts description of bulging disc in the correct place in the report). I then copy paste it to RIS so there’s absolutely nothing wrong or illegal in that.

d1sxeyes · 2026-01-23T17:58:04 1769191084

Depends where in the world they are. Here in Hungary, it’s not uncommon to email your-family-doctor@gmail.com

direwolf20 · 2026-01-23T19:24:41 1769196281

What does that have to do with vibe-coding?

alecbz · 2026-01-23T17:31:58 1769189518

I have some success but by the time I'm done I'm often not sure if I saved any time.

alecbz · 2025-11-18T17:47:11 1763488031

So scary to go from diagnosis to passing in such a short time.

Cherish every sunrise.

alecbz · 2025-11-03T23:10:21 1762211421

> basically everyone knew the internet would be revolutionary long before 1995. Being able to talk to people halfway across the world on a BBS? Sending a message to your family on the other side of the country and them receiving it instantly? Yeah, it was pretty obvious this was transformative.

That sounds pretty similar to long-distance phone calls? (which I'm sure was transformative in its own way, but not on nearly the same scale as the internet)

Do we actually know how transformative the general population of 1995 thought the internet would or wouldn't be?

xwolfi · 2025-11-04T02:37:39 1762223859

In 1995 in France we had the minitel already (like really a lot of people had one) and it was pretty incredible, but we were longing for something prettier, cheaper, snappier and more point to point (like the chat apps or emails).

As soon as the internet arrived, a bit late for us (I'd say 1999 maybe) due to the minitel's "good enough" nature, it just became instantly obvious, everyone wanted it. The general population was raving mad to get an email address, I never heard anyone criticize the internet like I criticize the fake "AI" stuff now.

alecbz · 2025-10-14T21:15:36 1760476536

Regularly trying to use LLMs to debug coding issues has convinced me that we're _nowhere_ close to the kind of AGI some are imagining is right around the corner.

surgical_fire · 2025-10-14T21:56:38 1760478998

At least Mother Brain will praise your prompt to generate yet another image in the style of Studio Ghibli as proof that your mind is a tour de force in creativity, and only a borderline genius would ask for such a thing.

ben_w · 2025-10-14T22:19:55 1760480395

Sure, but also the METR study showed the rate of change is t doubles every 7 months where t ~= «duration of human time needed to complete a task, such that SOTA AI can complete same with 50% success»: https://arxiv.org/pdf/2503.14499

I don't know how long that exponential will continue for, and I have my suspicions that it stops before week-long tasks, but that's the trend-line we're on.

alecbz · 2025-10-15T18:48:54 1760554134

Only skimmed the paper, but I'm not sure how to think about "length of task" as a metric here.

The cases I'm thinking about are things that could be solved in a few minutes by someone who knows what the issue is and how to use the tools involved. I spent around two days trying to debug one recent issue. A coworker who was a bit more familiar with the library involved figured it out in an hour or two. But in parallel with that, we also asked the library's author, who immediately identified the issue.

I'm not sure how to fit a problem like that into this "duration of human time needed to complete a task" framework.

conception · 2025-10-15T21:38:31 1760564311

This is an excellent example of human “context windows” though and it could be the llm could have solved the easy problem with better context engineering. Despite 1M token windows, things still start to get progressively worse after 100k. LLMs would overnight be amazingly better with a reliable 1M window.

alecbz · 2025-10-20T18:16:52 1760984212

What does "better context engineering" mean here? How/why are the existing token windows "unreliable"?

ben_w · 2025-10-15T21:43:28 1760564608

Fair comment.

While I think they're trying to cover that by getting experts to solve problems, it is definitely the case that humans learn much faster than current ML approaches, so "expert in one specific library" != "expert in writing software".

Pulcinella · 2025-10-15T02:39:44 1760495984

But will it actually get better or will it just get faster and more power efficient at failing to pair parentheses/braces/brackets/quotes?

ben_w · 2025-10-15T09:12:47 1760519567

Read the linked METR study please.

Or watch the Computerphile video summary/author interview, if you prefer: https://m.youtube.com/watch?v=evSFeqTZdqs