
But saying it's a confidence trick is saying it's a con: that they're trying to sell someone something that doesn't work. The OP is saying it makes them 10x more productive, so how is that a con?

The marketing says it does more than that. This isn't a problem unique to LLMs either; we have laws about false advertising for a reason, and it goes on all the time. In this case the tech is new, so the lines are blurry. But to the technically inclined, it's very obvious where they are. LLMs are artificial, but they are not literally intelligent. Calling them "AI" is a scam. I hope it's only a matter of time until that definition is clarified and we can stop the bullshit. The longer it goes on, the worse it will be when the bubble bursts.

Not to be overly dramatic, but economic downturns have real physical consequences. People somewhere will literally starve to death. That number of deaths depends on how well the marketers lied: better lies lead to bigger bubbles, which when burst lead to more deaths. These are facts. (Just ask ChatGPT, it will surely agree with me, if it's intelligent. ;p)

How does one go about competing at the IMO without "intelligence", exactly? At a minimum it seems we are forced to admit that the machines are smarter than the test authors.

"LLM" as a marketing term seems rational. "Machine learning" also does. We can describe the technology honestly without using a science fiction lexicon. Just because a calculator can do math faster than Isaac Newton doesn't mean it's intelligent. I wouldn't expect it to invent a new way of doing math like Isaac Newton, at least.

> Just because a calculator can do math faster than Isaac Newton doesn't mean it's intelligent.

Exactly, and that's the whole point. If you lack genuine mathematical reasoning skill, a calculator won't help you at the IMO. You might as well bring a house plant or a teddy bear.

But if you bring a GPT5-class LLM, you can walk away with a gold medal without having any idea what you're doing.

Consequently, analogies involving calculators are not valid. The burden of proof rests firmly on the shoulders of those who claim that an LLM couldn't invent new mathematical techniques in response to a problem that requires it.

In fact, that appears to have just happened (https://news.ycombinator.com/item?id=46664631), where an out-of-distribution proof for an older problem was found. (Meta: also note the vehement arguments in that thread regarding whether or not someone is using an LLM to post comments. That doesn't happen without intelligence, either.)


That doesn't appear to be what happened. But the marketing sure has a lot of people quick to presume so.

I would guess it's only a matter of days before that proof, or one very similar, is found in the training data, if that hasn't happened already, just as it has every other time.

No fundamental change has been made to how LLMs function that would lead us to expect otherwise.

Similar "discoveries" occurred all the time with the dawn of the internet connecting the dots on a lot of existing knowledge. Many people found that someone had already solved many problems they were working on. We used to be able to search the web, if you can believe that.

The LLMs are bringing that back in a different way. It's functional internet search with an uncanny language model, one that sadly obfuscates the underlying data while guessing at summaries of it (which makes it harder to tell which of its findings are valuable and which are not).

It's useful for some things, but that's not remotely what intelligence is. It doesn't literally understand.

> if you bring a GPT5-class LLM, you can walk away with a gold medal without having any idea what you're doing.

I won't be betting my money on your GPT5-class business advice unless you have a really good idea what you're doing.

It requires some (a lot of) intelligence and experience to usefully operate an LLM in virtually every real world scenario. Think about what that implies. (It implies that it's not by itself intelligent.)


You need to read the IMO papers, seriously. Your outlook on what happened there is grossly misinformed. No searching or tool use was involved.

You cannot bluff, trick, or "market" your way through a test like that.


I didn't say anything about cheating. In fact, if it did cheat, that would make for a much stronger argument in your favor.

If scoring highly on an exam implies intelligence, then certainly I'm not intelligent, and the Super Nintendo from the 90s is more sentient than I am, given that I'm terrible at chess.

I personally don't agree with that definition, nor does any dictionary I'm familiar with, nor do any software engineers I know, nor any LLM specialists as far as I'm aware, including the developers at the forefront at OpenAI, xAI, Google, etc.

But for some reason (it's a very obvious reason: $$$), the marketers, over the engineers' protests, appear to be claiming otherwise.

This is what you're up against, and it's what the courts and lawyers will go by when this comparison comes to a head.

Personally, I can't wait for that to happen.

I'd be thrilled to learn I shouldn't wait for it. If you're directly involved with credible research to the contrary, I would love to hear more.

But the IMO performance, in this case at least, has nothing to do with intelligence. The model performs a search against its own training data and pieces together a response in line with that data, incorporating the context of the search term (i.e. the question). This is run through a series of linear regressions, and a response is produced. There is nothing really groundbreaking here, as best I can tell.


These arguments usually seem to come down to disagreements about definitions, as you suggest. You've talked about what you don't consider evidence of intelligence, but you haven't said anything about the criteria you would apply. What evidence of intelligent reasoning would change your mind?

It is unsupportable to claim that ML researchers at leading labs share your opinion. Since roughly 2022, they have understood that they are working with systems capable of reasoning: https://arxiv.org/abs/2205.11916


Based on an English dictionary definition, I would expect an intelligence to exhibit understanding, wouldn't you? I would hope people read the dictionary before they market a multibillion-dollar product set to reach the masses. It seems irresponsible not to.

The article you linked discusses reasoning. That's really cool. But consider that we can say a chess computer opponent is reasoning: it uses a preprogrammed set of instructions to look ahead some number of possible moves and choose the most reasonable one. Essentially a calculator, and it is in fact reasoning. But that doesn't have much to do with intelligence. As we read in the dictionary, intelligence implies understanding, and we certainly can't say that the Chessmaster opponent from the Super Nintendo literally understands me, right?
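
To make that concrete, that kind of opponent is basically doing a minimax search. A toy sketch, where moves(), apply() and score() are hypothetical stand-ins for a real engine's move generator, board update and position evaluation:

    # Toy minimax: look `depth` moves ahead, pick the line with the
    # best worst-case score. moves(), apply() and score() are
    # hypothetical stand-ins, not a real engine's implementation.
    def minimax(state, depth, maximizing):
        if depth == 0 or not moves(state):
            return score(state)  # static evaluation of the position
        results = (minimax(apply(state, m), depth - 1, not maximizing)
                   for m in moves(state))
        return max(results) if maximizing else min(results)

    def best_move(state, depth=4):
        # Choose the move whose subtree has the best minimax value.
        return max(moves(state),
                   key=lambda m: minimax(apply(state, m), depth - 1, False))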

More to the point, I don't see that any LLM has thus far exhibited even an inkling of understanding, nor can it. It's a linear regression calculator, much like a lot of TI-84 graphing calculators running linear-algebraic functions on a grand scale. It's impressive that basic math can achieve results across word archives that sound like a person, but it still doesn't understand what it outputs, and really, not what it inputs either, beyond graphing it algebraically.
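
In toy form, the kind of arithmetic I'm describing is just this (a made-up sketch with random weights, nothing like a real model's, but the same species of calculation):

    import numpy as np

    # Made-up vocabulary and random weights: a toy next-word scorer.
    vocab = ["the", "cat", "sat", "on", "mat"]
    E = np.random.randn(len(vocab), 8)   # one embedding vector per word
    W = np.random.randn(8, len(vocab))   # linear output layer

    def next_word(word):
        # Embed the input, apply a linear map, pick the highest score.
        logits = E[vocab.index(word)] @ W
        return vocab[int(np.argmax(logits))]

    print(next_word("cat"))  # deterministic arithmetic, no understanding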

It doesn't literally understand. So, it is not literally intelligent, and it will require some huge breakthroughs to change that. I very much doubt that such a discovery will happen in our lifetime.

It might be more likely that the marketers will succeed in revising the dictionary. We've often seen that if you use a word wrongly enough, it becomes right. But so far, at least, that hasn't happened with this word.


OK, now let's talk about what it means to "understand" something.

Let's say a kid who's not unusually gifted/talented at math somehow ends up at the International Math Olympiad. Smart-enough kid, regularly gets 4.0+ grades in normal high school classes, but today Timmy got on the wrong bus. He does have a great calculator in his backpack -- heck, we'll give him a laptop with Mathematica installed -- so he figures, why not, I'll take the test and see how it goes. Spoiler: he doesn't do so well. He has the tools, but he lacks understanding of how and when to apply them.

At the same time, the kid at the next desk also doesn't understand what's going on. She's a bright kid from a talented family -- in fact Alice's old man works for OpenAI -- but she's a bit absent-minded. Alice not only took the wrong bus this morning, but she grabbed the wrong laptop on the way out the door. She shrugs, types in the problems, and copies down what she sees on the screen. She finishes up, turns in the paper, and they give her a gold medal.

My point: any definition of "understanding" you can provide is worthless unless it can somehow account for the two kids' different experiences. One of them has a calculator that does math, the other has a calculator that understands math.

> I very much doubt that such a discovery will happen in our lifetime.

So did I, and then AlphaGo happened, and IMO a few years later. At that point I realized I wasn't very good at predicting what was and was not going to be possible, so I stopped trying.


Calculators do not understand math, while both kids understand each other and the world around them. The calculator relies on an external intelligence.

Don't stop trying. Predictive power is an indicator of how well a theory describes the universe. That's what science is.

The engineers have long predicted this stuff. LLM tech isn't really new; the size and speed of the machines are. The more you understand about a topic, the better your predictions.


> The more you understand about a topic, the better your predictions.

Indeed.


That's not true. I've paid for a one-time license for software before and received updates until the next major release.


Man I love that story.


Reading your comment made me think of Roman generals returning in triumph, with someone constantly following them saying "memento mori", reminding them they are not gods. Now, instead of inspiring humility, it would just be seen as a challenge.


An AMS is useful just so you can have 4 different filaments ready to go at any time; it doesn't need to be for multi-material models. I have an A1 with the AMS Lite and a Prusa MK3S, and manually changing materials is a chore.


Fair point. I don't print enough (never mind change materials often enough) for it to be such a bother that I'd thought of it. I expected the argument to be about keeping filament dry, to which I'd have said a drybox and/or dehumidifier is better and (could be) cheaper.


Would you really? When it's the only thing you've ever known, you'd probably just accept it as normal.


... which raises the question of who would really arrive at the destination. Our own civilization starts to rebel against things heralded by the previous generation, because the current generation doesn't remember the problems that were solved. In two generations, the humans that remain might not leave the ship at all, despite having a whole planet (or several) to inhabit.


Their kids will leave the spaceship, though. And some daredevils of their generation.


I doubt that number would be sufficient. Such a ship would have to be a very stable society, so getting enough people to face the harshness of an unsettled planet is a very tall ask.

I believe historically it was either for profit, of which there is unlikely to be much in the medium term, or because the new place was expected to be better, mostly due to resource constraints. But a generation ship should be close to optimal, and short of magic-level tech there is not much to do on an empty planet.


I think you'd have to manufacture a culture, with rituals and habits designed to keep people focused so that the meaning of their lives was tied to the end-goal. It would make a good story :)


Maybe developers are using it in a less visible way? In the past 6 months I've used AI for a lot of different things. Some highlights:

- Built a Windows desktop app that scans local folders for videos and automatically transcribes the audio, summarises the content into a structured JSON format based on screenshots and subtitles, and automatically categorises each video. I used it on my PC to scan a couple of TB of videos. It has a relatively nice interface for browsing and searching videos and stores everything locally in SQLite. Did this in C# & Avalonia, which I've never used before. AI wrote about 75% of the code (about 28k LOC now).

- Built a custom throw-away migration tool to export a customer's data from one CRM and import it into another. Windows app with a basic interface.

- Developed an AI process for updating a webform system that uses XML to update the form structure. This one felt like magic, and I initially didn't think it would work, but it only took a minute to try. Some background: years ago I built a custom webform/checklist app for a customer. They update the forms very rarely, so we never built an interface for making updates, but we did write 2 stored procs to update forms: one outputs the current form as XML, and another takes the same XML and runs updates across multiple tables to create a new version of the form. For changes, the customer sends me a spreadsheet with all the current form questions in one column and their changes in another. It's normally just wording changes, so I go through and manually update the XML and import it, but this time they had a lot of changes: removing questions, adding new ones, combining others. They had a column with the label changes and another with a description of what they wanted (e.g. "New question", "Update label", "Combine this with q1, q2 and q3", "Remove this question"). The form has about 100 questions, and the XML file is about 2500 lines long and defines each form field, section layout, conditional logic, grid display, task creation based on incorrect answers, etc., so it's time-consuming to make a lot of little changes like this. With no expectation of it working, I took a screenshot of the spreadsheet, grabbed the exported XML file, and prompted the LLM to modify the XML based on the instructions in the spreadsheet and some basic guidelines (a rough sketch of the shape of that call is below, after this list). It did it close to perfect, even fixing the spelling mistakes the customer had missed while writing their new questions.

- Along with using it on a daily basis across multiple projects.
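
For the curious, the XML trick above, scripted instead of pasted into a chat window, would look roughly like this. This is only a sketch assuming the OpenAI Python SDK; the model name, file names and prompt wording are placeholders, not what I actually used:

    import base64
    from openai import OpenAI

    client = OpenAI()

    # Placeholders: the real inputs were the customer's spreadsheet
    # screenshot and the XML from the form-export stored proc.
    screenshot_b64 = base64.b64encode(open("changes.png", "rb").read()).decode()
    form_xml = open("form_export.xml").read()

    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text":
                    "Apply the changes described in the attached spreadsheet "
                    "screenshot to this form XML. Keep IDs, conditional logic "
                    "and section structure intact unless an instruction says "
                    "otherwise. Return only the full updated XML.\n\n" + form_xml},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"}},
            ],
        }],
    )
    print(resp.choices[0].message.content)  # review, then import via the stored proc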

I've seen the stat that says developers "...thought AI was making them 20% faster, but it was actually making them 19% slower". Maybe I'm hoodwinking myself somehow, but it's been transformative for me in multiple ways.


What did you use for transcription? Local whisper via ffmpeg?


Yeah, the app lets you configure which Whisper model to use and then downloads it on first load. Whisper blows me away too. I've only got a 2080 and use the medium model, and it's surprisingly good and relatively fast.
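
The transcription side is only a few lines if you prototype it in Python with the openai-whisper package (the app itself is C#, so this is just the shape of the idea, not the actual code):

    import whisper

    # load_model() downloads the weights on first use, same as the app does.
    model = whisper.load_model("medium")

    # Whisper shells out to ffmpeg, so it can read video files directly.
    result = model.transcribe("some_video.mp4")

    print(result["text"])           # full transcript
    for seg in result["segments"]:  # timestamped chunks, handy for subtitles
        print(f"{seg['start']:.1f}-{seg['end']:.1f}: {seg['text']}")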


Unless that's where they want to put their base of operations.


When my kids were younger I tried to replace my swearing by saying "sugarplum fairies". It was fairly successful in becoming a natural replacement. However, the other day I kicked my toe really badly and instinctively yelled "sugarplum FUCKING fairies", and my kids (now early teens) found it extremely funny.


This is great. Can see you put a lot of work into it. I like it.

