As someone who gets useful Clojure out of Claude quite consistently, I’m not sure that volume is the only reason for output quality.


I personally enjoy the “You’re absolutely right!” exclamation. It signals alignment with my feedback in a consistent manner.


You’re overlooking the fact that it still says that when you are, in reality, absolutely wrong.


That’s not the purpose of it, as I understand it; it’s a token phrase generated to cajole it down a particular path.[1] An alignment mechanism.

The complement appears to be “actually, that’s not right,” a correction mechanism.

1: https://news.ycombinator.com/item?id=45137802


It gets annoying because A) it so quickly dismisses its own logic and conclusion from less than two minutes ago (extreme confidence with minimal conviction), and B) it fucks up the second time too (sometimes in the same way!) about 33% of the time.


Gemini 2.5 Pro seems to have a tic where, after an initial failed task, it starts asserting escalating levels of confidence with each subsequent attempt. It's as if it's ever conscious of its failure lingering in its context and feels the need to overcompensate, reassuring both the user and itself that it's not going to immediately faceplant again.


ChatGPT does the same thing, to the point that after several rounds of pointing out errors or hallucinations it will say things like “Ok, you’re right. No more foolish mistakes. This is it, for all the marbles. Here is an assured, triple-checked, 100% error-free, working script, with no chance of failure.”

Which fails in pretty much the exact same way it did before.

Once ChatGPT hits that supremely confident “Ok nothing was working because I was being an idiot but now I’m not” type of dialogue, I know it’s time to just start a new chat. There’s no pulling it out of “spinning the tires while gaslighting” mode.

I’ve even had it go as far as outputting a zip file with an empty .txt that supposedly contained the solution to a certain problem it was having issues with.


I’ve had the opposite experience with GPT-5, where it’s so utterly convinced that its own (incorrect) solution is the way to go that it turns me down and preemptively launches tools to implement what it has in mind.

I get that there are tradeoffs, but erring on the side of the human being correct is probably going to be a safer bet for another generation or two.


Hmmh. I believe your explanation, but I don't think that's the full story. It's also a sycophancy mechanism to maximize engagement from real users and to reward-hack AI labelers.


That doesn’t seem plausible to me. Not that LLMs can’t be sycophantic, but I don’t think this phrase in particular is part of it.

It’s a canned phrase in a place where an LLM could be much more creative, to much greater effect.


I think there’s something to it.

Part of me thinks that when they run their “which of these responses do you prefer” A/B tests on users, many on HN would try to judge the level of technical detail, complexity, and usefulness, but I’m inclined to believe the midwit population at large would choose the option where the magic AI supercomputer reaffirms and praises the wisdom of whatever they say, no matter how stupid or wrong it is.


I don't disagree exactly, it's just that it smells weird.

LLMs are incredibly good at social engineering when we let them, whereas I could write the code to emit "you're right" or "that's not quite right" without involving any statistical prediction.

I.e., as a method of persuasion, canned responses are incredibly inefficient (as evidenced by the annoyance with them), whereas we know that the LLM is capable of being far more insidious and subtle in its praise of you. For example, it could be instructed to launch weak counterarguments, "spot" the weaknesses, and then conclude that your position is the correct one.

But let's say that there's a monitoring mechanism that concludes that adjustments are needed. In order to "force" the LLM to drop the previous context, it "seeds" the response with "You're right" or "That's not quite right", as if it were the LLM's own conclusion. Then, when the LLM starts predicting what comes next, it must conclude things that follow from "you're right" or "that's not quite right".
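
To make the "seeding" idea concrete, here's a rough sketch using the assistant-prefill behavior of the Anthropic Messages API (purely illustrative; I'm not claiming this is what Claude Code does internally, and the model id and prompt are just examples):

    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

    // Ending the conversation with a partial assistant message forces the
    // model to continue from that phrase, so whatever it predicts next has
    // to follow from "You're absolutely right." rather than from whatever
    // position it held earlier in the context.
    const response = await client.messages.create({
      model: "claude-3-5-sonnet-latest", // substitute whichever model id you use
      max_tokens: 512,
      messages: [
        { role: "user", content: "The bug is in the cache layer, not the parser." },
        { role: "assistant", content: "You're absolutely right." }, // the "seed"
      ],
    });

    const block = response.content[0];
    if (block.type === "text") {
      console.log("You're absolutely right." + block.text);
    }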

So while they are very inefficient as persuasion and communication, they might be very efficient at breaking with the otherwise overwhelming context that would interfere with the change you're trying to effect.

That's the reason why I like the canned phrases. It's not that I particularly enjoy the communication in itself; it's that they are clear enough signals of what's going on. They give a tiny level of observability into the black box, in the form of indicating a path change.


But there’s also the negative psychological impact on the user of having the model so strongly agree with them all the time. I cannot be the only one who half expects humans to say this to me all the time now?


And that it often spits out the exact same wrong answer in response.


Next week is next month.


I swear I forgot :sob:

I AM LAUGHING SO HARD RIGHT NOWWWWW

LMAOOOO

I wish to upvote this twice lol


If you don’t mind a tangent, this is close to my way of working out thoughts as well, including with code.

When I write code, it’s as much a cognitive tool as it is a tool to make things happen in the system. It develops thoughts as much as it develops system behavior.

Involving AI changes this quite a bit, but I feel like I’m making my way to a balance where it supports rather than replaces (or worse: disrupts) my cognitive processes.


Not at all, I'm all about tangents. The blog post itself is a tangent.

Programming is writing for me. So, yes I am the same... I need to type (or sometimes write it longhand), to make progress.

I gave LLMs a fair shake, but generative mode usage overwhelms my nervous system. I could use 'em, maybe, for pattern-recognition. But using an expressive language (Clojure) means I can eyeball my source code and/or grep through it to maintain a good enough view of my system. This also applies to most third-party code I use from the Clojure ecosystem. Libraries tend to be small (a few thousand lines of code), and I can skim-read through them quick enough.

I know there is a black art to it that one is supposed to learn, in order to get useful results, but so far, the incentive isn't strong enough for me.

So, hand typing / writing it is... might as well feel satisfied using my nice keyboard and little notebook, on my way to obsolescence. No?


Sorry, but what does one have to do with the other?

Are you one of those people who think you can’t criticize a movie unless you’re a director yourself?


> Sorry, but what does one have to do with the other?

Let me spell it out for you:

Daring Fireball has a really ugly website. On that really ugly website, they are criticizing Apple's ugly icons. This is ironic.

> Are you one of those people who think you can’t criticize a movie unless you’re a director yourself?

No.


Alright, then my analogy holds.

He can’t criticize Apple unless he first makes a really pretty website.

You can’t criticize the movie unless you first direct a better one.


The brain literally cannot process any piece of information without being changed by the act of processing it. Neuronal pathways are constantly being reinforced or weakened.

Even remembering alters the memory being recalled, entirely unlike how computers work.


I've always found it interesting that if I take a wrong turn finding my way through the city and I'm not deliberate about remembering that it was, in fact, a mistake, I am more prone to taking the same wrong turn again the next time.


> once I take a wrong turn finding my way through the city... I am more prone to taking the same wrong turn again

You may want to stay home then to avoid getting lost.


For humans, remembering strengthens that memory, even if it is dead wrong.


> By being particularly bad at anything outside of the most popular languages and frameworks, LLMs force you to pick a very mainstream stack if you want to be efficient.

Do they? I’ve found Clojure-MCP[1] to be very useful. OTOH, I’m not attempting to replace myself, only augment myself.

1: https://github.com/bhauman/clojure-mcp


Thanks for the link! I used to use Clojure a lot professionally, but now just for fun projects, and to occasionally update my old Clojure book. I had bookmarked Clojure-MCP a while ago but never got back to it; now I will give it a try.

I like your phrasing of “OTOH, I’m not attempting to replace myself, only augment myself.” because that is my personal philosophy also.


No, you've got it wrong. If US models are suddenly required to praise their monarch, or hide his past affiliations or whatever, that warrants them more critique.

Chinese models don't become exempt "because the US is also bad"; they both rightfully become targets of criticism.

Other than that, testing the boundaries of the model with a well-established case is very common when evaluating models, and not only with regards to censorship.


Just because it’s open source doesn’t mean it’s well tested, or well pen tested, or whatever the applicable security aspect is.

It could also mean that attacks against it are high value (because of high distribution).

Point is, license isn’t a great security parameter in and of itself IMO.


Inline on the component itself. Then reach for the button component when you need to make a button.


So your <Button> component still has 60 tailwind classes on it?

I think that might work in React, but might have a payload impact on server-rendered React.

Another interesting point for using something like this (specifically, using shorter semantic class names instead of multiple tailwind classes) is: Phoenix LiveView

LiveView streams DOM changes over the websocket, so afaik can't really meaningfully be compressed to eliminate bytes. By using `btn` instead of 30 tailwind classes, your payloads will be smaller over the wire.
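
To put rough numbers on that (plain TypeScript rather than LiveView, and the class strings are just made-up examples):

    // Uncompressed diffs pushed over a websocket pay for every byte of class names.
    const utilityClasses =
      "inline-flex items-center rounded-md bg-blue-600 px-4 py-2 text-sm " +
      "font-medium text-white shadow-sm hover:bg-blue-500";

    const withUtilities = `<button class="${utilityClasses}">Save</button>`;
    const withSemantic = `<button class="btn btn-primary">Save</button>`;

    const bytes = (s: string) => new TextEncoder().encode(s).length;
    console.log(bytes(withUtilities), "bytes vs", bytes(withSemantic), "bytes");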

A bit niche, but something to think about.

The fact that your `<Button>` React component renders 60 tailwind classes might not seem bad (because gzip or otherwise might actually make it ~negligible if you have a ton of buttons on the page and you're server rendering with compression enabled), but in LiveView's case, I don't think there's really any other option (not enough of a text corpus to compress?).

Not sure if this was a factor in Phoenix's recent default inclusion of DaisyUI or not.

Even in Phoenix, I'm still using a `<.button>` Phoenix component, but that uses a smaller semantic class name most of the time.


> So your <Button> component still has 60 tailwind classes on it?

Yes, and this is better I think, because you still have only one button. So you just reuse the component.

But if you make a .button class, now people are going to be tempted to use that to style their own buttons. And now, you have a dozen buttons and your app breaks in tiny little ways and your codebase is a hot mess.


> But if you make a .button class, now people are going to be tempted to use that to style their own buttons.

How is having consistent design across the entire application a bad thing?


There has never been an application I've worked on where anyone was happy with the base design of the button. There is always some variant, some edge case, some requirement that doesn't fit. This is the whole reason BEM briefly became a thing, and it was tedious. Tailwind solves all that: keep a base set of styles, pass in any additional styles based on the specific requirements and where the component is used, use tailwind-merge inside the component, and now I never have to care ever again.
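
Roughly what that pattern looks like (a sketch assuming React and the tailwind-merge package; the component name and class list are just examples):

    import { twMerge } from "tailwind-merge";
    import type { ButtonHTMLAttributes } from "react";

    type ButtonProps = ButtonHTMLAttributes<HTMLButtonElement>;

    // One base set of styles lives here; call sites pass extra classes for
    // their edge cases and twMerge resolves conflicts (last class wins).
    export function Button({ className, ...props }: ButtonProps) {
      return (
        <button
          className={twMerge(
            "rounded-md bg-blue-600 px-4 py-2 text-sm font-medium text-white hover:bg-blue-500",
            className
          )}
          {...props}
        />
      );
    }

    // A one-off requirement stays at the call site:
    // <Button className="bg-red-600 hover:bg-red-500">Delete</Button>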


I agree, same boat. You always end up with dozens of button styles and most devs treat CSS as read-only.

Changing the .button class is risky; you'll break the entire application. So you just create a new class. Um, oops.

