Hacker News | new | past | comments | ask | show | jobs | submit | amrrs's comments | login

Have you tried the new GLM 4.7?


I've been using GLM 4.7 alongside Opus 4.5 and I can't believe how bad it is. Seriously.

I spent 20 minutes yesterday trying to get GLM 4.7 to understand that a simple modal on a web page (vanilla JS and HTML!) wasn't displaying when a certain button was clicked. I hooked it up to Chrome MCP in Open Code as well.

It constantly told me that it fixed the problem. In frustration, I opened Claude Code and just typed "Why won't the button with ID 'edit' work???!"

It fixed the problem in one shot. This isn't even a hard problem (and I could have just fixed it myself but I guess sunk cost fallacy).


I've used a bunch of the SOTA models (via my work's Windsurf subscription) for HTML/CSS/JS stuff over the past few months. Mind you, I am not a web developer, these are just internal and personal projects.

My experience is that all of the models seem to do a decent job of writing a whole application from scratch, up to a certain point of complexity. But as soon as you ask them for non-trivial modifications and bugfixes, they _usually_ go deep down rationalized rabbit holes to nowhere.

I burned through a lot of credits to try them all and Gemini tended to work the best for the things I was doing. But as always, YMMV.


Exactly the same feedback


Amazingly, just yesterday, I had Opus 4.5 crap itself extensively on a fairly simple problem -- it was trying to override a column with an aggregation function while also using it in a group-by without referring to the original column by its fully qualified name prefixed with the table -- and in typical Claude fashion it assembled an entire abstraction layer to try and hide the problem under, before finally giving up, deleting the column, and smugly informing me I didn't need it anyway.
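A minimal sketch of the kind of collision described, using Python's stdlib sqlite3 (table, columns, and data are invented; the commenter's actual engine and schema aren't given, and how a bare name in GROUP BY resolves when an alias shadows a column varies by engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("east", 20), ("west", 5)])

# The SELECT list reuses the column name "amount" as an alias for an aggregate.
# The bare "amount" in GROUP BY may then resolve to the aggregate alias, which
# is illegal in a GROUP BY, rather than to the original column.
try:
    rows = conn.execute(
        "SELECT region, SUM(amount) AS amount FROM sales "
        "GROUP BY region, amount"
    ).fetchall()
    print("bare name resolved to the column:", rows)
except sqlite3.OperationalError as exc:
    print("bare name resolved to the aggregate alias:", exc)

# The fix from the comment: qualify the column with its table name, leaving
# no ambiguity to resolve.
rows = conn.execute(
    "SELECT region, SUM(amount) AS amount FROM sales "
    "GROUP BY region, sales.amount"
).fetchall()
print(rows)
```

With the qualified name, each (region, amount) pair forms its own group, so the query runs regardless of how the engine prioritizes aliases.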

That evening, for kicks, I brought the problem to GLM 4.7 Flash (Flash!) and it one-shot the right solution.

It's not apples to apples, because when it comes down to it LLMs are statistical token extruders, and it's a lot easier to extrude the likely tokens from an isolated query than from a whole workspace that's already been messed up somewhat by said LLM. That, and data is not the plural of anecdote. But still, I'm easily amused, and this amused me. (I haven't otherwise pushed GLM 4.7 much and I don't have a strong opinion about it.)

But seriously, given the pattern Claude seems to exhibit over and over of knitting ever larger carpets to sweep errors under instead of identifying and addressing root causes, I'm curious what the codebases of people who use it a lot look like.


> I can't believe how bad it is

This has been my consistent experience with every model prior to Opus 4.5, and every single open model I've given a go.

Hopefully we will get there in another 6 months when Opus is distilled into new open models, but I've always been shocked at some of the claims around open models, when I've been entirely unable to replicate them.

Hell, even Opus 4.5 shits the bed with semi-regularity on anything that's not completely greenfield for my usage, once I'm giving it tasks beyond some unseen complexity boundary.


Yes I did; it's not on par with Opus 4.5.

I use Opus 4.5 for planning; when I reach my usage limits, I fall back to GLM 4.7 just for implementing the plan. It still struggles, even though I configure GLM 4.7 as both the smaller and the heavier model in Claude Code.


Thanks for sharing your repo, looks super cool. I'm planning to try it out. Is it based on MLX or just HF Transformers?


Thank you, just transformers.


Unethical conduct is negotiating with Zuck? :D Jokes aside, it must be something serious enough for them to part ways with the co-founder.



On fal, it often takes less than a second.

https://fal.ai/models/fal-ai/z-image/turbo/api

Couple that with a LoRA, and you can generate completely personalized images in about 3 seconds.

The speed alone is a big factor, but if you put the model side by side with Seedream, Nano Banana, and other models, it's definitely in the top 5. That's a killer combo imho.


I don't know anything about paying for these services, and as a beginner, I worry about running up a huge bill. Do they let you set a limit on how much you pay? I see their pricing examples, but I've never tried one of these.

https://fal.ai/pricing


It works with prepaid credits, so there should be no risk. Minimum credit amount is $10, though.


This. You can also run most (if not all) of the models that fal.ai hosts directly from the playground tab, including Z-Image Turbo.

https://fal.ai/models/fal-ai/z-image/turbo


For images I like https://runware.ai/: super cheap and super fast. They also support LoRAs, and you can upload your own models.

And you work with credits.


Why the downvote? Are they a scam?


Honestly speaking, Netflix has a good catalog, much more comparable to Hollywood's. Take the latest Frankenstein, for example.

Don't look only at the series. They also have repurposed recipes. But they acquire good titles and also produce some good ones.


I have 459 titles on my IMDB watchlist and a tiny percentage of it is available on Netflix (if at all), but this is anecdotal and might have something to do with where I live.


Netflix outside of the US is a very different experience.

In the US, it's mostly their own productions and older content they explicitly acquired, but elsewhere, especially in markets that don't have a local HBO or Disney streaming service, they have incredible backlogs.

I remember finding basically everything I could wish for on there when traveling in SE Asia almost a decade ago, compared to a still decent offering in Western Europe, and mostly cobwebs in the US.


459!? It must take a while to check your list…


After checking 20 titles and getting no results, you notice the pattern.


If anything, Claude Code's success disproved this.


It's actually an interesting example, because unlike Warp that tries to be a CLI with AI, Claude defaults to the AI (unless you prefix with an exclamation mark). Maybe it says more about me, but I now find myself asking Claude to write for me even relatively short sed/awk invocations that would have been faster to type by hand. The uncharitable interpretation is that I'm lazy, but the charitable one I tell myself is that I don't want to context-switch and prefer to keep my working memory at the higher level problem.

In any case, Claude Code is not really a CLI, but rather a conversational interface.


Claude Code is a TUI (with "text"), not a CLI (with "command line"). The very point of CC is that you can replace a command line with human-readable text.


Let's not be overly reductive, Claude Code is a TUI with a CLI for all input including slash commands.


You may think that's pedantic but it really isn't. Half-decent TUIs are much closer to GUIs than they are to CLIs because they're interactive and don't suffer from discoverability issues like most CLIs do. The only similarity they have with CLIs is that they both run in a terminal emulator.

"htop" is a TUI, "ps" is a CLI. They can both accomplish most of the same things but the user experience is completely different. With htop you're clicking on columns to sort the live-updating process list, while with "ps" you're reading the manual pages to find the right flags to sort the columns, wrapping it in a "watch" command to get it to update periodically, and piping into "head" to get the top N results (or looking for a ps flag to do the same).
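The "ps" route described above might look like this (a sketch assuming GNU procps ps on Linux; the sort flags differ on BSD/macOS):

```shell
# Top 10 processes by CPU: header line + 10 rows.
ps aux --sort=-%cpu | head -n 11

# Wrap it in watch to re-run every 2 seconds, approximating htop's live view:
# watch -n 2 "ps aux --sort=-%cpu | head -n 11"
```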


Claude Code is a Terminal User Interface, not a Command Line Interface.


Well, it is if you just run

claude -p "Question goes here"

As that will print the answer only and exit.


But that's not how it's typically used; it's predominantly used in TUI mode, so the popularity of CC doesn't tell us anything about the popularity of the CLI.



I think it's more like Apple - use ChatGPT as the iPhone and build an ecosystem around it


It's heartbreaking and sad. His video a couple of days back was literally titled "You thought I was gone! Speedrun returns".

The comments on that video were so kind and heartwarming, with people wishing him well.

While we don't know the exact cause, we can all agree that he was subjected to extreme bullying and no one stood up for him - most importantly, not FIDE!


Kramnik.


From what I understand, Kramnik pointed out Danya's behaviour was erratic and suspected alcohol or drug use (everyone else broached it much more sensitively, saying drugs were a ridiculous notion, and giving him space/privacy, while perhaps suspecting possible mental health issues, a mental breakdown, or maybe narcolepsy).

Kramnik may have been forthright and lacking tact, but it was clear from Danya's behaviour that he sadly had an underlying psychological condition that could happen to any of us.


Kramnik repeatedly accused Danya of cheating, which prompted a lot of ongoing abuse from his acolytes. Danya spoke publicly several times about the stress this caused him.

Of course, as ever, Kramnik had nothing to back up his claims, a fact which in no way prevented him from repeating them ad nauseam.

That's not being "forthright" or "lacking tact". That's being an abusive asshole.


I hadn't followed closely enough and wasn't aware - thanks for pointing it out.

Kramnik accuses Danya of cheating (several pieces of circumstantial 'evidence' in the thread): https://x.com/VBkramnik/status/1911179469773033512

A tweet by IM Kostya Kavutskiy of ChessDojo from November 2024 (paraphrased):

> Our intention was to express concern for what's going on. Many of us supported Kramnik's fight against online cheating for some time.

> But then he started naming players left & right, some of whom were likely 100% innocent, who got their name dragged through the mud regardless. Kramnik also started naming actual children at one point, without any real evidence from what I could tell

> And it's not "just asking questions", it's casting aspersions. And the words of any world champion obviously carry a lot of responsibility.

https://x.com/hellokostya/status/1852388806143390131


This whole cheating kerfuffle would never have happened if people had switched to games, or variations of games, that computers haven't beaten.


This has been discussed to death, but to reiterate here: people have been much too polite about Kramnik's nonsense. Danya and Hikaru are (were) probably the two people in the world whose bullet play is least suspicious. Cheating isn't very powerful in short time controls, and they have streamed thousands of games playing them.

Kramnik's bullshit never made any damn sense at all.


No doubt they're two of the least suspicious, particularly Danya, but you're mistaken in thinking that cheating isn't very powerful in short time controls.


Are you running the bot with the free tier api?


I'm using Anthropic's pay-as-you-go API, since it was easier to set up on the server than CC's CLI/web login method. Running the bot costs me ~$1.80 per month.

The bot is based on Mario Zechner's excellent work[1] - so all credit goes to him!

[1] https://mariozechner.at/posts/2025-08-03-cchistory/


I think amrrs is referring to the x.com API.


Oh, I'm sorry. Yes, I'm using x's free tier.

