I read through this to see if my AI cynicism needed any adjustment, and basically it replaced a couple of basic greps and maaaaybe 10 minutes of futzing around with markdown. There's a lot of faffing about with JSON, but it ultimately doesn't matter to the end result.
It also fucked up several times and it's entirely possible it missed things.
For this specific thing, it doesn't really matter if it screwed up, since the worst that would happen is an incomplete blog post reporting on drama.
But I can't imagine why you would use this for anything you need to put your name behind.
It looks impressive, sure, but the important kernel here is the grepping and there it's doing some really basic tinkertoy stuff.
I'm willing to be challenged on this, so by all means do, but this seems both worse and slower as an investigation tool.
The hardest problem in computer science in 2025 is showing an AI cynic an example of LLM usage that they find impressive.
How about this one? I had Claude Code, running from my phone, build a dependency-free JavaScript interpreter in Python, using MicroQuickJS as initial inspiration but later diverging from it on the road to passing its test suite: https://static.simonwillison.net/static/2025/claude-code-mic...
Here's the latest version of that project, which I released as an alpha because I haven't yet built anything real on top of it: https://github.com/simonw/micro-javascript
Again, I built this on my phone, while engaging with all sorts of other pleasant holiday activities.
> For this specific thing, it doesn't really matter if it screwed up
These are exactly the use cases where LLMs are a great choice: the stakes are low, and getting a hit is a win. For instance, if you're brainstorming, it doesn't matter that 99 suggestions are bad if 1 is great.
> the grepping and there it's doing some really basic tinkertoy stuff
The boon is you can offload this task and go do something else. You can start the investigation from your phone while you're out on a walk, and have the results ready when you get home.
I am far from an AI booster, but there is a segment of tasks that fit the above criteria (and some others) for which it can be very useful.
Maybe the grep commands etc look simple/basic when laid bare, but there's likely to be some flailing and thinking time behind each command when doing it manually.