This video [0] is relevant, though it actually supports your point - it shows Claude Code struggling with non-trivial tasks and needing significant hand-holding.
I suspect videos meeting your criteria are rare because most AI coding demos either cherry-pick simple problems or skip the messy reality of maintaining real codebases.
Great video! It shows a few things: how good it is even with such a niche language, but it also exposes some direct flaws.
First off, Rust represents quite a small part of the training data (last I checked it was under 1% of the code in most public datasets), so it's got waaay less training than other languages like TS or Java. You added 2 solid features, backed with tests and documentation and nice commit messages. 80% of devs would not deliver this in 2.5 hours.
Second, there was a lot of time/token waste messing around with git and git messages. A few tips I noticed that could help you in the workflow:
#1: Add a subagent for git that knows your style, so you don't poison direct claude context and spend less tokens/time fighting it.
#2: Claude has hooks; if your favorite language has a formatter like rustfmt, just use hooks to run rustfmt and similar tools.
#3: Limit what they test, as most LLM models tend to write overeager tests, including testing if "the field you set as null is null", wasting tokens.
#5: Saying "max 50 characters title" doesn't really mean anything to the LLM. They have no inherent ability to count, so you are relying on probability, which is quite low since your context is quite filled at this point. If they want to count the line length, they also have to use external tools. This is an inherent LLM design issue and discussing it with an LLM doesn't get you anywhere really.
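To make the counting point concrete, here is a rough sketch of the kind of external check you'd lean on instead of trusting the model to count. The function name `check_commit_title` and the 50-character default are just illustrative assumptions, not something Claude Code ships with.

```python
def check_commit_title(message: str, max_len: int = 50) -> tuple[bool, int]:
    """Return (ok, length) for the first line of a commit message.

    Hypothetical helper: max_len defaults to the 50-character limit
    discussed above, but it is only an illustrative value.
    """
    lines = message.splitlines()
    title = lines[0] if lines else ""  # an empty message has an empty title
    length = len(title)
    return length <= max_len, length


if __name__ == "__main__":
    ok, n = check_commit_title(
        "Add retry logic to the upload client\n\nLonger body text here."
    )
    print(f"title length {n}, ok={ok}")
```

Something like this could run from a git `commit-msg` hook, so the limit is enforced deterministically and the model only has to react to a pass/fail result.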
> #3: Limit what they test, as most LLM models tend to write overeager tests, including testing if "the field you set as null is null", wasting tokens.
Heh, I write this for some production code too (python). I guess because python is not typed, I'm testing if my pydantic implementation works.
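For anyone unsure what "overeager" means here, a small sketch of the difference, using a plain dataclass as a stand-in (the `User` model and both test names are made up for illustration). With pydantic the first kind of test may still earn its keep if it is really checking validation or coercion, which is the point above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """Hypothetical model, used only to illustrate the two kinds of test."""
    name: str
    nickname: Optional[str] = None

    def display_name(self) -> str:
        # Fall back to the real name when no nickname is set.
        return self.nickname or self.name


def test_nickname_is_none_when_set_to_none() -> None:
    # Low value: only re-checks that assignment works.
    user = User(name="Ada", nickname=None)
    assert user.nickname is None


def test_display_name_falls_back_to_name() -> None:
    # Higher value: exercises behaviour that could actually break.
    user = User(name="Ada", nickname=None)
    assert user.display_name() == "Ada"
```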
Oh you're about to unlock a whole new level of token burning.
There is an /agents command that lets you define agents for specific tasks or areas. Each of them has their own context and their own rules.
Then claude can delegate the work to them when appropriate, or you can tell it directly to use the subagent, i.e. a subagent for your frontend, backend, specific microservice, database, etc etc.
Quite depends on your workflow which ones you create/need, but they are a really nice quality of life change.
Agents are basically separate "threads" with their own context window.
So the main claude can tell the test-runner agent "Run tests using `task test` and return the results"
Then the test-runner agent runs the tests, "wastes" its context by reading 500 lines of test results, sees that it's ok. Returns "tests ok" to the main context.
This way the main context is spared from the useless chatter and can go on for longer.
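A toy model of that idea, in case it helps. This is not how Claude Code is implemented; `run_test_subagent` and the list-based "contexts" are assumptions made up purely to show where the 500 lines end up.

```python
def run_test_subagent(test_output: list[str]) -> str:
    """Hypothetical subagent: reads everything, returns only a short summary."""
    # The subagent's own context absorbs the noisy output.
    sub_context = list(test_output)
    failures = [line for line in sub_context if "FAILED" in line]
    if failures:
        return f"tests failed: {len(failures)} failure(s), first: {failures[0]}"
    return "tests ok"


# The main context only ever sees the one-line summary.
main_context: list[str] = ["user: run the tests"]
noisy_output = [f"test_case_{i} ... ok" for i in range(500)]
main_context.append(run_test_subagent(noisy_output))
```

After this runs, `main_context` holds two entries rather than 501, which is the whole saving.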
You ask claude to use an agent, and it’ll spawn a sub agent that takes a bunch of actions in a new context, then lets the original agent only know a summary of the results.
> I suspect videos meeting your criteria are rare because most AI coding demos either cherry-pick simple problems or skip the messy reality of maintaining real codebases.
Or we’re just having too much fun making stuff to make videos to convince people that are never going to be convinced.
I took a quick informal poll of my coworkers and the majority of us have found workflows where CC is producing 70-99% of the code on average in PRs. We're getting more done faster. Most of these people tend to be anywhere from 5-12 yrs professional experience. There are some concerns that maybe more bugs are slipping through (but also there's more code being produced).
We agree most problems stem from:
1. Getting lazy and auto-accepting edits. Always review changes and make sure you understand everything.
2. Clearly written specification documents before starting complex work items
3. Breaking down tasks into manageable chunks of scope
4. Clean digestible code architecture. If it's hard for a human to understand (e.g: poor separation of concerns) it will be hard for the LLM too.
But yeah I would never waste my time making that video. Having too much fun turning ideas into products to care about proving a point.
> Having too much fun turning ideas into products to care about proving a point.
This is a strange response to me. Perhaps you and others aren’t aware that there’s a subculture of folks who livestream coding in general? Nothing to do with proving a point.
My interest in finding such examples is exactly due to the posting of comments like yours - strong claims of AI success - that don’t reflect my experience. I want to see videos that show what I’m doing wrong, and why that gives very different results.
I don’t have an agenda or point to prove, I just want to understand. That is the hacker way!
2, 3, 4 are all what human coders need to be efficient too :)
I'm kinda hoping that this LLM craze will force people to be better at it. Having documentation up to date and easily accessible is good for everyone.
Like we're (over here) better at marking lines in the road, because the EU mandated lane keeping assist needs the road markings to be there or it won't work.
M.Sc. Computer Science graduate with 5+ years student experience in cyber security research, Machine Learning and backend development. Strong background in malware analysis, reverse engineering, and distributed systems. Developed production microservices handling e-commerce operations. Multiple research projects at Fraunhofer FKIE focusing on dynamic analysis, unpacking, and emulation introspection (all with top grades). Maintain extensive homelab infrastructure using Proxmox and NixOS for personal projects and continuous learning. Looking primarily for roles in cyber security, low-level development, or backend engineering. Available from May 2025 after a sabbatical.
I'm a native Chinese speaker. I knew Detexify and used it a lot. This is the first time I've seen this software. I tried it with a handwritten \succeq, and it kept failing to give the answer until its anti-crawler mechanism was triggered. You can argue it is software with a different purpose (e.g. to convert a piece of content rather than a single symbol), but to me it is not "even better" than Detexify.
I am using NextDNS [0], which also integrates well within Tailscale across all my devices.
Or are you looking for a solution that works offline within OpenWRT, without relying on third parties?
It appears that there are AdBlock packages available for OpenWRT[1].
Yes, exactly this. The original `exa`'s description is
> exa is a modern replacement for `ls`
and it seems `eza` very recently changed the README to match that, given the confusion.
At the time, emphasizing it was actively maintained (in comparison to `exa`) made sense, but by now, `eza` has about 5x more daily downloads than `exa`:
Right; since the sentence mentions ls, of course, it must be referring to something other than ls.
Like when your wife finds a sexier, more romantic replacement for you, of course she's not comparing anyone to you. (Nobody is sexier or more romantic than you.) She means sexier and more romantic replacement compared to the previous lover she's just broken up with.
That could explain it if prices were high in general (i.e. also for industry).
https://de.statista.com/statistik/daten/studie/152973/umfrag... shows that 2014-2021 over six cents per kWh were going to the renewable energy subsidy scheme, and this is directly attributable, not some kind of hand-waving. I believe this was more than the wholesale cost of the electricity (without grid fees and taxes) in many of these years, and I think you also have to add another 19% of VAT on top of that.
In 2022 the subsidy started to be paid from taxes instead, and it was lower because market-based electricity prices went up (reducing the amount of subsidy necessary to reach the guaranteed prices).
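A back-of-the-envelope version of the VAT point, as a sketch only: 6.0 ct/kWh is an illustrative lower bound for "over six cents", not a figure for any specific year, and `surcharge_with_vat` is a made-up helper.

```python
VAT_RATE = 0.19  # the 19% VAT mentioned above


def surcharge_with_vat(surcharge_ct_per_kwh: float, vat_rate: float = VAT_RATE) -> float:
    """Gross cost of the surcharge to a household, in ct/kWh, rounded to 2 decimals."""
    return round(surcharge_ct_per_kwh * (1 + vat_rate), 2)


# Illustrative input only: the lower bound of "over six cents".
gross = surcharge_with_vat(6.0)
print(f"{gross} ct/kWh including VAT")
```

So at six cents net, a household would pay a bit over seven cents per kWh for the surcharge alone once VAT is included.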
That comparison falls short, as France heavily subsidizes nuclear energy as well, while at the same time having to demolish and rebuild multiple reactors. Nuclear energy was never cheap.
I don’t have the numbers but I doubt that France made the better net-deal.
I think a good indicator here is that EDF, after several years of partial privatisation, became an essentially (92%) state-owned company again, partly because it lost €19bn when it suffered nuclear power plant outages on top of scheduled maintenance.
France doesn't really seem to have a plan for when their current fleet of nuclear reactors reaches decommissioning age. There's no schedule for replacing them, much less expanding generation capacity.
Yikes. Terrible video that showcases what's wrong with modern youtube and anti-informative entertainment videos. It could have been a three paragraph blog.
I had to stop watching because of all the cringy tweenertainment funny faces and jerky body movements and hands waving all over the place.
Highly recommended, didn't think I'd watch the whole thing but the production quality was great and it explains everything much better than the wired article.
After your reco after the GP's reco, I would have to agree. This is well done. However, coming from a coding/dev background, it was easy to follow and it all makes sense.
However, it goes to show why hacking will never be made interesting in movies without a bunch of fake nonsense like hacking the Gibson's 3D virtual environment.
“Hackers” is interesting because there’s two depictions of hacking in the same movie.
One is flying through the holographic city of files.
The other depiction is quite realistic: they show the protagonist spending all night reading through many pages of assembly to reverse engineer a virus, people do social engineering, etc. “Hackers” made this seem cool too!
Three years earlier, Sneakers had them going through the trash, setting up a mark on a fake date, and staking out a building and the security company it used with all sorts of stuff, not once looking at a computer screen to "hack".
I like Hackers for the campy side of things, but Sneakers will still take a higher spot on my list.
The best and worst examples were in the same movie, IMO: Nedry's finger-wagging admonishment and all hell breaking loose, then later, "it's a Unix system, I know this!" and some exotic file manager visualization.
Mr Robot has some decent hacking scenes. At least they put up prompt windows with commands that are generic enough to not be hackTheGibson.exe type lame.
I used to work with a couple of the guys who consulted on the technical aspects of Mr Robot. From what I recall, the general idea was to use realistic hacks, but speed through the boring parts to keep the show interesting.
Are you thinking of the one shown in Jurassic park? The scene in hackers was much more CGI, and while I don't doubt it was inspired by fsn, I'd be very surprised if it actually was fsn.
Interesting to see this pop up, especially after watching Jon Gjengset's stream yesterday on
Implementing (parts of) git from scratch in Rust <https://www.youtube.com/watch?v=u0VotuGzD_w>
You should be able to run `zig build run`.