It is kind of funny that throughout my career, there has always been pretty much a consensus that lines of code are a bad metric, but now with all the AI hype, suddenly everybody is again like “Look at all the lines of code it writes!!”
I use LLMs all day every day, but measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
It all comes from "if you can't measure it, you can't improve it". The job of management is to improve things, which means they need to measure things, so they go looking for measures. On an assembly line there are lots of things to measure and improve, and improving many of those things has shown great value.
They want to expand that value into engineering, so they're looking for something they can measure. I haven't seen anyone answer what can be measured to drive a useful improvement, though. I have a good "feeling" that some people I work with are better than others, but most are not so bad that we should fire them - and I don't know how to turn that feeling into something objective.
Yes, the problem of accurately measuring software "productivity" has stymied the entire industry for decades, but people keep trying. It's conceivable that you might be able to get some sort of more-usable metric out of some systematized AI analysis of code changes, which would be pretty ironic.
Ballmer hasn’t been around for a long, long time - not since the Red Ring of Death days. Ever since Satya took the reins, MBAs have filled upper and middle management and tried to take over open source so that the sales guys had something to combat Red Hat. Great for open source. Bad for Microsoft. However, Satya comes from the Cloud division, so he knows how to do cloud and do it well. Azure is a hit with the enterprise. Then along comes AI…
Microsoft lost its way with Windows Phone, Zune, the Xbox 360 RRoD, and Kinect. They haven’t had relevance outside of Windows (desktop) in the home for years, with the sole exception of Xbox.
They have pockets of excellence where great engineers are doing great work, but outside those little pockets, no one knows.
I believe the "look at all the lines of code" argument for LLMs is not a way to showcase intelligence, but more a way to showcase time saved. Under the assumption that the output is a correct solution, it's a way to say "look at all the code I would have had to write, it saved so much time".
It's all contextual. Sometimes, particularly with modern frontends, you have inescapable boilerplate and lines of code to write. That's where it saves time. Another example is scaffolding out unit tests for a series of services. There are many such cases where it just objectively saves time.
I wonder if we can use the compression ratio that an LLM-driven compressor could generate to figure out how much entropy is actually in the system and how much is just boilerplate.
Of course then someone is just going to pregenerate a random number lookup table and get a few gigs of 'value' from pure garbage...
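As a rough illustration of that compression idea, here's a minimal sketch. It uses zlib as a crude stand-in for an LLM-driven compressor (a real version would use something like token log-probabilities from a model), and the `src` directory and `.py` glob are just placeholder assumptions about a project layout.

```python
# Crude proxy for the idea: treat compression ratio as a measure of how much
# of a codebase is boilerplate vs. actual information content.
import zlib
from pathlib import Path

def boilerplate_ratio(source: str) -> float:
    """Fraction of bytes the compressor squeezes out (higher = more boilerplate-like)."""
    raw = source.encode("utf-8")
    compressed = zlib.compress(raw, level=9)
    return 1 - len(compressed) / len(raw)

if __name__ == "__main__":
    for path in Path("src").rglob("*.py"):  # hypothetical project layout
        text = path.read_text(encoding="utf-8", errors="ignore")
        if text:
            print(f"{path}: {boilerplate_ratio(text):.2%} squeezed out")
```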
It's still a bad metric, and OP is also just being loose by repeating some marketing / LinkedIn post from a person who uses bad metrics about an overhyped subject.
Ironically, AI may help get past that. To measure "value chunks" or some other metric where LoC is flexibly multiplied by some factor of feature accomplishment, quality, and/or architectural importance, you need an opinion about the section of code in question, and an overseer AI could maybe provide that.
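Purely as an illustration of the shape such a metric could take (the fields and weights here are made up, and the scores would have to come from that hypothetical overseer model):

```python
# Hypothetical "value chunks" score: LoC weighted by per-change ratings.
from dataclasses import dataclass

@dataclass
class Change:
    lines: int
    feature_value: float    # 0..1, did it accomplish something users need?
    quality: float          # 0..1, tests, readability, etc.
    arch_importance: float  # 0..1, how load-bearing is this code?

def value_chunks(changes: list[Change]) -> float:
    # Illustrative weights only; nothing principled about 0.5 / 0.3 / 0.2.
    return sum(
        c.lines * (0.5 * c.feature_value + 0.3 * c.quality + 0.2 * c.arch_importance)
        for c in changes
    )
```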
My favorite movie quote as it pertains to software engineering has for a long time been Jurassic Park's: “Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.”
That’s how I feel about a lot of AI-powered development. Just because you can have 10 parallel agents cranking out features 24/7 and have AI write 100% of the code, that doesn’t mean you’re actually building a product that users want and/or that is a viable business.
I’m currently in this situation, working on a greenfield project as founder/solo dev. Yes, AI has been tremendously useful in speeding things up, especially in patching over smaller knowledge gaps of mine.
But in the end, as in all the projects before in my career, building the MVP has rarely been the hard part of starting a company.
In my experience with a combo of Claude Code and Gemini Pro (and having added Codex to the mix about a week ago as well), it matters less whether it’s CLI, backend, frontend, DB queries, etc. and more how cookiecutter the thing you’re building is. For building CRUD views or common web application flows, it crushes it, especially if you can point it at a folder and just tell it to do more of the same, adapted to a new use case.
But yes, the more specific you get and the more moving pieces you have, the more you need to break things down into baby steps: when you don’t just need it to make A work, but to make A work together with B and C. Especially given how eager Claude is to find cheap workarounds and escape hatches, botching things together in whatever way seems to please the prompter as fast as possible.
I'm not against AI art per se, but at least so far, most “AI artists” I see online seem to care very little about the artistry of what they’re doing, and much much more about selling their stuff.
Among the traditional artists I follow, maybe 1 out of 10 posts is directly about selling something. With AI artists, it’s more like 9 out of 10.
It might take a while for all the grifters to realize that making a living from creative work is very hard; only then will more genuinely interesting AI art start to surface. I started following a few because I liked an image that showed up in my feed, but quickly unfollowed after being hit with a daily barrage of NFT promotions.
Do you think that for someone who only needs careful, methodical identification of “problems” occasionally, like a couple of times per day, the $20/month plan gets you anywhere, or do you need the $200 plan just to get access to this?
I've had the $20/month plan for a few months alongside a Max subscription to Claude; the cheap Codex plan goes a really long way. I use it a few times a day for debugging, finding bugs, and reviewing my work. I've run out of usage a couple of times, but only when I lean on it way more than I should.
I only ever use it on the high reasoning mode, for what it's worth. I'm sure it's even less of a problem if you turn it down.
Listening to Dario at the NYT DealBook summit, and reading between the lines a bit, it seems like he is basically saying Anthropic is trying to be a responsible, sustainable business and charging customers accordingly, and insinuating that OpenAI is being much more reckless, financially.
I think it's difficult to estimate how profitable both are - depends too much on usage and that varies so much.
I think it is widely accepted that Anthropic is doing very well in enterprise adoption of Claude Code.
In most of those cases that is paid via API key not by subscription so the business model works differently - it doesn't rely on low usage users subsidizing high usage users.
OTOH OpenAI is way ahead on consumer usage - which also includes Codex even if most consumers don't use it.
I don't think it matters - just make use of the best model at the best price. At the moment Codex 5.2 seems best at the mid-price range, while Opus seems slightly stronger than Codex Max (but too expensive to use for many things).
I would argue that many engineering “best practices” have become much more important much earlier in projects. Personally, I can deal with a lot of jank and lack of documentation in an early-stage codebase, but LLMs get lost so quickly, or they just multiply the jank faster than anyone ever could have in the past, making it much, much worse for both LLMs and humans.
Documentation, variable naming, automated tests, specs, type checks, linting. Anything the agent can bang its proverbial head against in a loop for a while without involving you every step of the way.
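A minimal sketch of the kind of feedback loop meant here: one script the agent can re-run on its own until everything is green, without involving you. The specific tools (mypy, ruff, pytest) are just examples of what a project might already use, not a prescription.

```python
# One command an agent can bang its head against: run type checks, lint,
# and tests, and report the first failure so it knows what to fix next.
import subprocess
import sys

CHECKS = [
    ["mypy", "."],           # type checks
    ["ruff", "check", "."],  # linting
    ["pytest", "-q"],        # automated tests
]

def run_checks() -> int:
    for cmd in CHECKS:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(cmd)}", file=sys.stderr)
            return result.returncode
    print("all checks passed")
    return 0

if __name__ == "__main__":
    sys.exit(run_checks())
```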
I don't think Google is bad at building products. They definitely are excellent at scaling products.
But I reckon part of the sentiment stems from many of the more famous Google products originally being acquisitions (Android, YouTube, Maps, Docs, Sheets, DeepMind) or originally being built by individual contributors internally (Gmail).
Then there were also several times where Google came out with multiple different products with similar names replacing each other, like when they had I don't know how many variants of chat and meeting apps replacing each other in a short period of time. And now the same thing with all the different confusing Gemini offerings. Which leads to the impression that they don't know what they are doing product-wise.
Starting with an acquisition is a cheap way of accelerating once your company reaches a certain size.
Look at Microsoft - PowerPoint was an acquisition. They bought most of the team that designed and built Windows NT from DEC. FrontPage was an acquisition, and Azure came after AWS and was led by a series of people brought in through acquisitions (Ray Ozzie, Mark Russinovich, etc.). It's how things happen when you're that big.
Because those were "free time" projects. The company didn't direct anyone to do them; somebody at the company, using their flex time, just thought it was a good idea and did it. Googlers don't get this benefit any more for some reason.
Leadership's direction at the time was to spend 20% of your time on unstructured exploration and cool ideas like that, though the other poster makes a good point that this is no longer the policy.
I did the same, and switched to Apple Music. Soon after that, Apple Music started injecting their F1 movie soundtrack into suggested music for me. There really is no escape from this. They haven't done it since, so at least it's not as bad as what Spotify does. If I come across a good offline music player for iOS I will probably cancel my Apple Music subscription.
Last time I checked the Sky TV financials, subscription revenue outweighed advertising revenue about 10:1, i.e. for every £30 of subscription revenue they took £3 in adverts.
I agree that people shouldn’t rely solely on AI to decide how to vote.
Unfortunately, given the sorry state of the internet, wrecked by algorithms and people gaming them, I wouldn’t be surprised if AI answers were on average no more or even less biased than what people find through quick Google searches or see on their social media feeds. At least on the basics of a given topic.
The problem is not AI, but that it takes quite a bit of effort to make informed decisions in life.
In my opinion, if those influential programmers actually architected around some concrete metrics like 1,000 TPS and 10K daily users, they would end up with much simpler systems.
The problem I see is much more about extremely vague notions of scalability, trends, best practices, clean code, and so on. For example: "we need Kafka, because Kafka is for the big boys like us" - not because the alternatives couldn’t handle the actual numbers.
CV-driven development is a much bigger issue than people picking overly ambitious target numbers.
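For a sense of scale, a quick back-of-envelope on numbers like those above (the per-user request count and peak factor here are illustrative assumptions, not from the comment):

```python
# 10K daily users, ~50 requests per user per day, traffic bunched into a
# 10x peak over the daily average.
daily_users = 10_000
requests_per_user = 50
peak_factor = 10

avg_rps = daily_users * requests_per_user / 86_400
peak_rps = avg_rps * peak_factor
print(f"average ~{avg_rps:.1f} req/s, peak ~{peak_rps:.0f} req/s")
# => average ~5.8 req/s, peak ~58 req/s - comfortably within reach of a
#    single database and a plain queue, no Kafka required.
```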