What example do you need? On every single benchmark, AI is getting better and better.
Before someone says "but benchmarks don't reflect the real world...", please name what metric you think is meaningful if not benchmarks. Token consumption? OpenAI/Anthropic revenue?
Whenever I try to use a "state of the art" LLM to generate code, it takes longer to get a worse result than if I had just written the code myself from the start. That's the experience of every good dev I know. So that's my benchmark. AI benchmarks are BS marketing gimmicks designed to give the appearance of progress - there are tremendous perverse financial incentives.
This will never change: you can only use an LLM to generate code (or any other type of output) you already know how to produce and are an expert in, because you can never trust the output.
Regarding code changes, especially small ones (say, 50 lines spread across 5 files): if you can't get an agent to make nearly exactly the code changes you want, just faster than you would, that's a you problem at this point. If it would take you maybe 15 minutes, grok-code-fast-1 can do it in 2.
Right. With careful use of AI, I can have it gather information to help me make better designs (like giving me summaries of the best frameworks or libraries currently available for a given project), but as for generating an architecture and then generating the code, devops, and so on for it? It's just not there, unless you're creating an app that effectively already exists, like some basic CRUD app.
If you're creating basic CRUDs, what on earth are you doing? That kind of thing should have been automated a long time ago.
CRUD apps are ridiculously simple and have been in existence my entire life. Yet it is surprisingly difficult to make a basic CRUD and host it somewhere. The bulk of useful but simple business apps are just a CRUD with a tiny bit of customisation and integration around them.
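To make that concrete, here is roughly everything a basic CRUD amounts to - a minimal sketch, assuming Flask and SQLite purely as examples (the "items" table, file name, and routes are all made up for illustration):

    import sqlite3
    from flask import Flask, g, jsonify, request

    app = Flask(__name__)
    DB_PATH = "items.db"  # hypothetical database file

    def get_db():
        # One SQLite connection per request, kept on Flask's request context.
        if "db" not in g:
            g.db = sqlite3.connect(DB_PATH)
            g.db.row_factory = sqlite3.Row
            g.db.execute(
                "CREATE TABLE IF NOT EXISTS items "
                "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
            )
        return g.db

    @app.teardown_appcontext
    def close_db(exc):
        db = g.pop("db", None)
        if db is not None:
            db.close()

    @app.post("/items")  # Create
    def create_item():
        name = request.get_json(force=True)["name"]
        cur = get_db().execute("INSERT INTO items (name) VALUES (?)", (name,))
        get_db().commit()
        return jsonify(id=cur.lastrowid, name=name), 201

    @app.get("/items/<int:item_id>")  # Read
    def read_item(item_id):
        row = get_db().execute(
            "SELECT * FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        return (jsonify(dict(row)), 200) if row else ("not found", 404)

    @app.put("/items/<int:item_id>")  # Update
    def update_item(item_id):
        name = request.get_json(force=True)["name"]
        get_db().execute("UPDATE items SET name = ? WHERE id = ?", (name, item_id))
        get_db().commit()
        return jsonify(id=item_id, name=name)

    @app.delete("/items/<int:item_id>")  # Delete
    def delete_item(item_id):
        get_db().execute("DELETE FROM items WHERE id = ?", (item_id,))
        get_db().commit()
        return "", 204

And that's the point: the CRUD itself is the trivial part. The surprisingly difficult part is everything around it - hosting, auth, backups, and that tiny bit of customisation and integration.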
It is true that LLMs make it easier to build these kinds of things without having to become a competent programmer first.
AI is getting better at every benchmark. Please ignore that we're not allowed to see these benchmarks and also ignore that the companies in question are creating the benchmarks that are being exceeded.
What metrics that aren't controlled by the industry show AI getting better? Genuinely curious, because those "ranking sites" seem to me to be infested with venture capital, so hardly fair or unbiased. The only reports I hear from academia are the ones that are overly negative on AI.
AI is very satisfied doing the job; just ask it.
AI is able to speed up progress, to free up resources, to give people the most important thing they have - time. The fact that these incredible gifts are misused (or used inefficiently) is not a problem with AI. That would be like complaining that the objective positive of increased food production is actually a negative because people are getting fatter.
You misunderstood. This is how the conversation went:
1. Is there steady progress in AI?
2. What example do you need? On every single benchmark, AI is getting better and better.
3. Job satisfaction and human flourishing.
Hence my answer "AI is very satisfied doing the job; just ask it." It came about because of the stupid comment 3, which tried to link blame to unrelated things (akin to pointing at obesity when asked what metrics make someone say that agriculture/transportation have made no progress in the last 100 years) and at the same time anthropomorphized AI. I only accepted the premise and continued answering on the same level in order to demonstrate the stupidity of their answer.