I use Zed and Claude Code side by side right now.
I haven't tried out the newly released agent mode in Zed yet.
Yes, Opus has been good at instruction following, and the same goes for Gemini for second opinions and brainstorming.
They're not perfect, but I definitely see plenty of value in both tools, as long as the services stay reliable.
I don't like the cloud-based nature of the models, as the experience is extremely flaky and unreliable.
I've found OpenAI Codex and the models in Codex to be more reliable in their responses and more consistent in quality of service.
I would still prefer a fully locally hosted equivalent of whatever the state-of-the-art coding assistant models are, to speed up work.
That will take time though, as with every technological evolution.
We will be stuck with time-sharing for some time, haha, until the resource side of this technology scales and economizes enough to become ubiquitous.
This is amazing. Shout out to Anthropic for doing this.
I would like to have a Claude model that isn't nerfed with ethics and values to please the user, and doesn't write overly large plans to impress them.
Sucking up does appear to be a personality trait. Hallucinations are not completely known or well understood yet.
We are past the stage where they're producing random strings of output.
Frontier models can perform an imitation of reasoning, but the hallucination aspect seems to stem more from an inability to learn past their training data or to properly update their neural-net learnings when new evidence is presented.
Hallucinations are beginning to look like a cognitive bias or cognitive deficiency in the model's intelligence, which is more of an architectural problem than a statistical one.
No, it's nothing more than that, and that is the most frustrating part. I agree with your other comment (https://news.ycombinator.com/item?id=44777760#44778294): a confidence metric or a simple "I do not know" could fix a lot of the hallucination.
In the end, <current AI model> is driven towards engagement and delivering an answer, and that drives it towards generating false answers when it doesn't know or doesn't understand.
If its personality were more controllable, getting more humble, less confident answers, or even getting it to say that it doesn't know, would be a lot easier.
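As a toy illustration of the kind of gate I mean (not anything current APIs actually expose): the structured reply format, the self-rated confidence, and the 0.7 threshold below are all assumptions, and self-reported confidence is a crude proxy at best.

```typescript
// Toy confidence gate. The reply shape and the threshold are assumptions,
// not something any current model or API actually provides.

interface ModelReply {
  answer: string;
  confidence: number; // model is asked to self-rate between 0 and 1
}

// What you would send as the system message to get the structured reply.
const SYSTEM_PROMPT =
  'Reply as JSON: {"answer": string, "confidence": number between 0 and 1}. ' +
  "If you are unsure, say so and give a low confidence.";

function gate(reply: ModelReply, threshold = 0.7): string {
  // Below the threshold, prefer an honest "I do not know" over a
  // confident-sounding guess.
  return reply.confidence >= threshold ? reply.answer : "I do not know.";
}
```

In practice you'd want something better grounded than self-rating (token log-probabilities, retrieval checks, etc.), but the gate itself is the point.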
From a quick gander.
WASM isn't for talking to the servers.
WASM can be used to run AI agents in a sandboxed environment in the browser that talk to local LLMs.
For example, in the next few years operating system vendors and PC makers might ship small local models as stock standard to improve OS functions and other services.
That local LLM engine layer could then be used by browser applications too, through WASM, without having to write JavaScript: the WASM sandbox would safely expose the system's LLM engine layer.
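Roughly, the host side of that could look like the sketch below, assuming a local engine that speaks an OpenAI-compatible chat endpoint (as llama.cpp's server or Ollama already do); the URL, model name, and agent.wasm module are hypothetical, and in practice strings cross the WASM boundary through glue like wasm-bindgen or the Component Model rather than directly.

```typescript
// Host-side glue: the sandboxed agent gets exactly one capability, `ask`,
// which forwards prompts to a local LLM endpoint. The URL, model name,
// and agent.wasm are placeholders for illustration.

const LOCAL_LLM = "http://localhost:11434/v1/chat/completions"; // assumed local engine

async function ask(prompt: string): Promise<string> {
  const res = await fetch(LOCAL_LLM, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // placeholder
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

async function runAgent(): Promise<void> {
  // The module only sees what we put in the import object:
  // no DOM, no arbitrary network access, no filesystem.
  const { instance } = await WebAssembly.instantiateStreaming(
    fetch("agent.wasm"), // hypothetical agent compiled from Rust, Go, etc.
    { host: { ask } }
  );
  const run = instance.exports.run as unknown as () => void;
  run();
}

runAgent().catch(console.error);
```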
Agree,
I think aider forces the prompter to develop good prompting habits, like providing the proper context, decomposing the problem, and using the LLM's capabilities properly.
The aider interface and the additional setup it requires make the process more productive over time.
Same experience.
Editor agents work for basic tasks and boilerplate.
Problem solving and decomposition seem to work better in the chat apps.
What I've noticed is that these editors' agent modes rely on the ChatGPT embeddings API and on how the LLM maps the context of the codebase.
Often the LLMs in agent mode ignore the codebase's existing setup, for example its package manager, and use whichever package manager is most popular in their training data.
To summarise, agent-mode editors don't try to fill in their context gaps or be aware of the environment they operate in unless the prompter explicitly tells them to first review the codebase, understand its structure, and then proceed with implementing features.
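One way to make that explicit is to put it in the project's instruction file, e.g. a CLAUDE.md for Claude Code (or the equivalent rules file for whatever editor agent you use). The specifics below (pnpm, the review step) are just placeholders for the kind of constraints I mean.

```markdown
<!-- CLAUDE.md (or your editor agent's rules file): example constraints -->
- Before implementing any feature, review the codebase and summarise its
  structure, build tooling, and conventions.
- Use the package manager already configured in this repo (pnpm); do not
  switch to npm or yarn.
- Do not introduce new dependencies or tools without asking first.
```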