Cody – The AI that knows your entire codebase (sourcegraph.com)
201 points by adocomplete on Aug 26, 2023 | hide | past | favorite | 82 comments


I have been using Cody in VSCode for a couple of months, and I am getting a ton of value out of it.

The key things I love are:

1. It really knows how to summarise a code block. This can be helpful for reviewing code in other projects, or as a refresher on your own, and it misses very little!

2. It is very smart when it comes to filling in gaps in log statements, error messages or code comments.

3. Copy and paste is mostly dead: given a small hint it fills in the gaps for common patterns, is way less error-prone, and follows my prevailing style once the project is up and running.

4. Writing tests. This really surprised me, but a lot of trivial tests, and some not-so-trivial ones, are generated by Cody.

Things which annoy me when using Cody:

1. Suggestions when writing in Markdown are not very helpful; most are wordy and always positive. It is almost impossible to get a negative or even snarky sentence out of it...

2. Inline suggestions are a bit annoying at times. It really doesn't "know your code", or that of your libraries: for common standard library calls it is great, but for anything more complex or obscure it is mostly wrong.

3. It is somewhat bolted onto VSCode using some creative solutions, with VSCode only allowing the more fit-for-purpose APIs to be used by GitHub Copilot, which is sad.

Overall it is doing a lot of the heavy lifting in turning my code into English, and either entirely building tests or fleshing out enough for me to just tweak the code a little, so a big thumbs-up from me.


Great to hear you're getting a ton of value out of it!

On the other parts you mentioned:

Overly positive English prose in Markdown is probably a function of the underlying LLM in use (Claude/GPT-4, plus experimental support for others). I guess the risk is that we overcorrect too far and suddenly the Markdown suggestions are off-putting. If you have any specific examples you'd feel comfortable posting to https://github.com/sourcegraph/cody/discussions/358, that would be helpful.

On inline suggestions (autocomplete), we are under much tighter latency constraints than for chat or commands, so the context used for autocomplete is lighter right now. This is a huge area of effort for us, and we're watching completion acceptance rate really closely. We are making autocomplete use embeddings for context in more cases, and @beyang is adding a fast local context search path as well.
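To make the "embeddings for context" idea above concrete, here is a minimal sketch of how embedding-based retrieval can pick code chunks for a prompt. This is purely illustrative, not Cody's actual implementation; the toy 2-dimensional vectors and the character budget are made-up stand-ins for real embedding vectors and a token budget.

```python
# Illustrative sketch (not Cody's actual code): rank code chunks by
# cosine similarity to a query embedding, then pack the best-matching
# ones into the prompt until a size budget is exhausted.
from math import sqrt

def cosine(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def pick_context(query_vec, chunks, budget_chars=200):
    # chunks: list of (text, embedding) pairs; embeddings are hypothetical.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    picked, used = [], 0
    for text, _ in ranked:
        if used + len(text) > budget_chars:
            break
        picked.append(text)
        used += len(text)
    return picked
```

The latency point in the comment above is visible even in this sketch: scoring every chunk per keystroke is what makes heavyweight context retrieval hard to fit into an autocomplete budget.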

On the VS Code extension APIs used, yes, there are some new proposed APIs that are not yet freely available to extensions that will help. For now, the new `Cody: Fixup` command is much smoother UX than the inline comment `+` icon then typing `/fix whatever`—give that a try and let us know if that is better.

Thanks!


There are tools focused on code generation, and those focused on code analysis or integrity. Their UX/UI can be very different, as can the underlying tech.

I wonder if you have tried tools that are dedicated to Code Integrity, e.g. generating tests?

I wrote a blog about it: https://www.codium.ai/blog/code-integrity-supercharges-code-...

disclaimer: I'm the co-maker of PR-Agent and CodiumAI


About the "know your code" part, I got burned when I asked for an overview of a few header files from an SDK I wasn't familiar with in a local repo. Claude hallucinated badly, making up functions and their descriptions for most files, in a very believable way, though. I've got into the habit of double-checking with ChatGPT, just to be sure.


Marketing hint: Given that you'll be competing with Copilot I'd love to see examples where Copilot fails and Cody does better.

Also, there are several typos on the linked page; you might want to read it over a few more times.


I don't work there but immediately a big difference is Sourcegraph can be self hosted, Copilot cannot.


The prominence of such information cannot be overstated. While Copilot is arguably the most recognized AI code assistant, Cody may offer superior features, such as enhanced privacy. However, this advantage isn't evident due to the lack of direct comparisons on their website. Even a search for "copilot vs cody" yields no dedicated page from Cody addressing the differences.


I've been trying to use Cody, but the user experience has been so bad that I've mostly given up. There is no clear instruction about how to add/remove repos, or how to get Cody to recognize the current repo you're in. And the UI seems poorly planned, with missing, hidden and inconsistent buttons. I spend more time figuring out how to get Cody to work than I save by using it. I think it has potential, but they seriously need to spend a sprint or two dogfooding and thinking about the UI/UX. An AI tool like this shouldn't be so damn hard to use.


Sorry for this bad experience. We are working on improving this (code/activity at https://github.com/sourcegraph/cody). In particular, we’re making it so you can do this all from your editor more easily.


I also went to try it out and gave up in frustration. Getting "embeddings" in particular seems critical to having it work well, but also extraordinarily hard to try out; I'm not actually sure whether you are required to pay Sourcegraph money for that feature before you can use it.


Sorry. You don't need to pay for embeddings. What did not work for you? If the desktop app is where you ran into trouble, you can start with the editor extension (see https://github.com/sourcegraph/cody#readme) and just sign in via Sourcegraph.com instead of your local desktop app.


My first impulse with this new wave of tooling is to poke at the privacy policy, especially given there's only the free trial version available...

https://about.sourcegraph.com/terms/cody-notice seems generally reasonable though, so that's appreciated. Wouldn't mind more transparency/confirmation on who the partner LLM entities are (Anthropic?), and some verbiage on the downstream impacts of their terms.

Should be interesting to give it a test-run with some embedded projects.


According to the FAQ,

"Cody has one third-party dependency, which is Anthropic's Claude API. In the config, this can be replaced with OpenAI API."

https://docs.sourcegraph.com/cody/faq


Oobabooga and FastChat support OpenAI-compatible APIs, so can we use Cody with open-source local LLMs?


We've got some experimental work to support that. Not merged yet, but you can follow https://github.com/sourcegraph/cody. I've been polling Twitter (https://twitter.com/sqs/status/1675433337354330113) to see how much people actually are using self-hosted LLMs for code completion already, and it seems like not much yet, but Code Llama with infill is a big advance that we're quite excited about.
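For readers wondering what "OpenAI-compatible API" means in practice: local servers like the ones Oobabooga and FastChat expose accept the same request shape as OpenAI's completions endpoint, so a client only needs a different base URL. A hedged sketch, using only the Python standard library; the localhost URL, port, and model name are assumptions, and the `/v1/completions` path and field names follow the OpenAI completions API convention.

```python
# Sketch: building and sending an OpenAI-style completion request to a
# local server. The endpoint path and payload fields follow the OpenAI
# completions API; the base URL and model name are hypothetical.
import json
import urllib.request

def build_completion_request(base_url, model, prompt, max_tokens=64):
    # Assemble the URL and JSON payload an OpenAI-compatible server expects.
    url = f"{base_url.rstrip('/')}/v1/completions"
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return url, payload

def complete(base_url, model, prompt):
    url, payload = build_completion_request(base_url, model, prompt)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Requires a running server, e.g. a local FastChat instance.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

A tool that lets you override the base URL in config can therefore target a self-hosted model without any other client changes.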


Looks like a nicely implemented product; unfortunately my experience with Claude is that it's nowhere near GPT-4 in terms of coding capabilities.


Does "Open"AI really have a near-monopoly on powerful AI tools these days? It seems like all of these various AI projects are just a wrapper / frontend that calls OpenAI APIs.


I think so but a few open models have closed the gap to maybe around 16% on HumanEval.


Presumably `Anthropic and OpenAI` according to various documentation.


We use OpenAI for their embeddings API.

We should have this in our docs, but for now you can read this whitepaper if you are interested: https://about.sourcegraph.com/whitepaper/cody-context-archit...


As a Sublime Text user I really wish more of these tools worked outside of VS Code.


Last week, I forced myself to move from Sublime Text, which I used professionally for a decade, to VSCode. Sublime's underwhelming community output, bogged down by fundamental limitations such as a very limited extension UI API, ultimately results in a second-rate experience compared to VSCode, where all the cool new experimental extensions appear earlier, if at all. You lose productivity because the community never reached the critical mass necessary to produce as many good extensions as VSCode has. And I can finally perform search and replace without manually saving edits in dozens of files.


Maybe you already know this, but when I used to use sublime I'd save those dozens of files with cmd/ctrl-alt-s (save all open files). I think this is a universal-ish command for well-behaved applications on both windows and mac.

I miss Sublime for how slick the UI feels, but I abandoned it for the same reasons as you.


Hi, we're tracking Sublime requests here: https://github.com/sourcegraph/cody/discussions/10


There needs to be an LSP-like protocol for that sort of stuff.


My open-source AI pair-programming tool runs in the terminal. You can work alongside it in whichever editor you prefer.

https://github.com/paul-gauthier/aider


open sublime> start working

open vsc> 22 update popups and notifications + permission nags. > mojo gone


? I never have any update popups in Code. Are you on Windows?

The only notification upon init is Copilot requesting Github login which probably uses an OAuth token that expires quickly.


> mojo gone

There's an extension for that too: "Mojo Post-upgrade AutoReloader" /s


VS Code is the Internet Explorer of code editors.


Could always switch!


Never.


Is this similar to cursor.sh, or does it have advantages over that product?

It's hard to tell if "knows your entire codebase" refers to small-context-window retrieval-augmented generation or something like loading all your code into Claude 100k.

Also, an ask: a voice-to-text input would be earthshatteringly cool. I am waiting with bated breath for the coming moment where it becomes possible to code for extended periods by rubberducking alone.


It does have your entire codebase context using the Claude 100k context model. You can have a look at this white paper for more info about how Cody uses the right context: https://about.sourcegraph.com/whitepaper/cody-context-archit...


I use Talon with Cursorless for developing by voice. Obviously there's a learning curve, but it's actually fun and efficient.


My open-source AI coding tool has voice-to-code support.

https://aider.chat/docs/voice.html

Plus it also "knows your entire codebase" too:

https://aider.chat/docs/ctags.html


I do more scientific computing, and I've found it amazingly, surprisingly good. Academic programming has some quirks, like using a single-character variable name passed along through multiple functions. So if I'm three functions deep, I have a hard time tracking it; Cody gets it right very often.

Even some versions of "I want to do this, how would I modify the code to make it happen" work pretty well.


That is so awesome to hear!


Fedora 38 Xfce.

./cody_2023.7.11+1384.7d20a90ce7_amd64.AppImage

cody: error while loading shared libraries: libOpenGL.so.0: cannot open shared object file: No such file or directory

Solution:

sudo dnf install libglvnd-opengl


This is starting to become a competitive field, and I have no idea how to compare these tools, really. It seems like most of them are some kind of GPT variant in different skins.

I’m looking forward to being able to form opinions on AI models at some point, but I’m not there yet. I can only try them and see how well they work, but as yet have no objective basis of comparison.

I’m sure folks more entrenched in the research are better equipped…


So it uploads my code to their server? That's a deal breaker for me.


I tried out the chat interface of this briefly with a project I built with a rather (ahem) convoluted codebase and I was actually pretty impressed. It was able to answer questions about how the code works pretty well. Asking it how to make changes varied in quality of results, but yeah, pretty cool stuff!


Can't be worse than GitHub Copilot, so I'm giving it a try.


I'm just curious -- what specific issues have you had with Copilot?

I use it constantly (probably accepting a suggestion for every few seconds of typing), to the point where I'm starting to just not think about it anymore like I previously didn't think about normal autocomplete. It has written hundreds of perfect unit tests for me (perfect as in I have read the code and it matches virtually exactly what I would have written down to comments and variable names). That's so much mundane low-value typing that I just didn't have to do. Write the name of the test, hit tab, review, and move to the next.

It's certainly not perfect, and you must ALWAYS review what it suggests rather than blindly accepting, but it's incredibly valuable imo.

For reference, I use Copilot in both Rider and VS Code.


> what specific issues have you had with Copilot?

If I put a single swear word in my code it refuses to complete anymore


Do you honestly think copilot is bad? I have really great experience with it.


Yeah I'm using gh copilot all the time, it's very effective for me. Saves me a ton of time.


We look forward to your feedback :)


This freaks me out. Even if it's not that good now, it looks like it's only going to get better and essentially either force programmers out of a job or force programmers to adopt non-free software to do their jobs. Grim.


Think about it this way – it will get better at turning natural language into code, but natural language will never get better than code for the job of theory-building behind a full-fledged application, just because its higher-level semantics are not as good for the business logic as code itself. Natural language is genetically bound with emotion and sentiment. It is inconsistent in the wildest ways, and words can have myriads of meanings. The best code craftsmen "think" code straight away, without the natural language shenanigans around it.


I personally don't see it that way. I see it as a tool to help us be more productive as programmers. So much of my day to day when writing code is getting context, looking up best practices, and then writing code. With Cody I can get up to speed much faster and be more productive as a developer, giving me more time to do the things I want to do.


Is it possible to use our own LLM account if we have Anthropic or OpenAI accounts?


It's something we'd like to add (and you can follow along at https://github.com/sourcegraph/cody), but in beta we're still focused on making it really lovable on the most popular config path rather than exploding the number of possible ways people could configure Cody. We do support "bring your own LLM account" for customers using Cody for all their devs, but that hasn't yet made its way to client-side config in the editor extension.


Got it, thanks! For context, I'm at Credal.ai (an LLM governance platform), and we have a shared customer who I'm pretty sure does fall into that description with you (uses Cody for all their devs), and keeps asking us whether they can use Cody in conjunction with Credal.ai (bring-your-own-LLM-account is probably all that would be needed to support this). We'd also be open to contributing this, for what it's worth!


Awesome. Check out the existing docs for "bring your own LLM account" on the server side at https://docs.sourcegraph.com/cody/overview/enable-cody-enter..., and let us know if that suffices or if more is needed. You can also email me (email in profile) if it would help to mention customer names in private.


How outdated is the dataset used to train Cody? Something I find frustrating with GPT-4 is that its knowledge is still stuck in 2021, making it a pain to use with more recent libraries or frameworks. I've been doing some Unreal Engine development this year, and while ChatGPT knows UE4 really well, it often provides me with outdated APIs or concepts when I'm trying to learn UE5 C++. If Cody can directly learn from UE5 sources, that would be awesome.


The dataset used to "train" it is from Anthropic's Claude LLM (so vaguely up to date as of year-end 2022).

The premise of Cody is the ability to pass entire repos in via embeddings and surface only the relevant functions in various code snippets to pass into the LLM prompt.
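A prerequisite for the embed-the-whole-repo approach described above is splitting each file into overlapping chunks before embedding them. Here is a rough sketch of that step; the chunk size and overlap values are invented for illustration and are not Cody's actual parameters.

```python
# Illustrative chunker (not Cody's actual code): split a file into
# fixed-size, overlapping windows so that each chunk can be embedded
# separately and a function straddling a boundary still appears whole
# in at least one chunk.
def chunk_file(text, size=40, overlap=10):
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        # Step forward by less than the window size to create overlap.
        start += size - overlap
    return chunks
```

Each chunk would then be embedded once at index time, so at question time only the query needs a fresh embedding before similarity search.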


Tried using it yesterday. VIM mode stopped working in VS Code and then my cursor started moving around randomly and wouldn't stop. Uninstalled instantly.


I’m sorry about that. Our code (https://github.com/sourcegraph/cody), which is open source (Apache 2), doesn’t do anything to move the cursor, so I have no idea how that would have happened. Would you be willing to try to repro that in an issue or give us some more info so we can look into it?


I did 2 successful completions in VSCodium, then I checked and the Cody app said I used 17! I disabled autocompletions. Is there a way to manually trigger autocompletion when I'm ready instead of what I'm assuming is rapid-firing autocompletion requests while I'm typing?


If you're worried about hitting your rate limit right now, don't worry. We are lenient; you can join our Discord and ask for a higher one if needed. Unfortunately, right now you can disable autocomplete entirely in the Cody menu in the status bar, but you can't make it manual-trigger-only. That's a great idea that we should implement.


It would be very helpful if we had articles comparing this and other code-assist solutions (mainly Copilot).

When marketing Cody, ideally it would have a comparison and statements about the ease of switching from Copilot. Happy Copilot customers are your most likely adopters.


I would use it if the LLM they used were better. It keeps inventing files that don't exist. If I could just provide my GPT-4 API key so it can use GPT-4 I'd be happy.


They use Claude 2 as the AI backend, with a 100K context window. In my experience Claude is better at programming than GPT-4 because it has a more recent cutoff date and because of the context window for processing knowledge. Hallucination happens with all LLMs.


Highly recommend filing issues on their repo if you have anything reproducible. I've had several bugs I filed get fixed this way.


Awesome. The repository is at https://github.com/sourcegraph/cody for anyone who hasn't seen it yet.


https://github.com/sourcegraph/zoekt seems to be doing a fair amount of heavy lifting for Cody.


How does it compare to Copilot and CodeWhisperer? Given that those two services are distributed by software giants that also sell infrastructure, why would someone pick Cody?


I would love to try this out, but I would like to see a headless version of the desktop app as I do all my programming over SSH.


There's an (experimental) Cody CLI: https://github.com/sourcegraph/cody/tree/main/cli. If you try it and like it, let us know over on that repo! (Or if by headless you just mean "editor extension only", just install the Cody VS Code extension and sign into Sourcegraph.com, and it'll work without the desktop app. The response quality will be slightly lower than if you have the desktop app, but we're working on that and it should be at parity soon.)


> Free forever for individual devs on public and private code, with a generous rate limit.

What's the catch?


I don't have a work email, and there's no support link on your page. When I install it in VS Code and press the chat to ask it something, the bottom right says "sign in"; I press it and it doesn't do anything.


I used it for something, and it looks like without their desktop app it's meh.


We're working on rolling that all into the editor extension itself, so if you just have the Cody editor extension now, you'll see it getting better context. You can follow along at https://github.com/sourcegraph/cody.


Will that allow me to ask about certain files and imports?


Yes, it will. And if you have specific things in mind you want to ask it (or kinds of code you want it to generate for you), we always love to hear that kind of stuff (in a discussion topic at https://github.com/sourcegraph/cody or on our Discord).


Does Cody call GPT-4 API?


Any plans for an Emacs package?


A couple WIP ones: https://github.com/keegancsmith/emacs-cody and https://github.com/j-shilling/lsp-cody. Our head of eng (Steve Yegge) is going to start hacking on one this week, I hear. :)


Out of curiosity, did they have the rights to use GoT characters in this demo commercial?


No need.


Can you elaborate more than just those two words?



