
In our experimentation, we've found that it really depends on what you're looking for. That is, you really need to break down evaluation by task. Local models don't have the power yet to just "do it all well" like GPT-4.

There are open source models that are fine tuned for different tasks, and if you're able to pick a specific model for a specific use case you'll get better results.

---

For example, for chat there are models like `mpt-7b-chat`, `GPT4All-13B-snoozy`, or `vicuna` that do okay at conversation but are not great at reasoning or code.

Other models, like `mpt-7b-instruct`, are designed for direct instruction following but are worse at chat.

Meanwhile, there are models designed for code completion, like Replit's models and HuggingFace's `starcoder`, that do decently at programming but not at other tasks.
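The "pick a specific model for a specific use case" idea above can be sketched as a tiny routing table. The task names and model assignments here are just illustrative placeholders, not benchmark results:

```python
# Toy sketch of per-task model routing: choose a specialized local model
# per use case instead of one general-purpose model.
TASK_MODELS = {
    "chat": "mpt-7b-chat",         # or GPT4All-13B-snoozy, vicuna
    "instruct": "mpt-7b-instruct",
    "code": "starcoder",           # or a Replit code model
}

def pick_model(task: str) -> str:
    """Return the local model name configured for this task."""
    try:
        return TASK_MODELS[task]
    except KeyError:
        raise ValueError(f"no local model configured for task {task!r}")

print(pick_model("code"))  # starcoder
```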

---

For a UI, the easiest way to get a feel for the quality of each of the models (or the chat models, at least) is probably https://gpt4all.io/.

And as others have mentioned, for providing an API that's compatible with OpenAI, https://github.com/go-skynet/LocalAI seems to be the frontrunner at the moment.
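Because LocalAI mirrors the OpenAI REST routes, switching a client over is mostly a matter of pointing requests at the local host. A minimal sketch, assuming a server at `http://localhost:8080` with a model named `ggml-gpt4all-j` loaded (both are placeholders for whatever your own setup uses):

```python
import json

def build_chat_request(base_url: str, model: str, prompt: str):
    """Build an OpenAI-style chat completion request for a local server."""
    url = f"{base_url}/v1/chat/completions"  # same route the OpenAI API uses
    headers = {"Content-Type": "application/json"}
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_chat_request(
    "http://localhost:8080", "ggml-gpt4all-j", "Hello!"
)
# Send with any HTTP client (urllib.request, requests, or the openai SDK
# with its base URL pointed at the local server).
print(url)  # http://localhost:8080/v1/chat/completions
```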

---

For the project I'm working on (in bio), we're currently struggling with this problem too, since we want a nice UI, good performance, and the ability for people to keep their data local.

So at least for the moment, there's no single drop-in replacement for all tasks. But things are changing every week and every day, and I believe that open-source and local can be competitive in the end.


