
I have several local models I hit up first (Mixtral, Llama); if I don’t like the results, then I’ll give the same prompt to Claude and GPT.

Overall though it’s really just for reference and/or telling me about some standard library function I didn’t know of.

Somewhat counterintuitively, I spend way more time reading language documentation than I used to, as the LLM is mainly useful in pointing me to language features.

After a few very bad experiences I never let an LLM write more than a couple of lines of boilerplate for me, but as well-read assistants they are useful.

But none of them are sufficient alone, you do need a “team” of them - which is why I also don’t see the value in spending this much on one model. I’d spend that much on a system that polled 5 models concurrently and came up with a summary of sorts.
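
A minimal sketch of that "poll several models, then summarize" idea; query_model() and the model names here are placeholders for whatever local or hosted backends you actually have wired up, not a real client:

    from concurrent.futures import ThreadPoolExecutor

    MODELS = ["mixtral", "llama3", "claude", "gpt"]  # whatever your "team" is

    def query_model(model: str, prompt: str) -> str:
        # Stub: call your Ollama server or provider SDK here.
        raise NotImplementedError

    def poll_and_summarize(prompt: str, summarizer: str = "mixtral") -> str:
        # Ask every model the same prompt concurrently.
        with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
            answers = list(pool.map(lambda m: query_model(m, prompt), MODELS))
        combined = "\n\n".join(f"[{m}]\n{a}" for m, a in zip(MODELS, answers))
        # Let one model write the digest of where the answers agree/disagree.
        return query_model(summarizer,
                           "Summarize where these answers agree and disagree:\n\n" + combined)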



People keep talking about using LLMs for writing code, and they might be useful for that, but I've found them much more useful for explaining human-written code than anything else, especially in languages/frameworks outside my core competency.

E.g. "why does this (random code in a framework I haven't used much) code cause this error?"

About 50% of the time I get a helpful response straight away that saves me trawling through Stack Overflow and random blog posts. About 25% of the time the response is at least partially wrong, but it still helps me get on the right track.

The other 25% of the time the LLM has no idea and won't admit it, so I end up wasting a small amount of time going round in circles, but overall it's a significant productivity boost when I'm working on unfamiliar code.


Right on, I like to use local models - even though I also use OpenAI, Anthropic, and Google Gemini.

I often use one- or two-shot examples in prompts, but with small local models it is also fairly simple to do fine-tuning - if you have fine-tuning examples, and if you are a developer, because you have to get the training data into the correct format, and that format changes between the models you are fine-tuning.
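
A toy example of what the data-formatting step can look like: dumping prompt/completion pairs as chat-style JSONL. The exact field names and chat template vary by model and training tool, so treat this purely as an illustration:

    import json

    examples = [
        {"prompt": "What does Python's textwrap.dedent do?",
         "completion": "It removes common leading whitespace from every line of a string."},
    ]

    with open("train.jsonl", "w") as f:
        for ex in examples:
            record = {"messages": [
                {"role": "user", "content": ex["prompt"]},
                {"role": "assistant", "content": ex["completion"]},
            ]}
            f.write(json.dumps(record) + "\n")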


> But none of them are sufficient alone, you do need a “team” of them

Given how sensitive the models are to parameters and prompts, your "team" can just as easily be the same LLM queried multiple times with different system prompts.
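
Roughly like this - one model, several "personas" via different system prompts, then compare the answers. ask() is a placeholder for whatever client you actually use:

    SYSTEM_PROMPTS = [
        "You are a cautious code reviewer; list what could go wrong.",
        "You are an optimistic pair programmer; propose a concrete fix.",
        "You answer only by pointing at relevant official documentation.",
    ]

    def ask(system_prompt: str, question: str) -> str:
        raise NotImplementedError  # wire up your actual LLM client here

    def team_of_one(question: str) -> list[str]:
        # Same model each time, just a different system prompt per call.
        return [ask(p, question) for p in SYSTEM_PROMPTS]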


Another factor is that I use a local LLM first because I don’t trust any of the companies to protect my data or software IP.


What model sizes do you run locally? Anything that would work on a 16GB M1?


I have a 32GB M2, but most local models I use fit on my old 8GB M1 laptop.

I can run the QwQ 32B model with Q4 quantization on my 32GB M2.
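
Back-of-the-envelope check (my numbers, assuming roughly 4.5 bits per weight for a Q4-style quant) of why that fits:

    params = 32e9                 # ~32 billion parameters
    bits_per_weight = 4.5         # typical Q4_K_M-style quantization, roughly
    weights_gb = params * bits_per_weight / 8 / 1e9
    print(f"~{weights_gb:.0f} GB for weights alone")   # ~18 GB, leaving room for KV cache etc.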

I suggest using https://Ollama.com on Mac, Windows, and Linux. I experimented with all the options on Apple Silicon and liked Ollama best.
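
For anyone who hasn't tried it, a minimal example of hitting a local Ollama server from Python. This assumes Ollama is running on its default port and the model has already been pulled; swap in whatever model name you actually use:

    import json, urllib.request

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": "qwq",
                         "prompt": "Explain mmap in one paragraph.",
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])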


I have an A6000 with 48GB of VRAM in a local server, and I connect to it using Enchanted on my Mac.



