reach-vb's comments

reach-vb · 2025-07-09T09:11:40 1752052300

Hey Simon, VB from Hugging Face here and the person who added the model to MLX and llama.cpp (with Son). The PR hasn't yet landed on llama.cpp, hence it doesn't work OTB on llama.cpp installed via brew (similarly doesn't work with ollama since they need to bump their llama.cpp runtime)

The easiest would be to install llama.cpp from source: https://github.com/ggml-org/llama.cpp

If you want to avoid it, I added SmolLM3 to MLX-LM as well:

You can run it via `mlx_lm.chat --model "mlx-community/SmolLM3-3B-bf16"`

(requires the latest mlx-lm to be installed)

here's the MLX-lm PR if you're interested: https://github.com/ml-explore/mlx-lm/pull/272

similarly, llama.cpp here: https://github.com/ggml-org/llama.cpp/pull/14581

Let me know if you face any issues!

kosolam · 2025-07-09T11:33:28 1752060808

Could you please enlighten me regarding all these engines, I’m using lamacpp and ollama. Should I also try mlx, onnx, vllm, etc. I’m not quite sure whats the difference between all these. I’m running on CPU and sometimes GPU

pzo · 2025-07-09T15:00:29 1752073229

Ollama is a wrapper around llama.cpp thei using ggml format. Onnx is different ml model format and onnxruntime developer by microsoft. Mlx is ml framework from Apple. If you want the fastest speed on MacOS most likely stick with mlx

knowaveragejoe · 2025-07-10T23:25:43 1752189943

> similarly doesn't work with ollama since they need to bump their llama.cpp runtime

Just curious, how frequently does that happen?

reach-vb · 2025-05-27T17:25:51 1748366751

Nice! that's very cool!

PieterBecking · 2025-05-28T07:15:39 1748416539

Thanks!

reach-vb · 2025-02-07T16:41:56 1738946516

Brilliant job! Love how fast it is, I'm sure if the rapid pace of speech ML continues we'll have Speech to Speech models directly running in our browser!

dust42 · 2025-02-07T20:36:10 1738960570

It's already there, Hibiki by Kyutai.org was released yesterday with speech to speech, french to english on Iphone:

https://x.com/neilzegh/status/1887498102455869775

https://github.com/kyutai-labs/hibiki