I've made a major update to add HTTPS recording/proxying via embedded H.264 passthrough. It's a very secure way to do proxied browsing on a secure network.
Hey HN! I've been building MLX-GUI as an open-source inference server that turns any Mac into a multi-user AI server. v1.2.4 just shipped with some major additions:
- Complete Whisper ecosystem (99+ languages, word timestamps, any audio format)
- 23 embedding models across 13 families (E5, ModernBERT, Arctic, etc.)
- Mistral Small 24B with vision capabilities
- OpenAI-compatible API that's actually faster than Ollama on Apple Silicon
The goal was simple: I wanted to use my Mac Mini/Studio as proper inference servers without the complexity of managing Python environments or paying for cloud APIs while keeping data local.
It's packaged as a native macOS app (no Python install needed) with a beautiful web GUI for model management. The API is drop-in compatible with OpenAI, so existing apps like Jan.ai work immediately.
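As a sketch of what "drop-in compatible" means: any OpenAI-style client just needs its base URL pointed at the local server. The host, port, and model name below are illustrative assumptions, not documented MLX-GUI defaults.

```python
import json

BASE_URL = "http://localhost:8000/v1"  # hypothetical local endpoint

def chat_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-style /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = chat_request("mistral-small-24b", "Hello!")
print(json.dumps(body))
# POST this to f"{BASE_URL}/chat/completions", or point any OpenAI
# client library's base_url at BASE_URL and use it unchanged.
```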
900+ lines of tests ensure production reliability.
Benchmarks of what? Memory speed matters for some things but not others. It matters a lot for AI training, but less for AI inference. It matters a lot for games too, but nobody would play a game on this or a Mac.
AI inference is actually typically bandwidth-limited compared to training, which can reuse each weight across all <sequence length> * <batch size> tokens in a batch. Inference, specifically decoding, requires reading all of the weights for each new token, so the FLOPs per byte (arithmetic intensity) are much lower during inference!
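The bandwidth limit above can be put into numbers with a back-of-envelope sketch. The model size, precision, and bandwidth figures here are illustrative assumptions, not measurements:

```python
# Back-of-envelope: decode throughput is bounded by how fast the weights
# can be streamed from memory, since every weight is read once per token.

def max_decode_tokens_per_s(n_params: float, bytes_per_param: float,
                            mem_bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s when decoding reads all weights per token."""
    weight_bytes = n_params * bytes_per_param
    return (mem_bandwidth_gb_s * 1e9) / weight_bytes

# A 7B-parameter model in fp16 (2 bytes/param) on a machine with
# ~400 GB/s memory bandwidth (roughly M2 Max class, as an assumption):
bound = max_decode_tokens_per_s(7e9, 2, 400)
print(f"{bound:.1f} tokens/s upper bound")  # → 28.6 tokens/s upper bound
```

During training or prefill, that same 14 GB weight read is amortized over thousands of tokens at once, which is why those phases are compute-bound instead.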
I designed this for use at home, at conferences, and on public networks to share files and meet people. Files are double-encrypted with AES-256 plus an RSA public/private key pair, with no file-size limit. It also includes the beautiful themes I've been using.
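The AES-256 + RSA combination described above is typically the standard hybrid "envelope" scheme. Here is a minimal sketch of that pattern using the Python `cryptography` package; this is a generic illustration of the technique, not this app's actual implementation:

```python
import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Recipient's RSA key pair (in practice only the public key is shared).
priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pub = priv.public_key()

# 1. Encrypt the file with a fresh random AES-256 key (AES-GCM here).
aes_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(aes_key).encrypt(nonce, b"file contents", None)

# 2. Wrap the AES key with the recipient's RSA public key (OAEP padding).
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = pub.encrypt(aes_key, oaep)

# Recipient unwraps the AES key with their private key and decrypts.
recovered_key = priv.decrypt(wrapped_key, oaep)
plaintext = AESGCM(recovered_key).decrypt(nonce, ciphertext, None)
```

Because only the small AES key is RSA-encrypted, file size is unlimited; the bulk data goes through fast symmetric encryption.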