RamboRogers's comments

I was hoping to find transforms in Rust; all I found was a Rust wrapper running Python.


I've made a major update to include HTTPS recording/proxying via H.264 embedded passthrough. It's a very secure way to do proxy browsing on a secure network.


Native macOS app, 2MB, just works!


Hey HN! I've been building MLX-GUI as an open-source inference server that turns any Mac into a multi-user AI server. v1.2.4 just shipped with some major additions:

- Complete Whisper ecosystem (99+ languages, word timestamps, any audio format)
- 23 embedding models across 13 families (E5, ModernBERT, Arctic, etc.)
- Mistral Small 24B with vision capabilities
- OpenAI-compatible API that's actually faster than Ollama on Apple Silicon

The goal was simple: I wanted to use my Mac Mini/Studio as a proper inference server without the complexity of managing Python environments or paying for cloud APIs, while keeping data local. It's packaged as a native macOS app (no Python install needed) with a beautiful web GUI for model management. The API is drop-in compatible with OpenAI, so existing apps like Jan.ai work immediately. 900+ lines of tests help ensure reliability.

GNU GPL v3 licensed and actively maintained. GitHub: https://github.com/RamboRogers/mlx-gui

Would love feedback from the community - especially on the embedding pipeline and audio processing!
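Since the API is OpenAI-compatible, existing client code should point straight at it. A minimal sketch, assuming a server on localhost port 8000 serving the standard /v1 routes; the port and model id below are hypothetical placeholders, not taken from the project docs:

```python
# Build an OpenAI-style chat completion request against a local
# OpenAI-compatible server (e.g. MLX-GUI). BASE_URL and the model
# name are assumptions; adjust them for your own setup.
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"   # hypothetical local server address

payload = {
    "model": "mistral-small-24b",       # hypothetical model id
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With a server running, send it and read the reply:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same shape works from the official OpenAI SDKs by setting their base URL to the local server, which is what makes apps like Jan.ai work without changes.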


It looks pretty cool. How is it different or better than LM Studio?


Multiple users, automatic model loading, automatic model unloading


LM Studio does automatic model loading and unloading, FYI.


It's single-user and doesn't queue transactions; a new transaction replaces the one in progress.


Working on a single 3090 Ti, or a 3060 with a 7B model.


Memory speed is way slower than a Mac Studio's; I'd be interested to see benchmarks.


Benchmarks of what? Memory speed matters for some things but not others. It matters a lot for AI training, but less for AI inference. It matters a lot for games too, but nobody would play a game on this or a Mac.


AI inference is actually typically bandwidth limited compared to training, which can re-use the weights for all <sequence length> * <batch size> tokens. Inference, specifically decoding, requires reading all of the weights for each new token, so the FLOPs per byte are much lower during inference!
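A rough back-of-the-envelope sketch of that point, assuming ~2 FLOPs per parameter per token (one multiply-accumulate) and fp16 weights (2 bytes/param); the sequence length and batch size are illustrative, not measured:

```python
# Arithmetic intensity (FLOPs per byte of weights read) for a transformer.
# Assumes 2 FLOPs per parameter per token and 2-byte (fp16) weights;
# illustrative numbers only.

def flops_per_byte(tokens_per_weight_read: int) -> float:
    flops_per_param = 2 * tokens_per_weight_read  # one MAC per param per token
    bytes_per_param = 2                           # fp16 weight
    return flops_per_param / bytes_per_param

# Decoding: weights are re-read for every single new token.
decode_intensity = flops_per_byte(tokens_per_weight_read=1)

# Training (or prefill): one weight read is amortized over
# sequence_length * batch_size tokens, e.g. 2048 * 8.
train_intensity = flops_per_byte(tokens_per_weight_read=2048 * 8)

print(decode_intensity)  # 1.0  -> decoding is memory-bandwidth bound
print(train_intensity)   # 16384.0 -> training is compute bound
```

With an intensity of ~1 FLOP/byte, decoding throughput is capped by memory bandwidth long before compute runs out, which is why unified-memory Macs punch above their weight for inference.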


A Flux GUI with MCP support for use with AI agents and home AI assistants, to protect your privacy and build a cool gallery!


I designed this for use at home, at conferences, and on public networks to share files and meet people. Double encryption with AES-256 and RSA public/private keys, and unlimited file sizes. It ships with the beautiful themes I've been using.


Complete upgrade with ZTNA Proxy and Tagging. Free PAM and access control. Save millions and prevent the breach! https://cyberpam.org/


I didn't know this was missing in my life. I might need to go to a real museum.

