Let me know when you open source it; I think there is a place for this, and we could integrate it pretty easily as a plug-in to the LlamaFarm framework :)
This is super interesting! I'm the founder of Muna (https://docs.muna.ai) with much of the same underlying philosophy, but a different approach:
We're building a general-purpose compiler for Python. Once their code is compiled, developers can deploy across Android, iOS, Linux, macOS, Web (wasm), and Windows in as little as two lines of code.
Oh! Muna looks cool as well! I've just barely glanced at your docs page so far, but I'm definitely going to explore further. One of the biggest issues in the back of our minds is getting models running on a variety of hardware and platforms. Right now, we're just using Ollama with support for Lemonade coming soon. But both of these will likely require some manual setup before deploying LlamaFarm.
We should collab! We prefer to be the underlying infrastructure behind the scenes, and have a pretty holistic approach towards hardware coverage and performance optimization.
I'm founding a company that is building an AOT compiler for Python (Python -> C++ -> object code). It works by propagating type information through a Python function, and that type propagation is seeded by the type hints on the function being compiled:
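Roughly something like this (a minimal sketch only; the decorator here is a no-op stub so the snippet stands alone, while the real one drives type propagation and codegen):

```python
from collections.abc import Callable
from typing import TypeVar

F = TypeVar("F", bound=Callable)

def compile(fn: F) -> F:
    """No-op stand-in for the real @compile decorator, so this sketch
    runs on its own; the real one triggers compilation to C++."""
    return fn

@compile
def saxpy(a: float, x: list[float], y: list[float]) -> list[float]:
    # The hints on `a`, `x`, and `y` seed type propagation: every value in
    # the body gets a concrete type (float / list[float]), which is what
    # lets the whole function be lowered to C++ and then object code.
    return [a * xi + yi for xi, yi in zip(x, y)]
```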
Have you talked to anyone about where this flat out will not work? Obviously it will work in simple cases, but someone with a good understanding of the language will probably be able to point out cases where it just won't. I didn't read your blog, so apologies if this is covered. How does this compiler fit into your company's business plan?
Our primary use case is cross-platform AI inference (unsurprising), and for that use case we're already in production with customers ranging from startups to larger co's.
It's kind of funny: our compiler currently doesn't support classes, yet we support many kinds of AI models (vision, text generation, TTS). This is mainly because math, tensor, and AI libraries are almost always written in a functional style.
Business plan is simple: we charge per endpoint that downloads and executes the compiled binary. In the AI world, this removes a large multiplier in cost structure (paying per token). Beyond that, we help co's find, eval, deploy, and optimize models (more enterprise-y).
I understood some of it. Sounds reasonable if your market is already running a limited subset of the language, but I guess there is a lot of custom bullshit you actually wind up maintaining.
This sounds even worse than Modular/Mojo. They made their language look terrible by trying to make it look like Python, only to effectively admit that source compatibility will not really work any time soon. Is there any reason to believe that a different take on the same problem with stricter source compatibility will work out better?
We're building native code generation for AI developers. We generate high-performance C++/Rust to power open-source and on-device AI for our customers. We have customers ranging from early stage startups to the Fortune 1000.
You'll be:
1. Writing open-source Python functions that run popular vision models and LLMs; or
2. Writing high-performance C++ and Rust code that targets different accelerators (CUDA, Metal, etc.); or
3. Writing parts of our Python-to-C++ compiler in support of (1) and (2).
When we trace Python code, devs have to explicitly opt dependency modules in to tracing. Specifically, the `@compile` decorator has a `trace_modules` parameter which is a `list[types.ModuleType]`.
With this in place, when we trace through a dev's function, a given function call is considered a leaf node unless the function's containing module is in `trace_modules`. This covers the Python stdlib.
We then take all leaf nodes, look up their equivalent implementations in native code (written and validated by us), and use those native implementations for codegen.
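Roughly, it looks like this (a sketch: the decorator is again a no-op stub so the snippet is self-contained, and in a real project you'd pass your own helper modules, e.g. a hypothetical `my_preproc`, in `trace_modules`):

```python
import types
from collections.abc import Sequence

import numpy as np

def compile(*, trace_modules: Sequence[types.ModuleType] = ()):
    """No-op stand-in for the real @compile decorator so this sketch runs
    standalone; the real one records trace_modules and drives tracing."""
    def wrap(fn):
        return fn
    return wrap

# In practice you'd opt in your own modules, e.g. trace_modules=[my_preproc],
# so calls into them are traced through rather than treated as leaf nodes.
@compile(trace_modules=[])
def forward(x: np.ndarray) -> np.ndarray:
    # numpy is not in trace_modules, so np.tanh is a leaf node: at codegen
    # it's matched to a hand-written, validated native implementation.
    return np.tanh(x)
```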
The majority of the innovation here is in building enough rails (specifically around lowering Python's language features to native code) so that LLM codegen can help you transform any Python code into equivalent native code (C++ and Rust in our case).
I think a more pedantic way to describe what I mean is:
"What if we could compile Python into raw native code *without having a Python interpreter*?"
The key distinguishing feature of this compiler is being able to produce standalone, cross-platform native binaries from Python code. Numba falls back to the Python interpreter for code that it can't JIT-compile.
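A rough sketch of the contrast, using Numba's explicit object mode (the class and function names are just for illustration): code like this can't be lowered to pure native code, so even "compiled" it still runs through the CPython runtime rather than becoming a standalone binary.

```python
import numba

class Counter:
    """An ordinary Python class; Numba can't compile this to native code."""
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1

@numba.jit(forceobj=True)  # object mode: every operation still goes through
def bump_many(c, times):   # the CPython runtime, so there's no standalone binary
    for _ in range(times):
        c.bump()
    return c.n

counter = Counter()
print(bump_many(counter, 3))  # prints 3, but only with a Python interpreter present
```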