
The author struck out the part about CedarDB not being available -- which is true -- but Umbra has been available as a Docker container[1] for some time now. The "Umbra paper" linked also contains an artifact[2] with a copy of Umbra, as well as some instructions on how to control which back-ends are used, etc. (Cranelift is not available as an option in the Docker versions, however.)

I kind of disagree with the assumption that baseline compilers are easy to build (depending on your definition of baseline). A back-end like DirectEmit is not easy to write and even harder to verify. If you have a back-end written for your own IR, you will likely have to write tests in that IR, and it will probably be quite hard to simply port over codegen (or run-) tests from other compilers. Especially in the context of databases, it is not very reassuring to have a back-end that may explode the second you start to generate code slightly differently. We're working on making this a bit more commoditized but in our opinion you will always need to do some work since having another IR (with defined semantics someone could write a code generator for you) for a back-end is very expensive. In Umbra, translating Umbra-IR to LLVM-IR takes more time than compiling to machine code with DirectEmit.

Also, if it is easy to write them, I would expect to see more people write them.

Copy-and-patch was also tried in the context of MLIR[3], but the execution-time results were not that convincing, and I have been told that it is unlikely for register allocation to work sufficiently well to make a difference.

[1]: https://hub.docker.com/r/umbradb/umbra

[2]: https://zenodo.org/records/10357363

[3]: https://home.cit.tum.de/~engelke/pubs/2403-cc.pdf



> We're working on making this a bit more commoditized but in our opinion you will always need to do some work since having another IR (with defined semantics someone could write a code generator for you) for a back-end is very expensive.

I've been nipping at the edges of adapting the vmIDL from vmgen and using that to generate the machinery to do some jitting. But I'm slow and lazy...

The general idea is to have the hypothetical user define the operands of their IR along with the code snippets and use this to stitch together a jit compiler library. Or perhaps a wrapper around an existing jit library, dunno. Either way it gives me some yaks to shave which makes me happy.
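To make the "operands plus code snippets" idea concrete, here is a rough sketch of the stitching step only, in the copy-and-patch style. Everything in it is an assumption for illustration: the `Snippet` type, the `SNIPPETS` table, the `stitch` function, and the opcode bytes (which belong to an imaginary target, not real machine code) are all hypothetical, and this is not vmgen's or vmIDL's actual format.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    """A pre-built code fragment with zero-filled operand holes (hypothetical)."""
    template: bytes  # fragment bytes; holes are zeroed placeholders
    holes: tuple     # byte offsets of 4-byte little-endian operand holes


# Hypothetical user-supplied table: one snippet per IR operation.
# The opcode bytes are made up for an imaginary target.
SNIPPETS = {
    "push_const": Snippet(b"\xA0\x00\x00\x00\x00", (1,)),
    "add":        Snippet(b"\xB0", ()),
    "store_slot": Snippet(b"\xC0\x00\x00\x00\x00", (1,)),
}


def stitch(program):
    """Copy each op's template into the output, then patch its operands in."""
    out = bytearray()
    for op, *operands in program:
        snip = SNIPPETS[op]
        if len(operands) != len(snip.holes):
            raise ValueError(f"{op}: expected {len(snip.holes)} operand(s)")
        base = len(out)
        out += snip.template                      # the "copy" step
        for offset, value in zip(snip.holes, operands):
            struct.pack_into("<i", out, base + offset, value)  # the "patch" step
    return bytes(out)


code = stitch([
    ("push_const", 2),
    ("push_const", 40),
    ("add",),
    ("store_slot", 7),
])
print(code.hex(" "))
```

A real version would have to generate the templates from the user's snippets with an actual compiler and then map the result into executable memory, which is where most of the work (and the register-allocation problem mentioned upthread) lives.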



