My take is that arenas are very useful, less as a stack than as a way to allocate a bunch of memory piecemeal and then free it all at once. (The "pop" function in TFA is just not relevant to my use cases for arenas.)
For example, if you're decoding a certificate, or maybe something larger and more complex, a decoder might malloc() every little thing as it goes, which then necessitates free()ing each of those things when you are done with the whole decoded thing. But if you can have the decoder allocate from an arena, then when you're done using the decoded object you can just free the arena.
The decoding example is very common. Whether it's JSON, XML, ASN.1/DER/whatever, Protocol Buffers, FlatBuffers, or anything else, decoders routinely create a ton of garbage to collect. Optimizing that garbage collection seems like a useful thing to do, but it's hard to do in a memory-safe language, because every reference to a sub-object of the decoded object has to be dropped before the object's arena can be released. How would one handle this in Rust, C++, or Java?
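One common way around the sub-object reference problem in Rust is to hand out indices into the arena instead of references, so nothing borrows from the arena when it is dropped. A minimal sketch, assuming a toy `key=value;key=value` input format; `Arena`, `Field`, and `decode_pairs` are hypothetical names made up for illustration, not taken from any real decoder:

```rust
/// A decoded field. Key and value are (start, len) ranges into `Arena::bytes`,
/// so a `Field` holds no references and no heap allocations of its own.
#[derive(Debug, Clone, Copy)]
pub struct Field {
    key: (usize, usize),
    value: (usize, usize),
}

/// Hypothetical arena: everything the decoder produces lives in two buffers.
#[derive(Default)]
pub struct Arena {
    bytes: Vec<u8>,     // all key/value bytes, appended piecemeal
    fields: Vec<Field>, // all decoded fields, appended piecemeal
}

impl Arena {
    /// Copies `data` into the arena and returns its (start, len) range.
    fn alloc_bytes(&mut self, data: &[u8]) -> (usize, usize) {
        let start = self.bytes.len();
        self.bytes.extend_from_slice(data);
        (start, data.len())
    }

    /// Stores one field and returns its index, which acts as a handle.
    pub fn alloc_field(&mut self, key: &[u8], value: &[u8]) -> usize {
        let field = Field {
            key: self.alloc_bytes(key),
            value: self.alloc_bytes(value),
        };
        self.fields.push(field);
        self.fields.len() - 1
    }

    /// Looks up the key bytes of the field with the given index.
    pub fn key(&self, id: usize) -> &[u8] {
        let (start, len) = self.fields[id].key;
        &self.bytes[start..start + len]
    }

    /// Looks up the value bytes of the field with the given index.
    pub fn value(&self, id: usize) -> &[u8] {
        let (start, len) = self.fields[id].value;
        &self.bytes[start..start + len]
    }
}

/// Toy decoder: splits `k=v;k=v` input and stores every pair in `arena`.
/// Parts without an `=` are skipped. Returns the indices of the stored fields.
pub fn decode_pairs(input: &[u8], arena: &mut Arena) -> Vec<usize> {
    input
        .split(|&b| b == b';')
        .filter(|part| !part.is_empty())
        .filter_map(|part| {
            let eq = part.iter().position(|&b| b == b'=')?;
            Some(arena.alloc_field(&part[..eq], &part[eq + 1..]))
        })
        .collect()
}
```

Dropping the `Arena` releases its two backing buffers in one step, no matter how many fields were decoded. The trade-off is that an index is only meaningful alongside the arena that produced it, which the compiler does not check.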
> How would one handle this in Rust, C++, or Java?
In Rust you'd prefer to write the parser to slice the original memory and not copy out until you're done parsing. You can see this in, for instance, the signatures of methods in the httparse crate, which are parameterized over two lifetimes, 'b and 'h.
To translate, this means that you must provide a byte slice that lives at least as long as 'b and a mutable array of Headers that lives at least as long as 'h, and those headers may then reference data that lives as long as 'b (that is, the original bytes).
This way we avoid creating the garbage in the first place by demanding that the original allocation live long enough.
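The lifetime relationship described above can be sketched with a small self-contained parser. This is an illustration under stated assumptions, not httparse's actual API: the `Header` type, the `parse_headers` function, and the simplified `Name: value` line format are all hypothetical stand-ins.

```rust
/// Hypothetical header type: both fields are slices of the input buffer.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Header<'b> {
    pub name: &'b [u8],
    pub value: &'b [u8],
}

/// Strips leading and trailing ASCII whitespace without copying.
fn trim(mut s: &[u8]) -> &[u8] {
    while let [first, rest @ ..] = s {
        if first.is_ascii_whitespace() { s = rest; } else { break; }
    }
    while let [rest @ .., last] = s {
        if last.is_ascii_whitespace() { s = rest; } else { break; }
    }
    s
}

/// Parses `Name: value` lines from `src` into the caller-provided `dst`.
///
/// `'b` is the lifetime of the input bytes and `'h` the lifetime of the
/// header array; `'b: 'h` says the bytes must outlive the array that
/// points into them. Lines without a `:` are skipped, and parsing stops
/// once `dst` is full. Returns the filled prefix of `dst`.
pub fn parse_headers<'b: 'h, 'h>(
    src: &'b [u8],
    dst: &'h mut [Header<'b>],
) -> &'h [Header<'b>] {
    let mut count = 0;
    for line in src.split(|&b| b == b'\n') {
        if count == dst.len() {
            break;
        }
        if let Some(colon) = line.iter().position(|&b| b == b':') {
            dst[count] = Header {
                name: trim(&line[..colon]),
                value: trim(&line[colon + 1..]),
            };
            count += 1;
        }
    }
    &dst[..count]
}
```

Because every `Header<'b>` borrows from `src`, the compiler rejects any attempt to free or mutate the input buffer while the parsed headers are still in use, and the parse itself allocates nothing.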
It is odd to present arena allocation as a technique for C when it is most conveniently used in C++. C++'s Standard Library has numerous accommodations for this method, and the core language definition acknowledges as legitimate the construction of new objects on top of old objects whose destructors were never run. It has been used for as long as C++ has existed. Code using it is clean and maintainable.
I gather Rust is beginning to accumulate similar accommodations. It is already explicitly "safe" to seem to leak memory.
While it might be more conveniently used in C++, the use of arena allocators in C is ancient, and it can be pretty convenient even in C. The PostgreSQL code base, for example, makes heavy use of arena allocators.