A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux

breadbox · on Jan 7, 2013

Wow. I honestly wasn't expecting to ever see this on the front page of HN again, given the current ubiquity of 64-bit Linux. (And yes, before anyone asks, I've played around with minimizing 64-bit executables. Unfortunately they are both larger and less forgiving of tomfoolery. The smallest 64-bit ELF I've created is 84 bytes.)

Since it is here, though, I want to take the opportunity to say thanks to everyone who's expressed their appreciation of my essay. And I should note here that writing that essay, so many years ago now, is one of the better things I've done for my career. Share what you have to learn the hard way; the effort won't be wasted.

richo · on Jan 7, 2013

Let me say thanks for writing it. This easily makes my top 5 pieces of text that anyone interested in really knowing what happens to their code needs to read at least once.

I probably go back over it once every 6 months and it always makes me smile.

80x86 · on Jan 7, 2013

another thank you from me as well. that was a great article!

sbierwagen · on Jan 6, 2013

Previously:

http://news.ycombinator.com/item?id=875077

http://news.ycombinator.com/item?id=68056

solarexplorer · on Jan 6, 2013

AFAIK even a file of 45 bytes takes up at least an inode and an entire data block on disk. So on most systems it doesn't matter if the file is 4096 bytes or any smaller. And if you need to send the file over the network, the source code still beats the binary in size...
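A rough sketch of that rounding in Python. It assumes a filesystem that only hands out whole blocks (no inline data or tail packing) and a 4096-byte block size; the helper names are made up for illustration, and the `st_blocks * 512` conversion assumes Linux's 512-byte accounting unit.

```python
import os
import tempfile

def rounded_allocation(size, block_size=4096):
    """Bytes consumed if data is stored only in whole blocks (assumed model)."""
    if size == 0:
        return 0
    return -(-size // block_size) * block_size  # ceiling division, then scale

def apparent_and_allocated(path):
    """Return (apparent size, bytes allocated) as reported by stat()."""
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units, whatever the block size is.
    return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    # Write a 45-byte file and compare what stat() reports with the model.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * 45)
    try:
        print(apparent_and_allocated(f.name))
    finally:
        os.unlink(f.name)
    print(rounded_allocation(45), rounded_allocation(4096))
```

Under this model a 45-byte file and a 4096-byte file cost the same single block; what `stat()` actually reports depends on the filesystem in use.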

0x0 · on Jan 6, 2013

Actually some file systems are capable of storing the file contents inside slack inode or directory entry space, for files that are very small.

solarexplorer · on Jan 6, 2013

True. There are also things like UFS fragments and ReiserFS tail packing etc, but does plain ext3 or ext4 implement any of these optimizations? I don't think so. So in the most common case nothing is gained by this optimization.

e12e · on Jan 7, 2013

It looks like support for in-line data in ext4 is coming to mainline in 3.8:

http://www.phoronix.com/scan.php?page=news_item&px=MTI1N...

As this has been discussed for quite some time, I actually thought it was already enabled in ext4 -- apparently not.

BTRFS supports "space efficient packing of small files" - by suballocating blocks. Not sure what the real minimum size on disk for a file is though -- but apparently you can approach header+<data> size both on btrfs and ext4 (and on BSD under UFS2 afaik).

eliben · on Jan 6, 2013

Upvoting it even though it's a pretty old piece and has appeared on HN before. A very inspirational piece of work, and a creatively fun way to learn about the ubiquitous ELF format.

richo · on Jan 6, 2013

Despite being a stickler for normally calling out reposts, I'm not sure I've seen this one on HN before, and it's a FASCINATING read, even if you're not into ASM.

xxpor · on Jan 6, 2013

"Upon return from a system call, eax will contain the return value. If an error occurs, eax will contain a negative value, with the absolute value indicating the error."

I've always wondered about this. Why do errors always return as negative?

bediger4000 · on Jan 6, 2013

If I recall, DEC's VMS operating system had an error return convention where the high bit was set on error, and the other bits meant different things: system the error occurred in, specific error, etc.

I believe it was a tradition to have a high bit indicate error at the time Unix was originated.

Hoff · on Jan 6, 2013

OpenVMS (née VMS) function calls and signals use the low bit set to indicate success, and the low bit clear to indicate failure; this is the condition value format.

The high bits of the condition value format can also tell you more about what component returned the error, and there were a few control flags up in the highest bits, including bit 31 (the sign bit) which inhibited message display.

Somewhat unusually, the low three bits of the condition status value are used to indicate varying levels of success or failure; the success (0b001), informational (0b011), warning (0b000), error (0b010) and severe/fatal (0b100) errors.

VAX had the BLBS and BLBC instructions for testing and branching on the low bit, too.

These success- and failure-testing instructions were separate from the BEQL/BEQLU, BNEQ/BNEQU, BGTR/BGTRU, BGEQ/BGEQU, BLSS/BLSSU and BLEQ/BLEQU that looked at whether the signed or unsigned value was positive, zero or negative, or zero or greater, or zero or less. A condition value that had been displayed or otherwise inhibited was switched to a negative value.

Yes, VAX had a lot of instructions.

This usage of the low bit also led to the semi-inevitable "Success on VMS being odd" in-joke.
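For anyone who wants to poke at the bit layout described above, here is a minimal Python sketch. It only models the low three severity bits and the low-bit success test; the function names and the sample values are illustrative assumptions, not anything from the VMS headers.

```python
# Severity codes in the low three bits, as listed in the comment above.
SEVERITY_NAMES = {
    0b000: "warning",
    0b001: "success",
    0b010: "error",
    0b011: "informational",
    0b100: "severe",
}

def is_success(condition_value):
    """Low bit set means success (the test a low-bit-set branch performs)."""
    return (condition_value & 1) == 1

def severity(condition_value):
    """Name the severity field; codes not listed above are reported as reserved."""
    return SEVERITY_NAMES.get(condition_value & 0b111, "reserved")

if __name__ == "__main__":
    for value in (0x1, 0x3, 0x0, 0x2, 0x2C):  # sample values, chosen arbitrarily
        print(hex(value), severity(value), is_success(value))
```

Note that both odd severities (success and informational) pass the low-bit test, which is the whole point of the "success is odd" joke.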

solarexplorer · on Jan 6, 2013

Because positive values contain the actual return value, e.g. the number of bytes read, etc.
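As a sketch of how that convention can be decoded, here is a small Python helper that treats a raw 32-bit register value the way the quoted passage describes. The function name is hypothetical; the 4095 cutoff reflects the range Linux reserves for error codes, which is why a large unsigned result such as a mapped address is not mistaken for an error.

```python
import errno

MAX_ERRNO = 4095  # Linux reserves -1..-4095 for error returns

def decode_eax(raw, bits=32):
    """Interpret a raw register value as a Linux system call result.

    Returns ("error", errno_number) or ("ok", unsigned_value).
    """
    raw &= (1 << bits) - 1  # keep only the register's bits
    # Reinterpret as two's complement if the sign bit is set.
    signed = raw - (1 << bits) if raw >> (bits - 1) else raw
    if -MAX_ERRNO <= signed < 0:
        return ("error", -signed)
    return ("ok", raw)

def describe(raw):
    """Human-readable form, using the errno names known to this platform."""
    kind, value = decode_eax(raw)
    if kind == "error":
        return "error " + errno.errorcode.get(value, "unknown")
    return "ok " + str(value)

if __name__ == "__main__":
    for raw in (12, 0xFFFFFFFE, 0xB7700000):  # sample values, chosen arbitrarily
        print(hex(raw), describe(raw))
```

A small positive value is an ordinary result such as a byte count, and only the narrow band just below zero is read as a negated error number.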