E.g. for x * 5, gcc emits lea eax, [rdi+rdi*4].
So I guess a single lea then only works for multiplication by 2, 3, 4, 5, 8 or 9?
x * 6:
    lea eax, [rdi+rdi*2]
    add eax, eax
x * 7:
    lea eax, [0+rdi*8]
    sub eax, edi
x * 11:
    lea eax, [rdi+rdi*4]
    lea eax, [rdi+rax*2]
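To make the arithmetic behind those sequences explicit, here is a rough C sketch of the same decompositions. The function names (mul5, mul6, mul7, mul11) are made up for illustration, and uint32_t is my assumption so that overflow wraps the way the 32-bit registers do. Each shift corresponds to the scale factor in the lea addressing mode (1, 2, 4 or 8).

```c
#include <stdint.h>

/* x * 5: one lea, base + index*4  ->  x + 4x */
static inline uint32_t mul5(uint32_t x) {
    return x + (x << 2);
}

/* x * 6: lea computes 3x (x + 2x), then add eax, eax doubles it */
static inline uint32_t mul6(uint32_t x) {
    uint32_t t = x + (x << 1);
    return t + t;
}

/* x * 7: lea computes 8x (index*8, no base), then sub removes one x */
static inline uint32_t mul7(uint32_t x) {
    return (x << 3) - x;
}

/* x * 11: first lea gives 5x, second lea gives x + 2*(5x) */
static inline uint32_t mul11(uint32_t x) {
    uint32_t t = x + (x << 2);
    return x + (t << 1);
}
```

Whether a compiler actually picks these sequences over imul depends on the target and the optimization flags, so treat this as an illustration of the identities rather than a prediction of gcc's output.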
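And on modern CPUs integer multiplication might not actually be all that slow (typically a few cycles of latency, versus one for a simple lea), though there may be fewer execution units that can multiply than can handle lea or add.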