gcc actually can do this optimization, even for floating-point numbers. For example,
double foo(double a) {
    return a*a*a*a*a*a;
}
becomes
foo(double):
mulsd %xmm0, %xmm0
movapd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm1, %xmm0
ret
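with -O -funsafe-math-optimizations. Floating-point multiplication is not associative, so regrouping the product changes how the intermediate results are rounded. That violates IEEE-754, which is why the flag is required.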
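To make the regrouping concrete, here is a minimal sketch of the two evaluation orders written out by hand. The names pow6_seq and pow6_tree are mine, purely for illustration. The first follows the order the C source specifies, the second mirrors the three-multiply assembly above.

```c
/* Left-to-right product, as written in the source: five multiplies,
   with a rounding step after each one. */
double pow6_seq(double a) {
    return a * a * a * a * a * a;
}

/* Same grouping as the optimized assembly: a^2, then a^4 = a^2 * a^2,
   then a^4 * a^2. Three multiplies, but different intermediate roundings. */
double pow6_tree(double a) {
    double a2 = a * a;
    double a4 = a2 * a2;
    return a4 * a2;
}
```

Compiled without any fast-math flags, the two functions agree whenever every intermediate product is exactly representable (for example a = 2.0), but for other inputs they may differ in the last bits. Printing both with printf("%.17g\n", ...) for a few values is an easy way to check on your own machine.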
For signed integers, as Peter Cordes pointed out in a comment, gcc can do this optimization without -funsafe-math-optimizations: the regrouped product is exactly equal to the original whenever nothing overflows, and signed overflow is undefined behavior, so the compiler does not have to preserve any particular result in that case. So you get
foo(long):
movq %rdi, %rax
imulq %rdi, %rax
imulq %rdi, %rax
imulq %rax, %rax
ret
with just -O. For unsigned integers, it's even easier: unsigned arithmetic is defined modulo a power of 2, where multiplication is still associative, so the product can be regrouped freely even in the face of overflow.
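As a quick sanity check of the unsigned case, here is a small sketch (again, pow6_naive and pow6_fast are names I made up for illustration) that uses the same grouping as the integer assembly above. Both functions return identical values for every input, including ones where the product wraps around.

```c
#include <stdint.h>

/* Product in source order: five multiplies, all modulo 2^64. */
uint64_t pow6_naive(uint64_t a) {
    return a * a * a * a * a * a;
}

/* Grouping used by the integer assembly: a^2, a^3, then (a^3)^2.
   Wraparound is well defined for unsigned types, so this is always
   equal to pow6_naive, overflow or not. */
uint64_t pow6_fast(uint64_t a) {
    uint64_t a2 = a * a;
    uint64_t a3 = a2 * a;
    return a3 * a3;
}
```

For example, UINT64_MAX is congruent to -1 modulo 2^64, so both functions return 1 for it even though every intermediate product overflows.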