This blog posting talks about machine code and whether 32-bit code or 64-bit code is more appropriate. When AMD released the Opteron way back in 2003, it was the first processor to implement the x86-64 instruction set, which added support for 64-bit code whilst maintaining backward compatibility with the old IA-32 code. In other words, the Opteron could run both 32-bit code and 64-bit code.
Everybody leaped onto the 64-bit bandwagon without thinking too much about it – it was faster.
But if we look at the other processors that made the transition from 32-bit code to 64-bit code (SPARC, MIPS, etc.), we find something interesting. Much of the code running on the relevant operating systems remained 32-bit – not as a transitional measure, but because the 32-bit code was faster. On a relatively modern Solaris system, /bin contains 500-odd binaries that are 32-bit and just 11 that are 64-bit (most of which aren’t in fact part of Solaris but come from an add-on package).
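If you want to repeat that count on your own machine, something like the following should do it. This is only a rough sketch: it assumes the binaries are ELF files (true of Solaris and Linux), and the function names `elf_class` and `count_elf_classes` are my own invention for illustration. It relies on the fifth byte of an ELF file, which is 1 for a 32-bit object and 2 for a 64-bit one.

```python
import os
import sys
from typing import Dict, Optional

ELF_MAGIC = b"\x7fELF"
# e_ident[EI_CLASS] is the fifth byte of an ELF file: 1 = 32-bit, 2 = 64-bit.
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class(header: bytes) -> Optional[str]:
    """Return "32-bit" or "64-bit" for an ELF header, or None for anything else."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None  # scripts, data files, truncated files
    return ELF_CLASSES.get(header[4])


def count_elf_classes(directory: str) -> Dict[str, int]:
    """Count the regular files in `directory` by ELF class (hypothetical helper)."""
    counts = {"32-bit": 0, "64-bit": 0}
    for entry in os.scandir(directory):
        try:
            if not entry.is_file():
                continue
            with open(entry.path, "rb") as f:
                kind = elf_class(f.read(5))
        except OSError:
            continue  # unreadable or vanished file: skip it
        if kind is not None:
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    # Defaults to /bin; pass another directory as the first argument.
    print(count_elf_classes(sys.argv[1] if len(sys.argv) > 1 else "/bin"))
```

Shell scripts and other non-ELF files are simply ignored. Note that the ELF class only tells you the width of the file format, so a binary built with 32-bit pointers for a 64-bit processor would also be counted as 32-bit here.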
It turns out that in general, 64-bit code is slower than 32-bit code. In the case of x86-64, 64-bit code is faster not because it is 64-bit, but because of the architectural changes that were introduced at the same time – including (probably most significantly) extra registers: x86-64 has 16 general-purpose registers where IA-32 has 8.
How do we know this? Apart from it being obvious to those who have lived through the 32-bit to 64-bit transition multiple times, people have been experimenting. The x86-64 architecture does allow code with 32-bit pointers to be run with all the other features of the x86-64 architecture, and that ABI has been labelled x32.
x32 code can be anywhere from 5% to 40% faster than the equivalent 64-bit code. The largest increases come from code that makes very heavy use of pointers; at present no benchmarks of “ordinary” software have been released.
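A likely reason pointer-heavy code benefits most is simply that halving the size of every pointer shrinks the data structures, so more of them fit into the processor's cache. Here is a back-of-the-envelope sketch of that arithmetic. The node layout (two pointers plus a 4-byte key), the 64-byte cache line, and the rule that each field is aligned to its own size are all my assumptions for illustration rather than measurements of any real program.

```python
def struct_size(field_sizes):
    """Size of a C-style struct whose fields are each aligned to their own size."""
    offset = 0
    max_align = 1
    for size in field_sizes:
        offset = (offset + size - 1) // size * size  # pad up to this field's alignment
        offset += size
        max_align = max(max_align, size)
    # Pad the tail so that consecutive array elements stay aligned.
    return (offset + max_align - 1) // max_align * max_align


def tree_node_size(pointer_size):
    """Hypothetical binary-tree node: left and right pointers plus a 4-byte key."""
    return struct_size([pointer_size, pointer_size, 4])


if __name__ == "__main__":
    CACHE_LINE = 64  # bytes; assumed here as a typical cache line size
    for label, pointer_size in (("64-bit pointers", 8), ("32-bit pointers", 4)):
        size = tree_node_size(pointer_size)
        print(f"{label}: {size} bytes per node, "
              f"{CACHE_LINE // size} whole nodes per cache line")
```

Under these assumptions the node shrinks from 24 bytes to 12, so a cache line holds five whole nodes instead of two. Real programs will of course vary, which is why proper benchmarks matter.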
The “downside” of x32 is of course that each program is limited to 4 Gbytes of memory (the most that a 32-bit pointer can address), but most programs don’t need anywhere near that much. Forget that huge video editor you’re playing with – that will quite possibly need 64-bit pointers – but what about all the other software running on your machine?
There are over 400 processes running on my workstation, and none of those processes really requires more than 4 Gbytes of memory. Sure, I run software that does require more than 4 Gbytes of memory, but not all the time.
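For anyone who wants to check their own machine, here is a rough sketch that counts the processes whose address space is larger than 4 Gbytes. It assumes a Linux system with /proc mounted, and the helper names `vm_size_kb` and `count_processes` are made up for this example. It reads the `VmSize` line from each /proc/<pid>/status file, which is reported in kB.

```python
import os

X32_LIMIT_KB = 2**32 // 1024  # 4 Gbytes of address space, expressed in kB


def vm_size_kb(status_text):
    """Extract the VmSize value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmSize line


def count_processes(proc_dir="/proc"):
    """Return (total, over_limit) for the user processes visible in proc_dir."""
    total = over = 0
    for name in os.listdir(proc_dir):
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_dir, name, "status")) as f:
                size = vm_size_kb(f.read())
        except OSError:
            continue  # process exited or is not readable
        if size is None:
            continue
        total += 1
        if size > X32_LIMIT_KB:
            over += 1
    return total, over


if __name__ == "__main__":
    total, over = count_processes()
    print(f"{over} of {total} processes have more than 4 Gbytes of address space")
```

Treat the result as an upper bound: `VmSize` counts address space that has been reserved but never touched, and some 64-bit programs reserve huge ranges simply because they can, so a process over the limit here does not necessarily need that much memory.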
And running things 10% quicker would be useful … or alternatively, running things quicker means the processor can spend more time asleep, making battery life longer.