Thursday, September 27, 2012

Micro-optimization, or why code runs slower on 64 bit

One of my friends works for a company doing video processing. He told me that the company's code runs more slowly on a 64 bit processor than on a 32 bit one. There can be a number of reasons for that. I would like to talk about the most important one.

The main problem is the size of the integer.

When your code runs on a 32 bit processor, you typically use int for all integer values. When you compile the same code for a 64 bit processor, int stays 32 bit.

  int i;

NOTE: The size of an int variable in C++ stays 32 bit whether the code is compiled for a 64 bit processor or for a 32 bit one.
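
You can check this yourself with a few sizeof calls (a minimal sketch; on a typical 64 bit Linux or Windows build, int stays 4 bytes while pointers and size_t grow to 8):

  #include <cstdio>

  int main()
  {
      // Typical 64 bit build: int is still 4 bytes, pointers and size_t are 8.
      printf("sizeof(int)    = %u\n", (unsigned)sizeof(int));
      printf("sizeof(void *) = %u\n", (unsigned)sizeof(void *));
      printf("sizeof(size_t) = %u\n", (unsigned)sizeof(size_t));
      return 0;
  }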

So, when you have the following code:

  unsigned char * data = new unsigned char[n];
  int i = 10;
  data[i] = 'a';

When the code is compiled and executed on a 64 bit processor, memory addressing is 64 bit too. As a result, a simple and fast operation like data[i], which is effectively offset_to_data + i, becomes offset_to_data + convert_to_64(i): the 32 bit int i must be widened to 64 bits before it can be used as an address offset.

As a result, when indexing memory, every int variable has to be widened to 64 bits, so extra instructions are executed each time you access memory!
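
Here is a minimal sketch of the difference (the function names are mine, just for illustration). With an int index the compiler generally has to emit a widening instruction (e.g. movsxd on x86_64) before the address calculation, while a size_t index is already pointer sized:

  #include <cstddef>

  // 32 bit index: on a 64 bit target the compiler generally has to
  // sign-extend i before it can be added to the 64 bit pointer.
  void fill_with_int(unsigned char *data, int n)
  {
      for (int i = 0; i < n; ++i)
          data[i] = 'a';   // offset_to_data + convert_to_64(i)
  }

  // Pointer-sized index: no widening is needed in the address calculation.
  void fill_with_size_t(unsigned char *data, size_t n)
  {
      for (size_t i = 0; i < n; ++i)
          data[i] = 'a';   // offset_to_data + i
  }

In practice an optimizing compiler can often hoist the extension out of a simple loop like this, so the cost shows up mostly in code with many independent int indexes.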


Solution.

The solution is very simple. Convert your code to use ssize_t instead of int and size_t instead of unsigned int.

When the code is compiled for a 32 bit processor, ssize_t and size_t are 32 bit.

When it is compiled for a 64 bit processor, they are 64 bit, matching the pointer size.
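
For example, a typical buffer loop would change like this (a sketch; count_zeros is an illustrative name, and note that ssize_t itself is a POSIX type - on MSVC the equivalent is SSIZE_T from <BaseTsd.h>):

  #include <cstddef>      // size_t
  #include <sys/types.h>  // ssize_t on Linux and other POSIX systems

  // Before: unsigned int for sizes and indexes.
  unsigned int count_zeros_old(const unsigned char *data, unsigned int n)
  {
      unsigned int count = 0;
      for (unsigned int i = 0; i < n; ++i)
          if (data[i] == 0)
              ++count;
      return count;
  }

  // After: size_t for sizes and indexes; use ssize_t where a signed
  // value is needed (for example to return -1 on error).
  size_t count_zeros(const unsigned char *data, size_t n)
  {
      size_t count = 0;
      for (size_t i = 0; i < n; ++i)
          if (data[i] == 0)
              ++count;
      return count;
  }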


Any problems?

One issue shows up in code that must compile on both Windows and Linux: printing these variables. The Microsoft and GNU runtimes use different printf length modifiers for size_t and ssize_t (MSVC uses the I modifier, while glibc supports the standard C99 z modifier).

Here is a solution:

#ifdef WIN32
#define SIZE_T_FORMAT "%Iu"
#define SSIZE_T_FORMAT "%Id"
#else
#define SIZE_T_FORMAT "%zu"
#define SSIZE_T_FORMAT "%zd"
#endif
 

ssize_t num = 1234;
printf("ssize_t number: " SSIZE_T_FORMAT "\n", num);


Feel free to leave comments or questions here.

3 comments:

  1. The most important reason? That's definitely not correct.

    It depends on the specific case (you might be correct in the end), but most of the performance impact of porting 32 bit code to 64 bit comes from the larger memory footprint of code and data, which degrades L1/L2/L3 cache performance - the cache size stays constant. Less code and data fit in the cache, so there are more memory references and thus worse performance.

    Actually, 64 bit code can perform much faster (on x86_64; I do not know about other architectures) due to the increased number of available registers and SSE as the default floating point unit.
    Regarding the special code for integer promotion - it is handled at compile time, when a value is loaded into a 64 bit register (though I might be wrong here; I just looked at some disassemblies...). Anyway, the performance impact of this is minor, if any.

    Regarding the last passage - of course there is a problem, and you missed it in the whole post: the same type can have a different bit length in different compilers!
    In MSVC long is 32 bit (even for x64), but for gcc it is 64 bit.

    So your advice to your friend should be to look closely at their code and not to blame the compiler or CPU. Their code is optimized for 32 bits, and merely porting it to 64 bits does not make it optimized for 64 bit.

  2. Thanks Pavel for the comments.

    An int variable stays 32 bit when compiled for the x64 architecture, both on Linux and on Windows.

    A long variable is a different story.

    Best regards,
    Yuli

  3. Interesting. Why not just switch the platform in the MSVS Configuration Manager from Win32 to x64 to generate 64 bit code?
