64 bit CPUs are significantly faster at big integer arithmetic than 32 bit CPUs. In my experience the speedup is a factor of 2 with identical code and a factor of 4 with specialized code.
In code written with x86 in mind, many intermediate values have 64 bits. For example, if you multiply two 32 bit integers you get a 64 bit result, which then needs to be added, shifted, and finally split back into 32 bit integers.
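As a rough illustration of that multiply, add, shift and split pattern, here is a minimal sketch of multiplying a multi-word integer by a single 32 bit word. The function name `mul_1_limb32` and the least-significant-word-first layout are my assumptions for the example, not something taken from a particular library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply the n-word number a (least significant
   word first) by the 32 bit word b. Writes n words to r and returns the
   word carried out of the top. */
uint32_t mul_1_limb32(uint32_t *r, const uint32_t *a, size_t n, uint32_t b)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 32x32->64 multiply plus the previous carry. The sum cannot
           overflow: (2^32-1)^2 + (2^32-1) < 2^64. */
        uint64_t t = (uint64_t)a[i] * b + carry;
        r[i] = (uint32_t)t;   /* split: low 32 bits are the result word */
        carry = t >> 32;      /* shift: high 32 bits carry into the next word */
    }
    return (uint32_t)carry;
}
```

Every iteration has a 64 bit intermediate `t`, which is exactly the kind of value a 32 bit target has to keep in two registers.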
AMD64 (64 bit) CPUs have larger registers, and more of them, than x86 (32 bit) CPUs. So these intermediate values fit into a single register, and the compiler doesn't need to stitch together two 32 bit registers to give the appearance of 64 bit integers in C. The additional registers also mean values need to be spilled to the stack less often.
This improves the performance of such code about twofold compared with the same CPU running in 32 bit mode.
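To give a feel for the stitching a 32 bit target has to do, here is a sketch of a 64 bit addition written out by hand on two 32 bit halves. The type `u64_pair` and the function `add64_via_32` are hypothetical names for illustration; a real compiler does this with an add followed by an add-with-carry rather than a comparison, but the amount of extra work is similar.

```c
#include <stdint.h>

/* Hypothetical representation of a 64 bit value as two 32 bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Add two 64 bit values using only 32 bit operations. */
u64_pair add64_via_32(u64_pair x, u64_pair y)
{
    u64_pair r;
    r.lo = x.lo + y.lo;                 /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < x.lo);     /* 1 if the low half wrapped */
    r.hi = x.hi + y.hi + carry;         /* propagate the carry upward */
    return r;
}
```

On a 64 bit target the same addition is a single instruction on a single register.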
Another important difference is that AMD64 (64 bit) supports a 64x64->128 bit multiplication, whereas x86 (32 bit) only supports a 32x32->64 bit multiplication. The big multiplication is about twice as expensive, but does four times as much work.
This results in another factor of 2 speedup if you write code that uses 128 bit integers to hold intermediate values.
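A minimal sketch of what such specialized code might look like, assuming a compiler that provides the `unsigned __int128` extension (GCC and Clang do on 64 bit targets; it is not standard C). The function name `mul_1_limb64` and the least-significant-word-first layout are again just assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply the n-word number a (least significant
   word first, 64 bit words) by the 64 bit word b. Writes n words to r
   and returns the word carried out of the top.
   Assumes the unsigned __int128 compiler extension is available. */
uint64_t mul_1_limb64(uint64_t *r, const uint64_t *a, size_t n, uint64_t b)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 64x64->128 multiply plus the previous carry. The sum cannot
           overflow: (2^64-1)^2 + (2^64-1) < 2^128. */
        unsigned __int128 t = (unsigned __int128)a[i] * b + carry;
        r[i] = (uint64_t)t;            /* low 64 bits are the result word */
        carry = (uint64_t)(t >> 64);   /* high 64 bits carry onward */
    }
    return carry;
}
```

For the same number of bits, this loop runs half as many iterations as a 32 bit word version, and each iteration handles a 64x64 product instead of a 32x32 one.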