Different implementations of a simple Collatz iterator and their performance
Revision as of 02:10, 3 January 2009
The Collatz conjecture was proposed by Lothar Collatz in 1937. The conjecture is also known as the 3n + 1 conjecture.
The procedure is: if n is divisible by two, divide it by two; otherwise multiply it by 3 and add 1. Repeat until n reaches 1. The conjecture, still unproven, is that the procedure reaches 1 for every positive starting value of n.
The benchmark times the iteration of the first 2^26 values of n.
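The procedure above can be sketched in C++ as follows. This is only a minimal illustration, not the code that produced the timings on this page: the function names `collatz_steps` and `total_steps` are hypothetical, and the sketch assumes that n is at least 1 and that all intermediate values fit in an unsigned 64 bit integer.

```cpp
#include <cstdint>

// Number of steps needed for n to reach 1 under the Collatz rule.
// Assumption: n >= 1 and every intermediate value fits in 64 bits.
std::uint64_t collatz_steps(std::uint64_t n) {
    std::uint64_t steps = 0;
    while (n != 1) {
        // Even: halve. Odd: multiply by 3 and add 1.
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Sum of the step counts for all starting values 1..limit.
// A benchmark in the spirit of this page would time a call like this.
std::uint64_t total_steps(std::uint64_t limit) {
    std::uint64_t total = 0;
    for (std::uint64_t n = 1; n <= limit; ++n) {
        total += collatz_steps(n);
    }
    return total;
}
```

Calling `total_steps` with a small limit first is a cheap way to check the logic before timing a large range. Returning the accumulated total also keeps an optimizing compiler from discarding the loop as dead code.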
Visual C++
Vista 64, 2.4 GHz core 2 Q6600
- 22.1s - 64 bit exe, one thread, unrolled 1000 times
- 30.5s - 64 bit exe, one thread, unrolled 10 times
- 45.9s - 32 bit exe, one thread, unrolled 1000 times
- 49.2s - 64 bit exe, one thread, not unrolled
- 53.3s - 32 bit exe, one thread, unrolled 10 times
- 76.1s - 32 bit exe, one thread, not unrolled
Vista 32, 2.66 GHz core 2 E6750
- 41.3s - 32 bit exe, one thread, unrolled 1000 times
- 47.7s - 32 bit exe, one thread, unrolled 10 times
- 65.6s - 32 bit exe, one thread, not unrolled
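The page does not say how the unrolled versions were written. One plausible reading is that the body of the inner iteration loop was repeated several times per loop pass, so that fewer loop jumps are executed. The sketch below shows that idea with a factor of 4 instead of 10 or 1000 to keep it short; `collatz_next` and `collatz_steps_unrolled4` are hypothetical names, and the same assumptions apply as for any 64 bit implementation (n is at least 1, no overflow).

```cpp
#include <cstdint>

// One Collatz step: halve even values, map odd values to 3n + 1.
inline std::uint64_t collatz_next(std::uint64_t n) {
    return (n & 1) ? 3 * n + 1 : n >> 1;
}

// Step count with the loop body repeated 4 times per pass.
// Each copy still tests for n == 1, so the result is identical
// to a plain one-step-per-pass loop.
std::uint64_t collatz_steps_unrolled4(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (;;) {
        if (n == 1) return steps;
        n = collatz_next(n);
        if (n == 1) return steps + 1;
        n = collatz_next(n);
        if (n == 1) return steps + 2;
        n = collatz_next(n);
        if (n == 1) return steps + 3;
        n = collatz_next(n);
        steps += 4;  // four steps done, start the next pass
    }
}
```

Whether this kind of unrolling helps depends on the compiler and CPU, so it is worth timing both variants rather than assuming a speedup.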
Visual Basic 2008
The code is compiled to Common Intermediate Language (CIL), which the .NET just-in-time (JIT) compiler turns into machine code at run time. The advantage is that the JIT compiler can optimize the code for the specific CPU it runs on. In this case the lower-clocked CPU is the faster of the two, because on the 64 bit system the JIT compiler can use 64 bit registers. The code should also run unmodified on Linux with Mono installed.
Vista 64, 2.4 GHz core 2 Q6600
- 55.4s - .net exe, one thread
Vista 32, 2.66 GHz core 2 E6750
- 66.4s - .net exe, one thread
- -- .net exe, two threads
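The remark about 64 bit registers can be made concrete by looking at how large the intermediate values become. The page does not state which integer width the benchmark programs use, so the following C++ sketch is only an assumption-laden illustration: `collatz_peak` and `fits_in_32_bits` are hypothetical helpers that report the largest value reached from a starting value and whether that value would still fit in a 32 bit unsigned integer.

```cpp
#include <cstdint>
#include <limits>

// Largest value visited while iterating n down to 1.
// Assumption: n >= 1 and the peak itself fits in 64 bits.
std::uint64_t collatz_peak(std::uint64_t n) {
    std::uint64_t peak = n;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        if (n > peak) peak = n;
    }
    return peak;
}

// True if value can be stored in an unsigned 32 bit integer.
bool fits_in_32_bits(std::uint64_t value) {
    return value <= std::numeric_limits<std::uint32_t>::max();
}
```

For example, starting from 27 the sequence climbs to 9232 before falling to 1. Running `collatz_peak` over a larger range shows which starting values need more than 32 bits, in which case 32 bit code has to emulate wider arithmetic with register pairs.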
x86 assembly language
Because modern C++ compilers were suspected of generating low-quality 32 bit code, a hand-crafted assembly version was written for comparison.
Vista 32, 2.66 GHz core 2 E6750
- 55.2s - optimized, not unrolled
- 26.4s - optimized, unrolled 10 times
CUDA
GeForce 1.625 GHz GTS 512
- 2.3s - 128 blocks x 256 threads