By Rainer Wichmann rainer@la-samhna.de (last update: Dec 29, 2011)
Benchmarking procedure
This work compares the speed of code generated by various versions of the GCC (GNU Compiler Collection) C compiler, ranging from 3.3.6 to 4.6.0. Also included are results for several versions of LLVM, ranging from 2.3 to 3.0. For versions below 3.0, the llvm-gcc-4.2 C frontend was used, while LLVM 3.0 was tested with the clang C frontend.
All compilers were built from the original sources, so there are no distribution-specific patches. GCC builds are bootstrap builds, i.e. the compiler was built with itself.
All benchmark programs were compiled with '-O3 -std=gnu99 -ffast-math'. For each benchmark/compiler combination, 30 executions were averaged.
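The procedure above can be sketched as a small driver script. This is only an illustration, not the harness actually used for these measurements: the function names, the wall-clock timing method, the reported standard deviation, and the `-lm` linker flag are all assumptions. Only the three compiler flags and the count of 30 runs come from the text.

```python
import statistics
import subprocess
import time

# Flags quoted in the text above.
CFLAGS = ["-O3", "-std=gnu99", "-ffast-math"]


def compile_command(compiler, source, output):
    """Build the argument list for compiling one benchmark.

    "-lm" (link the math library) is an assumption, not stated in the text.
    """
    return [compiler, *CFLAGS, source, "-o", output, "-lm"]


def run_binary(path):
    """Run one benchmark executable and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run([path], check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def average_runtime(run_once, runs=30):
    """Call run_once() `runs` times and return (mean, standard deviation)."""
    times = [run_once() for _ in range(runs)]
    spread = statistics.stdev(times) if runs > 1 else 0.0
    return statistics.mean(times), spread
```

With a compiled benchmark at a placeholder path such as `./almabench`, a call like `average_runtime(lambda: run_binary("./almabench"))` would give the mean over 30 executions. The standard deviation is returned as well because it indicates how much of a difference between two compilers could be measurement noise.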
The benchmarks used are a set of small benchmarking programs included in the Acovea optimization tool by Scott Robert Ladd. Each of these programs essentially benchmarks a single algorithm. In addition, a benchmark for the TIGER192 hash algorithm has been added; since most hash algorithms exercise similar (mostly bit-shuffling) operations, it is probably indicative of the compiler's performance for cryptographic hash functions in general.
- almabench: This benchmark calculates the daily ephemeris (at noon) for the years 2000-2099; results depend heavily on efficient floating-point math and functions such as sin, cos, and sqrt.
- evobench: This benchmark uses a genetic algorithm to minimize a 2D function.
- huffbench: This program implements the Huffman compression algorithm.
- linbench: Uses LUP-decomposition to solve a large linear equation.
- mat1bench: Nothing special here -- just a brute-force matrix multiply.
- tigerbench: The tiger192 cryptographic hash algorithm -- 32 bit reference implementation (not part of the benchmark suite by S. R. Ladd).
- treebench: A tree-generating benchmark.
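To give an idea of how small these kernels are, here is a sketch of a brute-force matrix multiply of the kind mat1bench is described as running. It is not the Acovea source, which is written in C; the function name and the list-of-rows representation are assumptions made for illustration.

```python
def matmul_bruteforce(a, b):
    """Multiply two matrices (lists of rows) with the plain triple loop."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            # Dot product of row i of a with column j of b.
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c
```

In C, the runtime of such a loop nest depends almost entirely on how well the compiler handles the innermost loop (unrolling, vectorization, memory access order), which is why so simple a program can still separate compiler versions.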
Results
Results are shown in the following graphs. The vertical axis is the execution time of the benchmark: lower is better, faster, more sexy.







Conclusions
It is interesting to see that the last version of the gcc 3.x series (3.4.6) still rivals or even outperforms the latest gcc 4.x compiler (4.6.0), except in two of the seven benchmarks: almabench shows a clear improvement with the start of the 4.x series, and huffbench improved a lot starting with 4.5.0. The other benchmarks have their ups and downs during the 4.x series, usually without any consistent trend; evobench in fact shows a clear performance drop at 4.5.0 and stays at that level afterwards.
Also, the LLVM compiler has clearly improved a lot over the last few years (though 3.0 shows a few remarkable regressions). LLVM 3.0 produces code that is as fast as or faster than gcc 4.6.0 code in four of the seven benchmarks.
One may ask how, or whether, this result applies to larger programs. However, a large program often spends almost all of its time in a small, computationally expensive part of the code. It is therefore crucial that the compiler delivers good code for a wide range of different algorithms.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Germany License.