More Microbenchmarking

[This is a follow-up on a previous entry.]
I did some more micro-benchmarking, adding four more languages to the list: Perl, Lisp, Dylan and Scheme.
To all critics of microbenchmarking: I think these benchmarks yield less disputable numbers than many “real-world” benchmarks. The LOOP and the CALL benchmarks mark absolute upper limits: you will not be able to execute more statements of the language, no matter how hard you try. This means: if you have a “slow” language, your statements had better be powerful. Or the other way around: an assembler instruction might not be very powerful, but you have a lot of them to waste.
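The original benchmark sources are linked at the bottom, not shown here; a minimal Python sketch of what a LOOP and a CALL benchmark measure might look like this (function names and loop counts are my own, not the original code):

```python
import time

def bench_loop(n):
    """LOOP: measure how many empty loop iterations run per second."""
    start = time.perf_counter()
    i = 0
    while i < n:
        i += 1
    return n / (time.perf_counter() - start)

def noop():
    """A trivial function; the CALL benchmark measures its invocation cost."""
    return None

def bench_call(n):
    """CALL: measure how many trivial function calls run per second."""
    start = time.perf_counter()
    i = 0
    while i < n:
        noop()
        i += 1
    return n / (time.perf_counter() - start)

if __name__ == "__main__":
    print(f"LOOP: {bench_loop(1_000_000):,.0f} iterations/s")
    print(f"CALL: {bench_call(1_000_000):,.0f} calls/s")
```

The CALL figure is always below the LOOP figure, since each call rides on top of the loop overhead.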

I am also aware that I am actually not benchmarking the languages themselves, but particular implementations. In theory, any language might be much faster, but in practice you get what you see here. And for some scripting languages like Python, Perl or Ruby there are no real alternative implementations. Where I had a choice, I tried to use mature implementations whose description said something like “native x86 code generation” or contained other performance-promising hints.

My primary intention was just to explore the performance limits of some languages, but in the process I gathered some interesting first-hand experience with all of them, especially about learning curves, documentation and library support. This is surprising, given the triviality of my three benchmarks, but they proved more demanding than I initially thought.

I also updated and reformatted the table of the benchmarks on Windows (XP SP2, VS .NET 2003), and added Perl and Scheme to the list. I changed one thing for optimized C++: the Microsoft compiler actually replaces both loops with an arithmetic calculation of the result, so the execution time is constant for any value of n, making these loops “infinitely” fast. (The actual value would be about 1 ns for four billion “loop cycles”, or on the order of 10^18 per second.)
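The transformation the compiler performs can be reproduced by hand: a summation loop has a closed-form result, so an optimizer that recognizes the pattern can replace the O(n) loop with O(1) arithmetic. A sketch in Python (my own illustration, not the benchmark code):

```python
def sum_loop(n):
    # The naive loop: O(n) additions, time grows with n.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_closed_form(n):
    # What an optimizing compiler can substitute: Gauss's formula, O(1),
    # constant time for any n.
    return n * (n + 1) // 2

assert sum_loop(10_000) == sum_closed_form(10_000)  # same result either way
```

Once the loop is folded into a formula, the measured “loops per second” becomes meaningless, which is why the C++ OPT row below reads infinite.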


Windows (XP SP2, VS .NET 2003):

Language                              LOOP         CALL         3rd
Scheme MIT Rel. 7.7.1                 730000       560000       8000
Ruby 1.8.0                            2.7 Mio.     1.4 Mio.     12000
JavaScript                            3.1 Mio.     1.8 Mio.     20000
Python 2.3                            4.5 Mio.     2.1 Mio.     52000
Perl 5.8.0                            8.5 Mio.     1.4 Mio.     23000
Lua                                   8.7 Mio.     4.3 Mio.     23000
Scheme MIT Rel. 7.7.1 (compiled)      3-60 Mio.    3.2-70 Mio.
Spike-A                               30 Mio.      6.5 Mio.     n.a.
Spike-B                               112 Mio.     19 Mio.      2.5 Mio.
Java -Xint (Sun 1.5.0_04-b05)         87 Mio.      22 Mio.      160000
Java JITC (Sun 1.5.0_04-b05)          260 Mio.     520 Mio.     2.3 Mio.
C++ DBG                               270 Mio.     11 Mio.(!)   4.2 Mio.
x86-Asm                               2100 Mio.    n.n.         n.n.
C++ OPT                               #inf         #inf         14 Mio.

In order to use some languages (especially Lisp and Dylan) that are only well supported under Unix, I also ran the benchmarks on Linux (Ubuntu Hoary Hedgehog). There are some significant differences: Ruby is much faster under Windows, while Perl and Java are faster under Linux. This is probably due to compiler and library differences.


Linux (Ubuntu Hoary Hedgehog):

Language                              LOOP         CALL         3rd
Lisp CMU-CL rel. 19a (interpreted)    330000       220000       4900
Scheme (MIT Rel. 7.7.90)              700000       690000       9000
Ruby 1.8                              1.0 Mio.     1.0 Mio.     15000
Python 2.4.1                          3.6 Mio.     1.8 Mio.     48000
Lua 5.0.2                             5.7 Mio.     3.3 Mio.     23000
Perl 5.8.4                            9.5 Mio.     1.5 Mio.     24000
Java -Xint (Sun 1.5.0_04-b05)         53 Mio.      24 Mio.      220000
Lisp CMU-CL rel. 19a (compiled)       87 Mio.      68 Mio.      76000
C++ gcc 3.3                           300 Mio.     160 Mio.     7.3 Mio.
Java JITC (Sun 1.5.0_04-b05)          430 Mio.     500 Mio.     2.1 Mio.
Dylan (Gwydion d2c 2.4.0RC3)          1000 Mio.    1000 Mio.    27000(!)
C++ gcc 3.3 -O2                       2100 Mio.    2100 Mio.    14 Mio.

So how do the newly added old functional programming languages fare? The short version: Lisp is still unusably slow when interpreted, compiled Lisp is slightly faster than interpreted Java p-code, Dylan is both fast and slow, and optimized C++ runs at assembler speed.

Now some remarks on how the new languages came along. Installation went smoothly under Ubuntu; I used Ubuntu's package manager for all installs.

Perl: the tight loop even beats Lua; everything else is not remarkable. It just worked.

With some help from a Lisp programmer the port was much easier than if I had had to do it all on my own, but even then it was harder than most other ports I did. The syntax is not as bad as it seems when coming from C-style languages, but the sheer number of macros and library functions is overwhelming, in a bad sense. It looks like it might take decades to become familiar with all that stuff, and even bright people with years of experience often have to consult the documentation, which reminds me of “Modern” C++. Lisp's simple syntax seems to make complicated expressions hard, and the lack of a special array access syntax (e.g. another bracket type) does not help. I am aware that one could easily write a macro that implements it, but arrays are so commonplace and basic that IMO they deserve special syntactic treatment.

At first glance, the Lisp interpreter I used felt somewhat sluggish. The main effect of running a Lisp program seems to be memory littering: tons of garbage are piled up that take ages to collect. Compiled Lisp is much better here, but the compiler-generated native x86 code only reaches the speed of interpreted Java p-code. This is an overhead factor of 20 to 30 compared to C++.

My summary on the first impression of Lisp: it all sounds great in theory, but in practice Lisp is a general-purpose programming language that is only fit for special purposes, at least on today's computers. Even with compiled Lisp you pay a severe performance penalty, and interpreted Lisp performance is out of the question. And I have serious doubts that after fifty years of massage by some of the brightest brains on the planet, there is much undiscovered potential left in this language. But I can definitely recommend playing around with Lisp and trying to understand the concepts; it is really inspiring. [Disclaimer: maybe better Lisp compilers exist, but I was told that the CMU stuff is one of the best free compilers, and definitely not a bad one.]

Scheme basically looks very similar to Lisp to me, except all the names are different. The performance figures are also very similar to Lisp, except for an anomaly that makes the code run 20 times slower when I increase the loop count from 5 Mio. to 50 Mio. That is, 50 Mio. loops take 200 times longer than 5 Mio. loops. I don't know why this happens, but I do know that I neither expect nor like such behaviour from any high-level language. Maybe this behaviour is easy to avoid, but why is it there at all? I can see no excuse why in-memory iteration should scale other than linearly. And the longer I think about this, the angrier I get at the blithering idiots who screwed it up. Is it some memory-littering effect? I read that in Scheme tail recursion should have constant memory requirements. Is it some brain-dead numeric type? I would be grateful if someone could enlighten me, but this disqualifies at least this Scheme implementation and raises concerns about the language in general.
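A quick sanity check for this kind of scaling anomaly is to time the same empty loop at two counts and compare the ratio; for a linear implementation, 10x the iterations should cost roughly 10x the time, not 200x. A sketch in Python (loop counts and thresholds are my own choices, not from the original benchmarks):

```python
import time

def loop_time(n):
    """Wall-clock seconds for n empty loop iterations."""
    start = time.perf_counter()
    i = 0
    while i < n:
        i += 1
    return time.perf_counter() - start

small, big = 200_000, 2_000_000  # 10x more work
ratio = loop_time(big) / loop_time(small)
print(f"time ratio for 10x the iterations: {ratio:.1f}")
# For a linear implementation this ratio stays near 10; a ratio around
# 200, as seen with MIT Scheme above, indicates super-linear behaviour.
```

Running this against a suspect implementation at a few different loop counts quickly shows whether the slowdown is a fixed overhead or genuinely super-linear growth.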

Dylan (Dynamic Language) is a kind of Lisp with syntax and better static type support, and some friends have been very religious about it for many years. In theory, Dylan combines the flexibility and power of Lisp with the performance of C. In practice, this is almost true. However, as you can see in the benchmarks, sometimes you also get the performance of interpreted Lisp. Personally I find it a little disconcerting to have a language that suddenly becomes four(!) orders of magnitude slower for some operations. For the tests I used Gwydion Dylan, a compiler abandoned and open-sourced by CMU, now being maintained by volunteering Dylan enthusiasts. Probably they will provide me with a version of the benchmark that runs much faster soon after they read this.

Some words on the language itself: Dylan has a lot of sugar in its syntax, and much of it sounds very cool, but somehow Dylan seems to be the opposite of Lisp: a lot of syntax, and a lot of typing. The Dylan code would probably have the most characters of all if Lisp had easier array access. Gwydion Dylan is also at least two steps away from a production-grade environment: the compiler's error reporting is so bad that you get wrong line numbers, if you get line numbers at all, and the error messages are absolutely confusing. The documentation and the libraries do not seem to be in a very consistent state right now. You have to be a real enthusiast to appreciate programming in this language.

So what's next? There are some languages I really want to include, C# being one of them. Also Forth, Visual Basic and csh might be of interest, and maybe FORTRAN and COBOL should be on the list.

The Sources
Here you find most of the benchmark code I used; feel free to improve it, port it to other languages, or just run it to see how your platform compares to my laptop, a Dell M70 with a Pentium M at 2.13 GHz. Also let me know if I screwed up somewhere.

And thanks to all the guys from the club who helped me around the corners I ran into in some of these languages.