Friday, 18 June 2010

Haskell's hash table woes (again)

Alessandro Stamatto recently asked a question on Stack Overflow about the current state of Haskell's hash tables following our rediscovery last year that Haskell's hash tables were up to 32× slower than common alternatives like C++ and .NET and even slower than Python.

Don Stewart edited Alessandro's original question, deleting the citation to the original study that provided a verifiable experiment with complete code, before writing an answer claiming that "the original citation doesn't present any code or benchmarks". Although Don provided a variety of Haskell-based performance measurements, he omitted any measurements of decent hash table implementations like those readily available from C++ or any .NET language that are still 10× faster than anything that can be written in Haskell.

Norman Ramsey's answer took this a step further by drawing the conclusion that "functional data structures perform pretty well" despite the overwhelming evidence to the contrary. Norman also claimed that the beliefs of a clique of self-proclaimed experts might be relied upon instead of verifying our findings by running the code we had already provided.

And around we go.

In reality, GHC 6.12.2 got a workaround to the performance bug in the garbage collector that was believed to have been responsible for this problem but it failed to fix the dire performance of Haskell's hash tables, which are still so slow that even purely functional trees can be faster, and that version still hasn't filtered through to the Haskell Platform. Here are our updated results using the complete code we already provided on a 2.0GHz E5405 Xeon running 32-bit Windows Vista with Visual F# 2010 and the latest Haskell Platform (with GHC 6.12.1) using ghc -O2 --make to compile:

Haskell IntMap: 32.4s
Haskell hash table: 19.2s
.NET Dictionary: 0.8s

So Haskell is still 24x slower than widely-used alternatives...


Alessandro said...

I really liked haskell ideias and syntax. But your observations on haskell are indeed true...

the .Net is a powerfull plataform, a great library. but i don't like it ( because it only comes "full" in a "big package". And because it's too tied to the closed and proprietary filosophy ). Those reasons are "Luxury" reasons since i program for fun and not profit

I wish haskell can someday do great in industry but your studies show otherwise =(

Atleast in the past you liked a bit haskell:
"However, the Haskell program is much simpler than the C++ and its parallelization is statically proven to be free of concurrency bugs such as race conditions. These benefits are well worth this level of performance degradation in many applications so parallel Haskell would seem to have immediate practical applications."

Alessandro said...

as a side note: Right now haskell is great for me, but i use it in an academic level ( where his extreme concepts really shine for learning functional paradigm's ideias)

I really liked Don Stewart observations and studies in haskell too. But researching it does seem that he is the only sucessfull one in industry with haskell.

Maybe haskell is too young still? (considering the "Monadic Haskell" that only surged recently)