Sunday, 18 July 2010

Haskell's hash tables revisited

Update: We have since discovered that these results were biased towards Haskell.

Mikhail Glushenkov recently announced the Haskell Platform 2010.2 RC for Windows. In particular, this is the first release to include a version of the Glasgow Haskell Compiler (6.12.3) that has the new garbage collector fix to address the performance problems Haskell programmers have been experiencing with mutable arrays of boxed values over the past 5 years, such as the spines of hash tables.

Naturally, we couldn't resist benchmarking the new release to see if it lives up to the promise of decent hash table performance. Even though this installer for the Haskell Platform is just a release candidate, we found that it installed smoothly and ran correctly first time.

First up, a repeat of our previous benchmark which inserted 10,000,000 bindings with int keys mapping to int values into an initially-empty hash table:

GHC 6.12.1: 19.2s
GHC 6.12.3: 4.48s
F# .NET 4: 0.8s

The new version of GHC is clearly a substantial improvement over previous versions in this context, completing this benchmark over 4× faster than its predecessor. The fastest Haskell is still 5.6× slower than F# but that is a more reasonable performance difference than before.

Altering the benchmark to build five hash tables containing 5,000,000 bindings each with float keys and values and using the floor or truncate functions as the hash function changes the relative performance considerably:

GHC 6.12.1: 131s
GHC 6.12.3: 30.2s
F# .NET 4: 2.98s

Here, the old GHC is a whopping 44× slower than F# and the latest GHC is still 10× slower.

Surprisingly, this turns out to be a 3 year old performance bug, this time in the implementation of basic numeric functions. Fortunately, this can be fixed simply by annotating the type of the truncate function. This gives the following performance:

GHC 6.12.1: 43.8s
GHC 6.12.3: 17.1s
F# .NET 4: 2.98s

The latest Haskell is now back to being only 5.7× slower than F#.

Here's the new F# code:

let cmp =
{ new System.Object()
interface System.Collections.Generic.IEqualityComparer with
member this.Equals(x, y) = x=y
member this.GetHashCode x = int x }
for _ in 1..5 do
let m = System.Collections.Generic.Dictionary(cmp)
for i=5000000 downto 1 do
m.[float i] <- float i
printfn "m[42] = %A" m.[42.0]

And here's the new Haskell code:

import qualified Data.HashTable as H
import GHC.Float

act 0 = return ()
act n = do
ht <- (==) (fromIntegral . (truncate :: Double -> Int)) :: IO (H.HashTable Double Double)
let loop 0 = return ()
loop i = do
H.insert ht (fromIntegral i) (fromIntegral $ i+n)
loop (i-1)
loop (5*(10^6))
ans <- H.lookup ht 42.0
print ans
act (n-1)

main :: IO ()
main = act 5


Paolo Giarrusso said...

You link is to bug 661, fixed 4 years ago, and is not a performance bug:

I guess the link was botched - could you please link the correct performance bug?

Flying Frog Consultancy Ltd. said...

@Paolo: I have corrected the link.

Alberto said...

Good GC performance comparison.

Where is the HashTable benchmark?

jay paul said...

Your post really cool and interesting. Thanks very much.

HP - Pavilion 17.3" Laptop - 4GB Memory - 500GB Hard Drive - Silver

HP - ENVY 15.6" Laptop - 8GB Memory - 750GB Hard Drive - Natural Silver (15-j011dx)