Sunday, 30 May 2010

Purely functional Heap Sort in OCaml, F# and Haskell

Here's Markus Mottl's OCaml translation of Okasaki's purely functional leftist heap:

module LeftistHeap (Element : ORDERED) : (HEAP with module Elem = Element) = struct module Elem = Element type heap = E | T of int * Elem.t * heap * heap let rank = function E -> 0 | T (r,_,_,_) -> r let makeT x a b = if rank a >= rank b then T (rank b + 1, x, a, b) else T (rank a + 1, x, b, a) let empty = E let is_empty h = h = E let rec merge h1 h2 = match h1, h2 with | _, E -> h1 | E, _ -> h2 | T (_, x, a1, b1), T (_, y, a2, b2) -> if Elem.leq x y then makeT x a1 (merge b1 h2) else makeT y a2 (merge h1 b2) let insert x h = merge (T (1, x, E, E)) h let find_min = function E -> raise Empty | T (_, x, _, _) -> x let delete_min = function E -> raise Empty | T (_, x, a, b) -> merge a b end

Here's a simple OCaml heapsort based upon the same idea:

type 'a heap = E | T of int * 'a * 'a heap * 'a heap let rank = function E -> 0 | T (r,_,_,_) -> r let t(x, a, b) = let a, b = if rank a > rank b then a, b else b, a in T(rank b + 1, x, a, b) let rec merge = function | h, E | E, h -> h | (T(_, x, a1, b1) as h1), (T(_, y, a2, b2) as h2) -> if x >= y then t(x, a1, merge(b1, h2)) else t(y, a2, merge(h1, b2)) let rec to_list xs = function | E -> xs | T(_, x, a, b) -> to_list (x::xs) (merge(a, b)) let heapsort fold xs = to_list [] (fold (fun h x -> merge(t(x, E, E), h)) E xs)

This takes 0.6s to sort 100k floats on this 2× 2.0GHz E5405 Xeon and it happily sorts millions of elements.

Here's a translation to F#:

type LeftistHeap<'a> = | E | T of int * 'a * LeftistHeap<'a> * LeftistHeap<'a> let rank t = match t with E -> 0 | T (r, _, _, _) -> r let T(x, a, b) = let a, b = if rank a > rank b then a, b else b, a T(rank b, x, a, b) let rec merge h1 h2 = match h1, h2 with | h, E | E, h -> h | T(_, x, a1, b1), T(_, y, _, _) when x >= y -> T(x, a1, merge b1 h2) | T(_, x, _, _), T(_, y, a2, b2) -> T(y, a2, merge h1 b2) let rec toList xs = function | E -> xs | T(_, x, a, b) -> toList (x::xs) <| merge a b let heapSort xs = toList [] (List.fold (fun h x -> merge (T(x, E, E)) h) E xs)

This takes 1.3s and also happily sorts millions of elements.

Here's translation to Haskell:

data Heap a = E | T Int a (Heap a) (Heap a) rank E = 0 rank (T r _ _ _) = r mk x a b = if rank a > rank b then T (rank b + 1) x a b else T (rank a + 1) x b a merge h E = h merge E h = h merge h1@(T _ x a1 b1) h2@(T _ y a2 b2) = if x >= y then mk x a1 (merge b1 h2) else mk y a2 (merge h1 b2) toList xs E = xs toList xs (T _ x a b) = toList (x:xs) $ merge a b heapSort xs = toList [] (foldr (\x -> \h -> merge (mk x E E) h) E xs)

This takes 1.3 second to sort 100k floats but it stack overflows on large inputs.

4 comments:

rl said...

This is because you are using a left fold in all languages except Haskell where you are using a right fold. Try

heapSort xs = toList [] (foldl' (\h x -> merge (mk x E E) h) E xs)

Please don't forget to update your results.

Qerub said...
This comment has been removed by the author.
Qerub said...

I have posted a Scala version here for comparison:

http://termos.vemod.net/purely-functional-heap-sort-in-scala

Choy Rim said...

The F# version appears to have an error. T(x, a, b) should end with T(rank b + 1, x, a, b)

Which would make it more consistent with the OCaml version.