Saturday, 24 April 2010

Variant types and pattern matching in HLVM

The compiler development series of articles in The OCaml Journal have culminated in an example front-end compiler targetting HLVM that now supports variants and pattern matching in addition to previous features.

These new language features have been implemented using only a few lines of code, bringing the front-end up to only 253 lines of OCaml code!

This new functionality allows substantially more complicated ML-style programs to be written, including the following symbolic simplifier:

# type expr = Int of int | Var of int | Add of expr * expr | Mul of expr * expr;; # let rec cmp_int((m, n): int * int) : int = if m<n then -1 else if m=n then 0 else 1;; # let rec cmp((f, g): expr * expr) : int = match f, g with Int m, Int n -> cmp_int(m, n) | Int _, Var _ -> -1 | Int _, Add _ -> -1 | Int _, Mul _ -> -1 | Var x, Var y -> cmp_int(x, y) | Var _, Add _ -> -1 | Var _, Mul _ -> -1 | Add(f0, g0), Add(f1, g1) -> let c = cmp(f0, f1) in if c<>0 then c else cmp(g0, g1) | Add _, Mul _ -> -1 | Mul(f0, g0), Mul(f1, g1) -> let c = cmp(f0, f1) in if c<>0 then c else cmp(g0, g1) | _ -> 1;; # let rec add((f, g): expr * expr) : expr = match f, g with Int m, Int n -> Int(m + n) | Int m, Add(Int n, f) -> Add(Int(m + n), f) | f, Int 0 -> f | Int 0, f -> f | Add(f, g), h -> add(f, add(g, h)) | f, Add(g, h) -> (match cmp(f, g) with -1 -> Add(f, Add(g, h)) | 0 -> Add(Mul(Int 2, f), h) | _ -> add(g, add(f, h))) | f, g -> match cmp(f, g) with -1 -> Add(f, g) | 0 -> Mul(Int 2, f) | _ -> Add(g, f);; # let rec mul((f, g): expr * expr) : expr = match f, g with Int m, Int n -> Int(m * n) | Int m, Mul(Int n, f) -> Mul(Int(m * n), f) | f, Int 0 -> Int 0 | Int 0, f -> Int 0 | f, Int 1 -> f | Int 1, f -> f | Mul(f, g), h -> mul(f, mul(g, h)) | f, Mul(g, h) -> (match cmp(f, g) with 1 -> mul(g, mul(f, h)) | _ -> Mul(f, Mul(g, h))) | f, g -> match cmp(f, g) with 1 -> mul(g, f) | _ -> Mul(f, g);; # let rec d((f, x) : expr * int) : expr = match f with Int n -> Int 0 | Var y -> Int(if x=y then 1 else 0) | Add(f, g) -> add(d(f, x), d(g, x)) | Mul(f, g) -> add(mul(f, d(g, x)), mul(g, d(f, x)));; # let x = Var 0 in let f = add(add(mul(mul(x, x), x), x), Int(-1)) in f, d(f, 0), d(d(f, 0), 0), d(d(d(f, 0), 0), 0);; - : `Struct[`Reference; `Reference; `Reference; `Reference] = (Add$1((Int$1(-1), Add$1((Var$1(0), Mul$1((Var$1(0), Mul$1((Var$1(0), Var$1(0))))))))), Add$1((Int$1(1), Add$1((Mul$1((Int$1(2), Mul$1((Var$1(0), Var$1(0))))), Mul$1((Var$1(0), Var$1(0))))))), Add$1((Mul$1((Int$1(2), Var$1(0))), Mul$1((Int$1(4), Var$1(0))))), Int$1(6)) Live: 95 58.9176s total; 0s suspend; 0s mark; 0s sweep

Moreover, the performance of the generated code is competitive with the OCaml top-level and F# interactive despite the fact that HLVM is optimized for high-performance parallel numerics and not this kind of symbolic and logic programming at all. However, this example also demonstrates that compilation times can be far too long with HLVM. Specifically, the 19-line cmp function takes 45 seconds to compile in the interactive session. Presumably this is due to a performance bug in HLVM but the problem has not yet been investigated.

This new version of HLVM will be made available as open source software on the HLVM page on OCamlCore.

4 comments:

Flying Frog Consultancy Ltd. said...

We have since ascertained that the performance bug is actually in the compiler front-end and not in HLVM itself. Specifically, the pattern match compiler was generating an exponential amount of HLVM IR code in the number of match cases. However, it is not clear which of the possible solutions is preferable.

jhuni said...

Hello, why are you developing this "HLVM" thing rather then using parrot???

Parrot is a high level vm that is already relatively widely used, it supports many languages, it is the basis of Perl6, and it already has a solid and robust code base with many performance optimizations.

If you are really interested in building a high level VM then there has already been years of working done by the parrot people... check out parrot.org

Flying Frog Consultancy Ltd. said...

@JHuni: We were already aware of parrot. HLVM is designed for high-performance parallel numerics from statically-typed functional languages and we didn't believe Parrot would be viable because its emphasis is dynamically typed languages like Perl 6.

智能 said...

想像力的力量比知識更加巨大。.............................................