Tuesday, 17 March 2009

Performance: OCaml vs HLVM beta 0.3

Our HLVM project is evolving rapidly and recently saw its first release with a working garbage collector. This has allowed us to make some performance measurements and, as we had hoped, the results are extremely compelling.

The following graph illustrates the time taken to perform each of the benchmarks in the HLVM test suite on one of the 2.1GHz Opteron 2352 cores in an 8-core Dell PowerEdge:

The individual benchmarks are:

  • fib: The usual Fibonacci function.
  • ffib: A floating-point version of the Fibonacci function.
  • sieve: Sieve of Eratosthenes to find all of the primes under 10^8 and print the last one using a byte array.
  • mandel: Mandelbrot rendering with float arithmetic.
  • mandel2: Mandelbrot rendering with abstracted complex arithmetic where complex numbers are pairs of floating point numbers.
  • Array.fold: Initialize a 10^8-element float array and fold of a float * float pair over it.
  • List.init/fold: Initialize a 10^6-element int list and fold over it (allocation intensive).
  • 10-queens: Solve the 10-queens problem by filtering lists (allocation and GC intensive).

These results show that the latest HLVM is already faster than OCaml on three tests (ffib, mandel and mandel2), gives comparable performance on three more tests (fib, sieve and Array.fold) and is substantially slower on the allocation and GC-intensive list and 10-queens benchmarks.

The third set of results labeled "HLVM unsafe" refer to results obtained by manually avoiding manipulation of the shadow stack when it is not necessary (note that the GC still runs exactly as before). This is the next major optimization that HLVM is likely to automate. With this optimization, HLVM also outperforms OCaml on the sieve and Array.fold benchmarks.

Suffice to say, we are very happy with these results and are now encouraging others to invest in building upon HLVM now that the basic infrastructure is in place. We have already had interest from the developers of HaXe and PolyML in using HLVM as a new VM for their compilers and another group are applying to the Jane St. Summer Project 2009 for funding to port INRIA's OCaml compiler to HLVM.

The next major step for HLVM is support for parallel threads. This will be implemented by cooperatively suspending all threads, performing a GC using the current implementation and then resuming the threads. This alone will make HLVM significantly more capable than OCaml when it comes to parallel computing and our next set of benchmark results will include parallel programs!


phil tomson said...

Just curious, what do the HLVM implementations of these programs look like?

phil tomson said...

Ooops... sorry, I just saw the link to the HLVM versions.