Friday, 4 May 2012

Hesitating between C/C++, OCaml and F# for your compiler?

Metaprogramming is a real weak point of C++. Most of your effort when writing a compiler will be spent manipulating trees. The core advantage of OCaml and F# in this context is pattern matching over union types (rather than functional programming as such), precisely because it makes trees so much easier to manipulate. Historically, OCaml and F# come from the ML family of languages and were bred specifically for this application domain.
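To make the point about pattern matching over union types concrete, here is a minimal sketch in OCaml. The `expr` type and the `simplify` function are hypothetical names invented for illustration (they are not taken from HLVM or any real compiler); the sketch only shows how a tree rewrite reads when each case is a pattern.

```ocaml
(* A tiny expression tree as a union (variant) type.
   Hypothetical example type, not from any real compiler. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* Bottom-up algebraic simplification: simplify the children first,
   then pattern match on the pair of simplified children. *)
let rec simplify e =
  match e with
  | Int _ | Var _ -> e
  | Add (a, b) ->
    (match simplify a, simplify b with
     | Int 0, x | x, Int 0 -> x            (* 0 + x = x + 0 = x *)
     | Int m, Int n -> Int (m + n)         (* constant folding *)
     | a', b' -> Add (a', b'))
  | Mul (a, b) ->
    (match simplify a, simplify b with
     | Int 0, _ | _, Int 0 -> Int 0        (* 0 * x = x * 0 = 0 *)
     | Int 1, x | x, Int 1 -> x            (* 1 * x = x * 1 = x *)
     | Int m, Int n -> Int (m * n)         (* constant folding *)
     | a', b' -> Mul (a', b'))
```

For example, `simplify (Add (Mul (Int 1, Var "x"), Int 0))` evaluates to `Var "x"`. Writing the equivalent in C++ typically means a class hierarchy plus visitors or chains of dynamic casts, which is the tedium the paragraph above refers to.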
I used LLVM via its OCaml bindings to write HLVM, which includes both standalone and JIT compilation to native code, multicore-capable garbage collection, a foreign function interface, tail call optimization and many other features. The experience was very pleasant. My only advice would be to keep track of which LLVM features are tried-and-tested and which are experimental, because you don't want to depend on anything experimental (e.g. LLVM's GC support, which was experimental when I wrote HLVM).
You can easily use System.Reflection.Emit to generate CIL from F#, but you obviously won't be able to leverage your LLVM backend by doing so, although you do get a garbage collector for free. .NET bindings to LLVM are an option. I am not familiar with the ones you cite, but writing bindings to LLVM's C API is relatively straightforward. However, I am not sure how well supported LLVM is on the Windows platform.
Regarding OCaml vs F#, both have advantages and disadvantages, but I'd say the overall difference is relatively small in this context. OCaml lacks generic printing, so writing functions to print values of big union types is tedious, although this can be automated using some third-party macros. F# provides generic printing but is missing some useful features such as polymorphic variants and structurally-typed objects.
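As a rough illustration of the printing point, this is the kind of boilerplate you end up writing by hand in OCaml. The `expr` type and `to_string` function are hypothetical names used only for this sketch; a real compiler's AST would have many more constructors, and each one needs its own case.

```ocaml
(* Hypothetical example type; every constructor needs its own
   hand-written printing case. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* Hand-written printer: one case per constructor. *)
let rec to_string = function
  | Int n -> Printf.sprintf "Int %d" n
  | Var s -> Printf.sprintf "Var %S" s   (* %S prints a quoted, escaped string *)
  | Add (a, b) -> Printf.sprintf "Add (%s, %s)" (to_string a) (to_string b)
  | Mul (a, b) -> Printf.sprintf "Mul (%s, %s)" (to_string a) (to_string b)
```

With this in place, `to_string (Add (Int 1, Var "x"))` returns `Add (Int 1, Var "x")`. In F#, the `%A` format specifier prints values of union types generically, so no such function is needed there.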