Saturday, 24 November 2007

Low-Level Virtual Machine

LLVM is a compiler infrastructure designed to make it easier to write native-code compilers. It provides a RISC-like intermediate assembly language and the potential for high-level features such as garbage collection and zero-cost exception handling.

We are very interested in the idea of building a new virtual machine ideally suited to statically-typed functional programming languages. LLVM looks like the ideal starting point for such a project. The examples even include a complete Scheme implementation with a working garbage collector in only 1,000 lines of code!

Thursday, 15 November 2007

Most popular functional languages on Linux

The Linux operating system provides an unusually wide variety of tools for developers. In particular, Linux offers an incredible variety of programming languages. This post describes our attempt to measure the popularity of functional programming languages on Linux.

There are many language popularity comparisons out there. The TIOBE programming community index is a famous one, based upon the number of hits reported by various search engines. Like every comparison, the TIOBE results are flawed in various ways. Some of the most important problems with this particular measure are:

  • Legacy: older languages have more out-of-date web pages.
  • Unpopularity: a search hit counts the same whether the page praises or criticizes a language, so this metric is an equally good measure of the unpopularity of a language.
  • Subjectivity: the estimated number of search results returned by search engines is highly dependent upon unrelated factors like Google's algorithm du jour.

We are going to try to measure language popularity on Linux using a more objective metric: the results of the Debian and Ubuntu package popularity contests. Amongst other things, the results allow us to determine how many installations there are of the core development tools for each language. Summing the number of installations gives a much more accurate estimate of the number of people actually developing in each language.

Before we go into detail, let's consider some of the flaws in this approach. Firstly, the absolute number of installations is not equivalent to the number of users: many users will have their favorite language installed on several different systems. Secondly, programmers using a language with multiple implementations are likely to have several compilers for that language on each machine. This will bias the results in favor of languages with multiple implementations (such as Haskell, with GHC and Hugs). Finally, these results only apply to Ubuntu and Debian users who elected to contribute to the popularity contests. We are assuming that other Linux distributions will give similar results, and we can test this to some extent by comparing the results between Debian and Ubuntu.

The results were compiled by summing the contributions from the following major development packages for each language:

  • Erlang: erlang-base
  • OCaml: ocaml-nox
  • Haskell: ghc6 and hugs
  • Lisp: clisp, sbcl, gcl and cmucl
  • Scheme: mzscheme, mit-scheme, bigloo, scheme48 and stalin
  • Standard ML: smlnj, mosml and mlton
  • Eiffel: smarteiffel
  • Mercury: mercury
  • Oz: mozart
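
The summation over these packages can be sketched in a few lines of Python. This is only an illustration of the approach, not the script we actually used. The function names are hypothetical, and the parser assumes a whitespace-separated listing in which the second column is the package name and the third is the installation count, with comment lines starting with "#"; check this against the file you download from the popularity contest before relying on it.

```python
# Packages counted for each language, as listed above.
PACKAGES = {
    "Erlang": ["erlang-base"],
    "OCaml": ["ocaml-nox"],
    "Haskell": ["ghc6", "hugs"],
    "Lisp": ["clisp", "sbcl", "gcl", "cmucl"],
    "Scheme": ["mzscheme", "mit-scheme", "bigloo", "scheme48", "stalin"],
    "Standard ML": ["smlnj", "mosml", "mlton"],
    "Eiffel": ["smarteiffel"],
    "Mercury": ["mercury"],
    "Oz": ["mozart"],
}


def parse_installs(text):
    """Map package name -> installation count.

    Assumed layout (not verified here): "rank name inst ..." per line,
    with lines starting with "#" treated as comments.
    """
    installs = {}
    for line in text.splitlines():
        fields = line.split()
        if line.startswith("#") or len(fields) < 3:
            continue
        if not fields[2].isdigit():
            continue  # skip separators or other non-data rows
        installs[fields[1]] = int(fields[2])
    return installs


def installs_per_language(installs, packages=PACKAGES):
    """Sum the installation counts of every package listed for a language.

    Packages missing from the listing contribute zero.
    """
    return {
        language: sum(installs.get(name, 0) for name in names)
        for language, names in packages.items()
    }


def ranked(totals):
    """Return (language, total) pairs, most installations first."""
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
```

Calling `ranked(installs_per_language(parse_installs(text)))` once on the Debian listing and once on the Ubuntu listing gives the two sets of totals that are compared below. Note that a language with several packages in the table, such as Scheme, has every one of them added to its total, which is exactly the bias toward multiple implementations described earlier.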

The results are illustrated in the graph above. Sure enough, the number of installations is similar between Debian and Ubuntu and, therefore, it seems likely that these results will reflect the trend for most Linux users.

We found the results surprising for several reasons:

  • Lisp is often cited as the world's most popular functional programming language, yet it comes fourth after OCaml, Haskell and Erlang in our results.
  • There is no clear preference for a most popular functional programming language. Instead, we find that OCaml, Haskell and Erlang are all equally popular.
  • Despite the bias against OCaml, which has only a single implementation, this language still appears to be among the most popular functional programming languages on Linux. This is even more surprising because there are few OCaml books.

Following Microsoft's productization of their OCaml derivative F#, it seems likely that OCaml will continue to grow in popularity on the Linux platform.

The popularity of the Haskell programming language is likely to increase soon, following the substantial performance improvements in the new 6.8 release of the Glasgow Haskell Compiler (GHC).