Monday, 3 September 2018

What languages did you learn, in what order?

I once compiled the following list of programming languages and which year I started learning them:
  • 1981: Sinclair BASIC (on a ZX81)
  • 1983: BBC BASIC (on a BBC Micro)
  • 1985: Pascal (installed via ROM)
  • 1987: Logo
  • 1987: 6502 Assembly (on a BBC Micro)
  • 1989: ARM assembly (on an Acorn Archimedes)
  • 1992: Casio fx-7700 BASIC
  • 1992: C (using the excellent Norcroft compiler for Acorn computers)
  • 1994: C++ (the awful Beebug Easy C++)
  • 1995: UFI (on a VAX running OpenVMS)
  • 1996: StrongARM assembly
  • 1996: Standard ML
  • 1997: Mathematica
  • 1998: Quake C
  • 2004: OCaml
  • 2006: Java
  • 2007: Common Lisp
  • 2007: Scheme
  • 2007: Scala
  • 2007: Haskell
  • 2007: F#
  • 2009: Clojure
  • 2010: HLVM
  • 2017: Elm
  • 2017: Javascript
I was recently concerned to hear from Socio-PLT: Quantitative and Social Theories for Programming Language Adoption that the number of languages a programmer knows stagnates after the age of just 20. The decade is nearing its end but I have only learned three languages. Now I'm wondering which languages I should learn next...

Wednesday, 31 January 2018

Background reading on the reference counting vs tracing garbage collection debate

Eight years ago I answered a question on Stack Overflow about the suitability of OCaml and Haskell for soft real-time work like visualization:
"for real-time applications you will also want low pause times from the garbage collector. OCaml has a nice incremental collector that results in few pauses above 30ms but, IIRC, GHC has a stop-the-world collector that incurs arbitrarily-long pauses"
My personal experience has always been that RAII in C++ incurs long pauses when using non-trivial data (i.e. nested, structured, collections of collections of collections, trees, graphs and so on), non-deferred reference counting has the same problem for the same reason, tracing garbage collectors like OCaml work beautifully but there are many notoriously bad tools like Java that have given tracing garbage collection a bad name.
Now that I am revisiting this issue I am surprised to find many individuals and organisations repeating exactly the same experimental tests that I did and coming to the same conclusions.
Pusher published a blog post Low latency, large working set, and GHC’s garbage collector: pick two of three that explains how, after building their production solution in Haskell in the interests of correctness, they were forced to drop the Haskell programming language due to insurmountable problems with GC latency. They chose to switch to Google's Go language instead but I think OCaml would have served them well. I'd also like to stress that people in industry neglecting performance as a functional requirement is a bugbear for me!
Gabriel Schrerer published his research as a blog post Measuring GC latencies in Haskell, OCaml, Racket where he presented a related GC benchmark with results showing that OCaml attains 2ms max GC pauses compared 38ms with tuned Racket (formerly known as PLT Scheme) and to 51ms with Haskell.
Twitch published a post entitled "Go’s march to low-latency GC" that described their five year journey from GC pauses lasting tens of seconds down to 1ms using Go.
Hashtable latencies on Github is a web server benchmark ported to a variety of tracing garbage collected languages as well as Swift (which uses reference counting). Although at the beginning they say "the Swift version is mostly used as a reference of what would be ideal" at the end they actually conclude "Want to build low-latency, memory-intensive services in a GCed language? Use OCaml (or Reason if you prefer the syntax)". Note in particular their graph of measured latency profiles which shows no significant difference between OCaml and Swift at any scale:
Here is a discussion in the Rust subreddit about reference counting woes where the author ends up admitting "it might be interesting to deliberately leak the memory and just hope that the OS is smart enough to swap out pages that aren't being referenced any more".
Google's V8 team have blogged about their garbage collector including the post Getting Garbage Collection for Free about optimising their GC for latency by deferring collection work until idle time.