Wednesday, 25 September 2013

What's new in garbage collection?

Since the 1950s there have been three main families of collectors: semi-space, mark-sweep and mark-compact. Almost all production GCs have been generational mark-sweep, even though they exhibit pathological performance when the nursery is full of survivors: each survivor is marked, (physically) copied and all references to it updated, which is ~3x slower than necessary. This case matters in practice, for example when filling a hash table with heap-allocated keys and/or values. The Beltway (2002) and Immix (2008) garbage collectors introduced a new family called mark-region GCs. With a mark-region GC the entire heap is a collection of regions, and the nursery generation is itself a region, so when it is full of survivors it can be logically aged by replacing it with another region instead of copying its contents. Sun's HotSpot JVM introduced the first mainstream mark-region GC with its G1 collector.
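To make the difference concrete, here is a toy cost model in Python. It is not a real collector and not how any of the GCs named above is implemented: it only counts the per-object steps. The names Region, Stats, evacuate_nursery and age_nursery_logically are made up for this sketch, and it assumes each step costs one unit and each survivor has exactly one incoming reference.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A chunk of the heap holding object ids (hypothetical toy model)."""
    objects: list = field(default_factory=list)


@dataclass
class Stats:
    """Counts of per-object work, each assumed to cost one unit."""
    marked: int = 0
    copied: int = 0
    refs_updated: int = 0

    @property
    def total(self):
        return self.marked + self.copied + self.refs_updated


def evacuate_nursery(nursery, old_gen, live, stats):
    """Conventional generational step: mark, copy, update references."""
    for obj in nursery.objects:
        if obj in live:
            stats.marked += 1
            old_gen.objects.append(obj)  # physical copy into the old generation
            stats.copied += 1
            stats.refs_updated += 1      # assumption: one incoming reference
    nursery.objects.clear()              # the same nursery is reused
    return nursery


def age_nursery_logically(nursery, old_regions, live, stats):
    """Mark-region style step: mark survivors, then hand the whole region
    to the old generation and use a fresh region as the new nursery."""
    survivors = []
    for obj in nursery.objects:
        if obj in live:
            stats.marked += 1
            survivors.append(obj)
    nursery.objects = survivors          # survivors stay where they are
    old_regions.append(nursery)          # no copy, no reference updates
    return Region()                      # fresh region becomes the nursery


if __name__ == "__main__":
    objs = list(range(1000))
    live = set(objs)                     # worst case: everything survives

    copying = Stats()
    evacuate_nursery(Region(list(objs)), Region(), live, copying)

    regional = Stats()
    age_nursery_logically(Region(list(objs)), [], live, regional)

    print(copying.total, regional.total)  # prints "3000 1000"
```

Under these assumptions the evacuating collector does three units of work per survivor and the region-swapping one does a single unit, which is where a figure like ~3x can come from. Real costs depend on object sizes, reference counts and cache behaviour, so treat the ratio as illustrative only.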

The advent of multicore in 2005 has meant more emphasis on parallel and concurrent garbage collectors. The Staccato (2008) garbage collector is the first that is simultaneously parallel, concurrent and real-time.

.NET represents a major lateral advancement among mainstream GCs because support for reified generics and value types allows .NET languages like C# and F# to express generic collections that can be filled with new values without requiring a heap allocation per element. For example, a Dictionary can be filled with ints, floats, complex numbers, or low-dimensional vectors and matrices without allocating a separate heap object for each one.
