Sunday, 20 April 2008

Who will use F#?

This post is in response to the comment by Fernando on the previous post. Many of the statements made by Fernando reflect commonly held views, but I believe their foundation (e.g. FP vs OO) is too simplistic to be an accurate predictor of what is to come.

I take issue with several of the points that you have raised, Fernando. I'll start with the ones where I can provide objective evidence rather than just opinion.

You say that "F# will be adopted by long-time functional programmers, with LISP/Haskell heritage" but Lisp/Scheme and Haskell programmers account for only 5% of our F#.NET Journal registrants whereas C#/C++/Java programmers account for 53%. The reason is that functional programmers very rarely migrate between functional languages because they are so different (e.g. Lisp vs Haskell is like C++ vs Ruby). People learning any given functional language are always predominantly from mainstream backgrounds. Moreover, the prospect of commercialization makes F# alluring and that is irrelevant for academics happily using Lisp or Haskell. I believe F# will be adopted primarily by startup companies composed of small groups of talented programmers attacking hard problems who realise that the productivity of this language gives them serious advantages over the competition.

Your statements about C# adopting functional features are correct but then you say "Granted, the C# or VB implementations are not as elegant or pure as the F# counterparts, but the features are there". That is very misleading because the features responsible for F#'s awesome productivity are certainly not there in C# and VB. I'm talking about extensive type inference with automatic generalization and pattern matching over algebraic data types, both of which underpin the productivity of all MLs including OCaml and F#. Microsoft have not even begun trying to figure out how to add these features to C# and, until they do, C# will remain in a league below F# in terms of productivity and cost effectiveness.
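To make those two features concrete, here is a minimal sketch in OCaml (the same ideas carry over to F#). The tree type and the function names are hypothetical, chosen only for illustration. Note that no type annotations appear anywhere, yet the functions are inferred to work for any element type.

```ocaml
(* Hypothetical example: a binary search tree as an algebraic data type. *)
type 'a tree =
  | Leaf
  | Node of 'a tree * 'a * 'a tree

(* No annotations: the compiler infers
     insert : 'a -> 'a tree -> 'a tree
   and automatically generalizes it over every element type 'a. *)
let rec insert x t =
  match t with
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) ->
      if x < v then Node (insert x l, v, r)
      else if x > v then Node (l, v, insert x r)
      else t

(* In-order traversal, again written by pattern matching on the tree. *)
let rec to_list t =
  match t with
  | Leaf -> []
  | Node (l, v, r) -> to_list l @ (v :: to_list r)

(* The same generalized functions are reused at two different types. *)
let ints = to_list (List.fold_right insert [3; 1; 2] Leaf)
let strings = to_list (List.fold_right insert ["b"; "a"] Leaf)
```

Here `ints` is a sorted list of integers and `strings` a sorted list of strings, both built by the same unannotated `insert`.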

Now for my subjective opinions. If it were possible to have a "one size fits all" language then I think programming language researchers would already have invented it. After all, they have complete freedom to do so: their results do not even have to be practically useful or adopted. However, I believe the different programming paradigms are at odds by design and, consequently, this is a strictly either-or situation. For example, using overloading undermines type inference. This is why overloading requires type annotations for disambiguation in F#. Many other languages lie at different points along the FP-OO curve. OCaml is closer to FP and Scala is closer to OO, but F# is the only language to have ever brought the productivity of OCaml to a mainstream industrial platform like .NET. Scala does a slightly better job with respect to OO but only at the cost of a catastrophic loss in terms of productivity due to its lack of automatic generalization.
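As a small illustration of that tension, consider how OCaml (not F#) handles arithmetic. It avoids overloading altogether, so integer and floating point addition are different operators and inference never needs help. This is only a sketch with hypothetical function names.

```ocaml
(* OCaml does not overload arithmetic: + is for int, +. is for float. *)

(* Inferred as int -> int with no annotation. *)
let double_int x = x + x

(* Inferred as float -> float, because +. only accepts floats. *)
let double_float x = x +. x

let four = double_int 2
let five = double_float 2.5
```

The price is two operators instead of one. A language that overloads `+` has to resolve the ambiguity some other way, such as the type annotations mentioned above.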

In summary, I think you are overestimating the amount of cross-pollination that will occur between languages and underestimating the amount of programmer migration that will occur.

Wednesday, 16 April 2008

Is OOP good for technical computing?

During one of the more heated discussions on the moderated F# mailing list, Jeffrey Sax stated that technical computing benefits from object oriented programming:

"In the end, you usually need both object-oriented and functional concepts working together. This is particularly true of technical computing, where you have some meaningful 'object' abstractions (vectors, matrices, curves, probability distributions...) and lots of 'functions' you want to perform on them (integration, fit a curve, solve an equation...)." - Jeffrey Sax, Extreme Optimization

I found this statement really surprising. We used C++ in technical computing for many years and then migrated first to Mathematica, then to OCaml and now to F#. From my point of view, object oriented programming has almost nothing to offer in the context of technical computing. OO languages are obviously widespread in technical computing but that is only because they are common elsewhere: none of the dominant technical computing environments (e.g. Mathematica, MATLAB, Maple, MathCAD) emphasize OOP and many numerical libraries for object oriented languages do not adopt an object oriented design (e.g. IMSL). Inheritance is the one thing OOP offers that the more conventional approaches in technical computing (procedural programming, functional programming and term rewriting) do not. The examples of vectors, matrices, curves and probability distributions all seem very poor to me. Given the choice, can OOP really be preferable in this context?

Vectors and matrices are almost always represented internally by arrays and do have associated functions (such as arithmetic operations) but such encapsulation can be provided by many different approaches, not just OOP. One might argue that real/complex, non-symmetric/symmetric/hermitian and dense/sparse matrices could be brought together into a class hierarchy, with inheritance used to factor out commonality, but there is none to factor out: storage and almost all functions (e.g. matrix-matrix multiplication) are completely different in each case.

A "curve" is just another name for a function and, therefore, is surely best represented by functional programming because that makes evaluation as easy and efficient as possible. One might argue that "curves" might have other associated functions beyond straightforward evaluation, such as a symbolic derivative. However, as soon as you go down that road, term rewriting becomes preferable to OOP because it facilitates symbolic processing, which is the only viable way to do computer algebra and compute general derivatives. OOP might let you encapsulate a single special case but inheritance buys you nothing.
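A minimal OCaml sketch of the "curve as function" view follows. The names and the step size are hypothetical choices of mine, and the derivative here is a numerical approximation, not the symbolic derivative discussed above.

```ocaml
(* A "curve" is simply a value of type float -> float. *)
let parabola x = x *. x

(* Hypothetical helper: central-difference approximation of f'(x).
   The step size h is an arbitrary choice for this sketch. *)
let numerical_derivative f x =
  let h = 1e-5 in
  (f (x +. h) -. f (x -. h)) /. (2.0 *. h)

(* Curves combine like any other functions. *)
let shifted x = parabola (x -. 1.0)

(* Approximately 6.0, the slope of x^2 at x = 3. *)
let slope = numerical_derivative parabola 3.0
```

Evaluating a curve is just function application, and an operation on curves such as `numerical_derivative` is just a higher-order function. No class hierarchy is involved.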

Probability distributions are perhaps less clear cut. There are an arbitrary number of such distributions (beta, Normal, exponential, Poisson...) and they have an arbitrary number of useful functions over them (mean, median, mode, variance, standard deviation, skew, kurtosis, inverse cumulative distribution function...). Although representing probability distributions using OOP allows the set of distributions to be extended easily, it makes it difficult to add new functions over distributions: you cannot retrofit a new member onto every class in a predefined hierarchy. ML-style functional programming would make it easy to add new functions but difficult to add new distributions. Term rewriting makes it easy to extend both the number of distributions and the number of functions but leads to a "rat's nest" style of unstructured programming, as such extensions may be placed anywhere.
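The ML-style side of this trade-off can be sketched in OCaml as follows. The variant type and its three cases are hypothetical and deliberately tiny, not a real library design.

```ocaml
(* Hypothetical closed set of distributions, as a variant type. *)
type distribution =
  | Normal of float * float   (* mean, standard deviation *)
  | Exponential of float      (* rate *)
  | Poisson of float          (* rate *)

let mean = function
  | Normal (mu, _) -> mu
  | Exponential lambda -> 1.0 /. lambda
  | Poisson lambda -> lambda

(* Adding a new function later requires no change to existing code. *)
let variance = function
  | Normal (_, sigma) -> sigma *. sigma
  | Exponential lambda -> 1.0 /. (lambda *. lambda)
  | Poisson lambda -> lambda

(* New functions can also be built from existing ones. *)
let std_dev d = sqrt (variance d)
```

Adding a fourth distribution, on the other hand, means editing the type and every function that matches on it. The compiler does at least warn about each match that has become non-exhaustive.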

Neither Jeffrey nor I am impartial in this, of course. Jeffrey sells the excellent Extreme Optimization library for C#, which makes heavy use of object orientation. My company sells a variety of products related to technical computing: the OCaml for Scientists and F# for Scientists books, Signal Processing .NET software for C# users, Time-Frequency analysis software for Mathematica users, and F# for Numerics and F# for Visualization for technical computing using Microsoft's new F# programming language. However, I do not believe I am alone in my view, not least because none of the world's foremost integrated technical computing environments are built around object oriented programming. Indeed, Mathematica is my personal favorite and it is primarily a functional language built around term rewriting.

Our best special offer ever

For a limited time only, buy OCaml for Scientists, 6-month subscriptions to the OCaml and F#.NET Journals, F# for Numerics and F# for Visualization and get over £100 off!

That's a saving of over 30%!

Thursday, 10 April 2008

Memory management in Sun's Java VM

Sun released this interesting paper in 2006, describing the memory management and concurrent garbage collection strategies employed by their HotSpot Java virtual machine.

HotSpot contains one of the most advanced concurrent garbage collectors of any language implementation, rivalling Microsoft's .NET CLR implementation.