During one of the more heated discussions on the moderated F# mailing list, Jeffrey Sax stated that technical computing benefits from object oriented programming:

"In the end, you usually need both object-oriented and functional concepts working together. This is particularly true of technical computing, where you have some meaningful 'object' abstractions (vectors, matrices, curves, probability distributions...) and lots of 'functions' you want to perform on them (integration, fit a curve, solve an equation...)." - Jeffrey Sax, Extreme Optimization

We used C++ in technical computing for many years and then migrated first to Mathematica, then to OCaml and now to F#, so I found this statement really surprising. From my point of view, object oriented programming has almost nothing to offer in the context of technical computing. OO languages are obviously widespread in technical computing but that is only because they are common elsewhere: none of the dominant technical computing environments (e.g. Mathematica, MATLAB, Maple, MathCAD) emphasize OOP and many numerical libraries for object oriented languages do not adopt an object oriented design (e.g. IMSL). Inheritance is the one thing OOP offers that the more conventional approaches in technical computing (procedural programming, functional programming and term rewriting) do not. The examples of vectors, matrices, curves and probability distributions all strike me as poor cases for it. Given the choice, can OOP really be preferable in this context?

Vectors and matrices are almost always represented internally by arrays and do have associated functions (such as arithmetic operations), but that kind of encapsulation can be provided by many different approaches, not just OOP. One might argue that real/complex, non-symmetric/symmetric/Hermitian and dense/sparse matrices could be brought together into a class hierarchy, with inheritance used to factor out commonality, but there is none to factor out: the storage and almost all functions (e.g. matrix-matrix multiplication) are completely different in each case.
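To make the "nothing in common" point concrete, here is a minimal sketch in Python (chosen purely for illustration; the function names and the dictionary-based sparse format are my own assumptions, not taken from any particular library). The dense and sparse matrix-vector products share a purpose but no storage and no code:

```python
# Hypothetical illustration: two matrix representations whose only common
# ground is the mathematical meaning of the operation.

def dense_matvec(rows, x):
    """Dense storage: a list of rows, each a list of floats."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]

def sparse_matvec(n_rows, entries, x):
    """Sparse storage: a dict mapping (row, column) to a nonzero value."""
    y = [0.0] * n_rows
    for (i, j), a in entries.items():
        y[i] += a * x[j]
    return y

# The same 2x2 diagonal matrix in both representations.
dense = [[2.0, 0.0], [0.0, 3.0]]
sparse = {(0, 0): 2.0, (1, 1): 3.0}
x = [1.0, 4.0]
```

A shared base class here could declare a common method name, but it would have no storage or implementation to pass down to either case.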

A "curve" is just another name for a function and, therefore, is surely best represented by functional programming because that makes evaluation as easy and efficient as possible. One might argue that "curves" have other associated functions beyond straightforward evaluation, such as a symbolic derivative. However, as soon as you go down that road, term rewriting becomes preferable to OOP because it facilitates symbolic processing, which is the only viable way to do computer algebra and compute general derivatives. OOP might let you encapsulate a single special case but inheritance buys you nothing.
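Both halves of that argument can be sketched in a few lines. The following Python is only an illustration under my own assumptions: the tuple encoding of expressions and the names `gaussian`, `d` and `evaluate` are hypothetical, and a real term rewriter such as Mathematica's is far more general. It handles only addition and multiplication:

```python
import math

# A "curve" as a plain function: evaluating it is just a call.
def gaussian(sigma):
    return lambda x: math.exp(-x * x / (2 * sigma * sigma))

# Symbolic expressions as nested tuples ("add", f, g) or ("mul", f, g),
# with the string "x" for the variable and plain numbers for constants.
def d(e):
    """Differentiate expression e with respect to x using rewrite rules."""
    if e == "x":
        return 1
    if isinstance(e, (int, float)):
        return 0
    op, f, g = e
    if op == "add":  # sum rule
        return ("add", d(f), d(g))
    if op == "mul":  # product rule
        return ("add", ("mul", d(f), g), ("mul", f, d(g)))
    raise ValueError(f"unknown operator: {op}")

def evaluate(e, x):
    """Numerically evaluate expression e at the point x."""
    if e == "x":
        return x
    if isinstance(e, (int, float)):
        return e
    op, f, g = e
    a, b = evaluate(f, x), evaluate(g, x)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"unknown operator: {op}")

# x*x + 3*x, whose derivative is 2*x + 3.
expr = ("add", ("mul", "x", "x"), ("mul", 3, "x"))
```

Here `evaluate(d(expr), 2)` gives 7, matching 2*2 + 3. The derivative is defined once, by cases over the shape of the expression, and no class hierarchy is involved.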

Probability distributions are perhaps less clear cut. There is an arbitrary number of such distributions (beta, Normal, exponential, Poisson...) and an arbitrary number of useful functions over them (mean, median, mode, variance, standard deviation, skew, kurtosis, inverse cumulative distribution function...). Although representing probability distributions using OOP allows the set of distributions to be extended easily, it makes it difficult to add new functions over distributions: you cannot retrofit a new member onto every class in a predefined hierarchy. ML-style functional programming would make it easy to add new functions but difficult to add new distributions. Term rewriting makes it easy to extend both the number of distributions and the number of functions but leads to a "rat's nest" style of unstructured programming, as such extensions may be placed anywhere.
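The trade-off between the first two styles can be sketched as follows. This is a minimal Python illustration with hypothetical class and function names of my own. Python can in fact patch methods onto existing classes at run time, so read the OOP half as standing in for a statically typed language such as C#:

```python
import math
from dataclasses import dataclass

# OOP style: each distribution is a class that carries its own functions.
class Distribution:
    def mean(self):
        raise NotImplementedError
    def variance(self):
        raise NotImplementedError

class Exponential(Distribution):
    def __init__(self, rate):
        self.rate = rate
    def mean(self):
        return 1.0 / self.rate
    def variance(self):
        return 1.0 / self.rate ** 2

# Adding a distribution is one new class; no existing code changes.
class Poisson(Distribution):
    def __init__(self, lam):
        self.lam = lam
    def mean(self):
        return self.lam
    def variance(self):
        return self.lam

# ML style: a fixed set of cases, with functions defined by cases over them.
@dataclass(frozen=True)
class Exp:
    rate: float

@dataclass(frozen=True)
class Pois:
    lam: float

def mean(dist):
    if isinstance(dist, Exp):
        return 1.0 / dist.rate
    if isinstance(dist, Pois):
        return dist.lam
    raise TypeError("unknown distribution")

# Adding a function is one new definition; no existing code changes.
def skewness(dist):
    if isinstance(dist, Exp):
        return 2.0
    if isinstance(dist, Pois):
        return 1.0 / math.sqrt(dist.lam)
    raise TypeError("unknown distribution")
```

Adding `skewness` to the OOP version means touching `Distribution` and every subclass, while adding a third distribution to the ML version means touching `mean`, `skewness` and every other function defined by cases.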

Neither Jeffrey nor I can claim to be impartial in this, of course. Jeffrey sells the excellent Extreme Optimization library for C#, which makes heavy use of object orientation. My company sells a variety of products related to technical computing: the OCaml for Scientists and F# for Scientists books, Signal Processing .NET software for C# users, Time-Frequency analysis software for Mathematica users, and F# for Numerics and F# for Visualization for technical computing using Microsoft's new F# programming language. However, I do not believe I am alone in my view, not least because none of the world's foremost integrated technical computing environments is built around object oriented programming. Indeed, Mathematica is my personal favorite and it is primarily a functional language built around term rewriting.

## 3 comments:

Hello,

Nice post.

I think people end up using the language(s) they're most comfortable with. If you have invested decades learning C++, then it's highly improbable you'll switch to a functional language (unless "functional" comes to you, like in C# or VB.NET), no matter how much better suited it is for the project at hand.

For ages, functional languages remained in niche circles, mostly academic. The lack of library support slowed down their adoption. The heavy math background that's needed to get started also discouraged many people, although this should be second-nature... you're doing technical computing after all.

F# could change that, especially in the library-related aspect. Through F# you could target the whole of .NET 3.5, and that's impressive.

Today C# is Microsoft's flagship language, with VB.NET a close second. In a couple of years they've gained many features that were historically exclusive to data-centric languages (e.g. T-SQL) by tightly integrating strongly-typed querying facilities such as LINQ to SQL, to Objects, to XML or to whatever IQueryable of T you can think of. To enable that, they first had to bake in many of the idioms, metaphors and abstractions (you name it) of functional languages. Granted, the C# and VB implementations are not as elegant or pure as their F# counterparts, but the features are there. Now the Microsoft developer division is going after the dynamic language space. See [*]. In essence, they'll bring the best of the dynamic language world to the next versions of C# and VB.NET. (They already have IronRuby and IronPython, BTW.) After a few iterations, plus community feedback and the natural evolution of the languages, C# and VB.NET will become the de facto, one-stop, one-size-fits-all programming proposal for the Microsoft platforms. We'll have two strongly typed, data-friendly, dynamic-aware, general-purpose programming languages.

Here is how I see it:

F# will be adopted by long-time functional programmers, with LISP/Haskell heritage.

IronRuby will be adopted by long-time Ruby programmers.

IronPython will be adopted by long-time Python programmers.

C++/CLI will be adopted by long-time C++ programmers.

C# and VB.NET will be adopted by everyone else.

What do you think?

Cheers,

Fernando

[*]

http://blogs.msdn.com/charlie/archive/2008/01/25/future-focus.aspx

http://blogs.msdn.com/charlie/archive/2008/03/05/future-focus-ii-call-hierarchy.aspx

Fernando, you have made lots of interesting points and I think the answer will be of more general interest to others so I have written a new post in response rather than adding a comment here.

Hi,

Interesting post.

From my experience, OOP is ideal for implementing probability distributions since (as you mentioned) there is only a limited number of commonly used functions over them, so once they are implemented, you are unlikely to have a need for new ones. Even if you do, you can easily add them to the class hierarchy. This approach is used internally for our distribution fitting products EasyFit and EasyFitXL, and it works very well for us. We do, however, use functional programming for implementing other statistical methods.

As for the other "abstractions", if you take a look at modern 3D engines, most (if not all) of them make heavy use of vectors and matrices implemented as classes.

I do agree that inheritance buys you almost nothing in these cases, but encapsulation and polymorphism are still very useful. My point is that you don't necessarily have to use all the features or take advantage of all the benefits of OOP. If it makes your life easier in any particular way, then why not?

Regards,

Antony
