Sunday, 17 January 2010

Naïve Parallelism: a rebuttal

Several people, including Simon Marlow of Microsoft, have objected to our rejection of Saynte's new Haskell code, claiming that the alterations were minor and that they optimize serial performance. This is simply not true. Firstly, over half of the lines of code in the entire program have been altered. Secondly, the new version is slower than Lennart's original version 5 on 1 core. So there is no logical reason to choose the revised Haskell as the basis of a naive parallelization unless you want to cherry-pick results by leveraging knowledge of how they will scale after parallelization. Suffice it to say, doing so would be bad science.

This is illustrated in the following graph of performance results for Lennart's original version 5 vs Saynte's revised Haskell with 11 levels of spheres at 1,024×1,024:

The original code is 5% faster on 1 core but scales poorly and is 2.7× slower on 7 cores. The substantially revised "serial" Haskell code was obviously designed specifically to be amenable to parallelization and, therefore, had no place in a comparison of naive parallelizations.

This naturally raises the question of how the different programs will perform when optimized for parallelism. A fair comparison will require the C++ to be rewritten just as the Haskell was. We shall address this question in the future.


Sebastian Sylvan said...

An alternative to paranoid conspiracy theories is to read what he actually said, and see why the changes were made. This was explained in the original blog post to some extent (basically style preference), and expanded on subsequently.

That you would prefer to accuse people of cheating, with no evidence, reflects poorly on you.

Flying Frog Consultancy Ltd. said...

@Sebastian: Saynte's latest blog post admits that he had tried Lennart's faster 5th version but rejected it because it did not give the results that he wanted to see for Haskell. So he backported only those optimizations from the fifth version to the fourth that would give him the results he wanted.

For example, his original article claimed that HLVM "used exorbitantly more memory than the other implementations" but he now admits that Lennart's fastest fifth version uses so much more memory that it could not even complete these tasks.

Pretending that he had "naively" parallelized the Haskell when he had blatantly been through several different programs was dishonest. There is nothing naive about what he did. He obviously went to great lengths to parallelize the Haskell but not the C++. That is not a fair comparison.

Sebastian Sylvan said...

Are you intentionally being dense? He explains precisely why he used the version of the code he used, and why he made the changes he made. And no, improved parallel performance wasn't it.

You insist on calling him a liar and a cheater when there is NO EVIDENCE to support that. All you're doing is making it obvious what your agenda is, and that you (and indeed FFC) should be given a wide berth if you're hoping for intellectual honesty. That's probably bad for business too.

Flying Frog Consultancy Ltd. said...

@Sebastian: Saynte's latest explanation makes it perfectly clear that he had gathered results that contradicted the conclusions he wanted to draw, so he buried those results in order to draw incorrect conclusions dishonestly.

Moreover, I have now completed an exhaustive study and the program Saynte pulled out of thin air is the only Haskell program that exhibits scalable parallelism. The probability that he arrived by accident at a non-trivial combination of optimizations that happens to be optimally amenable to parallelism is vanishingly small.