We were recently led to a fascinating Channel 9 interview where the insidious Erik Meijer walks in on parallelism expert Burton Smith, culminating in a fight the likes of which have not been seen since Chris Curry vs Sir Clive Sinclair in the Baron of Beef here in Cambridge.
A pragmatic Burton points out that lazy evaluation renders performance unpredictable and that this, in turn, makes it impossible to tune the granularity of parallel programs so that more effort is spent doing actual work than is wasted administering parallel tasks. The puritanical Erik counters that strictness is essentially another form of side effect because it changes observable behaviour such as non-termination. The conclusion: irreconcilable differences between what parallel programming requires and what purely functional programming can provide.
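Burton's complaint can be made concrete with a textbook Haskell example (ours, not from the interview): two folds that denote the same sum but have very different operational cost.

```haskell
import Data.List (foldl')

-- Lazy foldl: the accumulator grows into a chain of unevaluated
-- thunks ((0 + 1) + 2) + ... that is only collapsed when the result
-- is finally demanded, so the work (and O(n) space) is paid at an
-- unpredictable point far from this call site, and may exhaust the
-- stack on large inputs.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- Strict foldl': the accumulator is forced at every step, giving
-- O(1) space and a cost profile that can be read off the code,
-- which is what granularity tuning for parallelism requires.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 1000000])
```

Both functions denote the same value; only their operational behaviour differs, which is exactly the sense in which Erik regards evaluation order as a side effect.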
Erik and Burton were joined by an elephant in the room, though: caches. Caches are the main difference between supercomputers and multicores and they are a game changer, yet people rarely discuss the effect they have on programming, an effect that will only grow as the memory gap continues to widen. Today, effective utilization of caches offers an order of magnitude in performance, yet exploiting them from purely functional languages like Haskell is essentially impossible precisely because memory has been so successfully abstracted away. Indeed, as we have explained before, this not only remains a show-stopping unsolved problem for parallel Haskell but is not even being addressed by researchers. Today's state-of-the-art advice for multicore programmers is still to use mutable data structures, chosen by quantifying their asymptotic cache complexity in a simple theoretical model of CPU caches, in order to ensure that caches are used effectively.
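The locality argument can be sketched even within Haskell (a sketch of ours, using only the `array` package that ships with GHC): summing a boxed linked list chases one heap pointer per element, while summing an unboxed array sweeps sequentially through contiguous memory that the hardware prefetcher handles well.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Array.Unboxed (UArray, bounds, listArray, (!))

-- Boxed list: every cons cell and every Int is a separate heap
-- object, so the traversal is a pointer chase with poor locality.
sumList :: [Int] -> Int
sumList = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- Unboxed array: the Ints are stored contiguously, so the same
-- traversal is a sequential, cache-friendly sweep.
sumArray :: UArray Int Int -> Int
sumArray a = go 0 lo
  where
    (lo, hi) = bounds a
    go !acc i
      | i > hi    = acc
      | otherwise = go (acc + (a ! i)) (i + 1)

main :: IO ()
main = do
  let n   = 1000000
      arr = listArray (1, n) [1 .. n] :: UArray Int Int
  print (sumArray arr)
  print (sumList [1 .. n])
```

Both traversals are asymptotically linear; the difference a simple cache model predicts is in the constant factor, which is precisely what the asymptotic abstractions of purely functional programming hide.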