Tuesday, June 29, 2010

Good ol' FP infighting

A Schemer by upbringing, I know all about infighting!

It seems that there's been an ongoing back-and-forth between certain ML and Haskell partisans about performance predictability. Further, there's been the accusation that Haskell has horrible hash table performance. (I haven't run these tests myself yet, but I'd like to.)

Anyway, the Haskell CnC project that I announced on the Intel blogs has been garnering some feedback. It got panned particularly brutally by Jon Harrop here. I don't know, maybe that's just what Jon likes to do. In any case, I wrote a response in the comments and it grew big enough that I thought I'd make it into a post (below).

Response to: http://flyingfrogblog.blogspot.com/2010/06/intel-concurrent-collections-for.html

I think there's a bit of a misunderstanding here in the initial conflation of CnC with "graph reduction". CnC is a programming model based on static graphs (no reduction at runtime), in the spirit of synchronous dataflow. (I recommend the papers on StreamIt in this area.)

If anything, you could view this as backing down from some of the traditional Haskell implementation decisions. The user decides the granularity of the parallel tasks (that is, the granularity of the nodes in the graph), and CnC eschews lazy evaluation during graph execution.
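For the curious, here's roughly what I mean by a static graph of eager steps. This is just a schematic sketch in plain Haskell with channels -- not the actual CnC API -- but the shape is the same: the nodes and edges are fixed up front, only data flows at runtime, and values are forced before they cross an edge.

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan
    import Control.Monad (forever, replicateM)

    -- A schematic two-node static graph: source -> square -> sink.
    main :: IO ()
    main = do
      edge1 <- newChan                    -- source -> square step
      edge2 <- newChan                    -- square step -> sink
      -- One fixed node: a step that squares each item, eagerly.
      _ <- forkIO $ forever $ do
             x <- readChan edge1
             writeChan edge2 $! x * x     -- force the value; no thunks cross the edge
      -- Feed the graph and collect the results.
      mapM_ (writeChan edge1) [1..10 :: Int]
      print =<< replicateM 10 (readChan edge2)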

I guess I feel like I've been caught in the middle of some existing flame wars over hash tables and performance predictability. I'm aware of the existing battle over Haskell hash tables, and I have no problem with hash tables. The Haskell CnC implementation makes no use of persistence and can use mutable map data structures just as well as immutable ones. I do use Haskell's hash tables, but since they don't support concurrent update (and aren't high-performance to start with) they're not the default. I'd love better hash tables. In other CnC implementations we use TBB's concurrent hash maps.
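To make that concrete (a sketch only, not the CnC internals; the names here are hypothetical): in the absence of a good concurrent hash table, the simplest thread-safe fallback is a mutable reference guarding an immutable map, which is essentially what the default amounts to.

    import Control.Concurrent.MVar
    import qualified Data.Map as Map

    -- A crude thread-safe item collection: an MVar guarding an immutable Map.
    -- (Hypothetical names; a real concurrent hash table would avoid the
    -- single point of contention here.)
    type ItemCol k v = MVar (Map.Map k v)

    newItemCol :: IO (ItemCol k v)
    newItemCol = newMVar Map.empty

    putItem :: Ord k => ItemCol k v -> k -> v -> IO ()
    putItem col k v = modifyMVar_ col (return . Map.insert k v)

    getItem :: Ord k => ItemCol k v -> k -> IO (Maybe v)
    getItem col k = fmap (Map.lookup k) (readMVar col)

All writers serialize on the one MVar, which is exactly why a real concurrent hash map (like TBB's) is attractive.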

Regarding performance predictability -- sure, it's a problem, both because of lazy evaluation and because of the dependence on high-level (read: fragile) optimizations. But I don't see why this needs to be an argument. People who need great predictability today can use something else. Haskellers should continue to try to improve predictability and scaling. What else can they do? Even in this little CnC project we have made progress -- complex runtime, quirks and all -- and have Haskell CnC benchmarks getting parallel speedups over 20X. Not perfect, but improving. (I'll give your ray-tracing benchmark a try sometime, if you like; that sounds fun.)

I also don't think it's a good idea to endorse Dmitriy Vyukov's "kindly" dismissing all of Haskell and Erlang. That's an even more extreme position than Jon's own claim that Haskell "doesn't scale beyond a few cores".

Jon, I'm amenable to your arguments and not necessarily in a different camp (I'm an eager functional programmer more than a lazy one), but I would appreciate not being so cursorily panned!

P.S. Regarding "the only problem that we've solved" being Mandelbrot: indeed, that would be bad if it were so. Please see the paper. We don't have nearly as many benchmarks as I'd like yet, but we're doing Cholesky factorization, Black-Scholes option pricing, and the N-body problem at least. The paper also has some more in-depth discussion of scheduling. By the way, I don't consider any of the current schedulers to be anywhere near an endpoint; plenty of low-hanging fruit remains.
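(For reference, the Black-Scholes kernel is just the standard closed-form price for a European call; this textbook version -- not our tuned benchmark code -- shows the per-option work each parallel task performs:)

    -- Standard closed-form Black-Scholes price for a European call.
    -- s: spot, k: strike, r: risk-free rate, v: volatility, t: years to expiry.
    blackScholesCall :: Double -> Double -> Double -> Double -> Double -> Double
    blackScholesCall s k r v t = s * cnd d1 - k * exp (-r * t) * cnd d2
      where
        d1 = (log (s / k) + (r + v * v / 2) * t) / (v * sqrt t)
        d2 = d1 - v * sqrt t
        -- Cumulative normal via the Abramowitz & Stegun erf approximation.
        cnd x = 0.5 * (1 + erf (x / sqrt 2))
        erf x | x < 0     = negate (erf (negate x))
              | otherwise = 1 - poly * exp (negate (x * x))
          where
            u    = 1 / (1 + 0.3275911 * x)
            poly = u * (0.254829592 + u * (-0.284496736 + u * (1.421413741
                     + u * (-1.453152027 + u * 1.061405429))))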

P.P.S. We also have a CnC implementation in F#.

I guess a blog wouldn't hurt.

I've been posting a bit on the Intel blogs recently. Yet I guess there will always be posts that aren't appropriate in that context, so why not create one of these newfangled weblogs?