Wednesday, February 8, 2012

Potential GSoC: Haskell Lock-free Data Structure Implementations

The GHC Haskell compiler recently gained the capability to generate atomic compare-and-swap (CAS) assembly instructions. This opens up a new world of data-structure implementation possibilities.

Furthermore, it's an important time for concurrent data structures. Not only is the need great, but the design of concurrent data structures has been a very active area in recent years, as summarized well by Nir Shavit in this article.

Because Haskell objects containing pointers can't efficiently be stored outside the Haskell heap, it is necessary to reimplement these data structures for Haskell, rather than use the FFI to access external implementations. There are already a couple of data structures implemented in the following library (queues and deques) :


But, this leaves many others, such as:
  • Concurrent Bags
  • Concurrent Hashtables
  • Concurrent Priority Queues
A good point of reference would be the libcds collection of concurrent data structures for C++ (or those that come with Java or .NET):

One of the things that makes implementing these data structures fun is that they have very short algorithmic descriptions but a high density of thought-provoking complexity.

A good GSoC project would be to implement 2-3 data structures from the literature, and benchmark them against libcds implementations.

UPDATE: Here's a link to the Trac ticket.