What is a "spark" in Haskell


Sparks are not threads. forkIO introduces Haskell threads, which are multiplexed onto a smaller number of real OS threads. A spark is instead an entry in the work queue of a thread; threads take tasks from these queues and execute them when they would otherwise be idle.

As a result, sparks are very cheap: you might have billions of them in a program, while you probably won't have more than a million Haskell threads, and fewer than a dozen OS threads on half a dozen cores.
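To make the contrast concrete, here is a minimal sketch that computes a value once via a forkIO thread and once via a spark. The names expensive, viaThread and viaSpark are made up for this illustration. par and pseq are imported from GHC.Conc in base so the example needs no extra package; Control.Parallel from the parallel package is the more usual import.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import GHC.Conc (par, pseq)

-- Hypothetical stand-in for some costly pure computation.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- A Haskell thread: an IO action that always runs, and whose result
-- has to be communicated explicitly (here through an MVar).
viaThread :: Int -> IO Int
viaThread n = do
  box <- newEmptyMVar
  _ <- forkIO (putMVar box $! expensive n)
  takeMVar box

-- A spark: a hint that `a` may be evaluated in parallel. The code stays
-- pure, and if no thread picks the spark up, `a` is simply evaluated
-- by the current thread when (a + b) demands it.
viaSpark :: Int -> Int -> Int
viaSpark m n = a `par` (b `pseq` (a + b))
  where
    a = expensive m
    b = expensive n

main :: IO ()
main = do
  r <- viaThread 1000
  print r
  print (viaSpark 1000 2000)
```

Note that the forked thread needs an MVar to hand its result back, while the spark version has the same value with or without par. That is a large part of why sparks can be so much cheaper than threads.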

Think of it like this:

[Figure: spark model]


See A Gentle Introduction to Glasgow Parallel Haskell.

Parallelism is introduced in GPH by the par combinator, which takes two arguments that are to be evaluated in parallel. The expression p `par` e (here we use Haskell's infix operator notation) has the same value as e, and is not strict in its first argument, i.e. bottom `par` e has the value of e. (bottom denotes a non-terminating or failing computation.) Its dynamic behaviour is to indicate that p could be evaluated by a new parallel thread, with the parent thread continuing evaluation of e. We say that p has been sparked, and a thread may subsequently be created to evaluate it if a processor becomes idle. Since the thread is not necessarily created, p is similar to a lazy future.

[Emphasis in original]
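The combinator described in the quote can be tried out with a small sketch like the one below. fib and parSum are hypothetical names chosen for this example, and pseq is used to make the parent evaluate y before the final sum, which is the usual idiom but is not mentioned in the quoted passage. par and pseq come from GHC.Conc here so that only base is required.

```haskell
import GHC.Conc (par, pseq)

-- Deliberately naive Fibonacci, used only as something to evaluate.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- x `par` e sparks x and has the same value as e.
-- The parent evaluates y itself (pseq), then adds the two results.
-- If the spark for x was never run, x is evaluated here as usual.
parSum :: Int -> Int -> Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parSum 20 21)
```

To give sparks a chance of actually running in parallel, the program would typically be compiled with -threaded and run with +RTS -N. Without those flags the result is identical; the sparks are just never converted into parallel work.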


If I understand it correctly, a spark is an entry in a queue of jobs requiring work. A pool of threads takes entries from this queue and runs them. Typically there is one thread per physical processor, so this scheme maximises throughput and minimises thread context switching.
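One rough way to peek at this queue is sketched below, assuming GHC: getNumCapabilities reports how many worker capabilities the runtime is using, and numSparks (from GHC.Conc) reports how many sparks are currently waiting in the local pool. The jobs list is invented for the illustration, and the printed numbers depend on the runtime flags and on timing, so no particular output should be expected.

```haskell
import Control.Concurrent (getNumCapabilities)
import Control.Exception (evaluate)
import GHC.Conc (numSparks, par)

main :: IO ()
main = do
  caps <- getNumCapabilities
  -- Hypothetical batch of independent pure jobs.
  let jobs = [sum [1 .. n] | n <- [1 .. 100 :: Int]]
  -- Spark every job: j1 `par` (j2 `par` (... `par` ())).
  -- Nothing is forced here; each job is only offered to the pool.
  evaluate (foldr par () jobs)
  pending <- numSparks
  putStrLn ("capabilities:       " ++ show caps)
  putStrLn ("sparks still queued: " ++ show pending)
  -- The results are the same whether or not any spark was picked up.
  print (sum jobs)
```

Built with -threaded and run with +RTS -N, the capability count normally matches the number of cores; in a single-threaded build the sparks are not run in parallel at all and the jobs are evaluated by the final sum.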