Why do we need 'seq' or 'pseq' with 'par' in Haskell?

https://stackoverflow.com/questions/4576734

14-10-2019

Question

I'm trying to understand why we need all parts of the standard sample code:

a `par` b `pseq` a+b

Why won't the following be sufficient?

a `par` b `par` a+b

The above expression seems very descriptive: Try to evaluate both a and b in parallel, and return the result a+b. Is the reason only that of efficiency: the second version would spark off twice instead of once?

How about the following, more succinct version?

a `par` a+b

Why would we need to make sure b is evaluated before a+b as in the original, standard code?

Solution

Ok. I think the following paper answers my question: http://community.haskell.org/~simonmar/papers/threadscope.pdf

In summary, the problem with

a `par` b `par` a+b

and

a `par` a+b

is the lack of ordering of evaluation. In both versions, the main thread gets to work on a (or sometimes b) immediately, causing the sparks to "fizzle" away immediately since there is no more need to start a thread to evaluate what the main thread has already started evaluating.

The original version

a `par` b `pseq` a+b

ensures that the main thread works on b before a+b (otherwise it would have started evaluating a instead), thus giving the spark for a a chance to materialize into a thread for parallel evaluation.
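To make the standard idiom concrete, here is a minimal sketch. The names fib and parSum are hypothetical and exist only for illustration. par and pseq are imported from GHC.Conc in base so that the sketch compiles without extra packages; Control.Parallel from the parallel package exports functions with the same names. The parentheses spell out how the expression groups, since both operators are right-associative with the lowest precedence.

```haskell
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used here only as a deliberately expensive computation.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- The standard idiom: create a spark for a, evaluate b in the current
-- thread, and only then combine the two results.
parSum :: Int -> Int -> Integer
parSum m n = a `par` (b `pseq` (a + b))
  where
    a = fib m  -- sparked: an idle capability may pick this up
    b = fib n  -- evaluated by the current thread before a + b
```

For example, parSum 30 31 returns the same value as fib 30 + fib 31. Whether the spark for a actually runs on another core depends on building with -threaded and running with more than one capability (for instance +RTS -N).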

Other tips

a `par` b `par` a+b

will evaluate a and b in parallel and return a+b, yes.

However, the pseq in the original version ensures that b is evaluated before a+b is, while a has only been sparked.

See this link for more details on that topic.

a `par` b `par` a+b creates sparks for both a and b, but a+b is reached immediately so one of the sparks will fizzle (i.e., it is evaluated in the main thread). The problem with this is efficiency, as we created an unnecessary spark. If you're using this to implement parallel divide & conquer then the overhead will limit your speedup.

a `par` a+b seems better because it only creates a single spark. However, attempting to evaluate a before b will fizzle the spark for a, and as b does not have a spark this will result in sequential evaluation of a+b. Switching the order to b+a would solve this problem, but nothing in the code enforces that ordering, and Haskell could still evaluate it as a+b.

So, we do a `par` b `pseq` a+b to force evaluation of b in the main thread before we attempt to evaluate a+b. This gives the spark for a a chance to materialise before we try evaluating a+b, and we haven't created any unnecessary sparks.
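The three variants discussed above can be put side by side as a rough sketch. All names here (work, twoSparks, oneSpark, sparkThenSeq) are hypothetical, and work is just a stand-in for any expensive computation. All three functions return the same value; they differ only in how many sparks they create and in whether the order of evaluation is pinned down.

```haskell
import GHC.Conc (par, pseq)

-- Stand-in for an expensive computation (hypothetical).
work :: Int -> Int
work n = sum [1 .. n]

-- a `par` b `par` a+b: two sparks, no ordering. The main thread goes
-- straight to a + b, so at least one of the sparks is likely to fizzle.
twoSparks :: Int -> Int -> Int
twoSparks x y = a `par` (b `par` (a + b))
  where
    a = work x
    b = work y

-- a `par` a+b: one spark, no ordering. If the main thread happens to
-- evaluate a first, the spark fizzles and everything runs sequentially.
oneSpark :: Int -> Int -> Int
oneSpark x y = a `par` (a + b)
  where
    a = work x
    b = work y

-- a `par` b `pseq` a+b: one spark, and b is forced in the main thread
-- before a + b is attempted, giving the spark for a time to be picked up.
sparkThenSeq :: Int -> Int -> Int
sparkThenSeq x y = a `par` (b `pseq` (a + b))
  where
    a = work x
    b = work y
```

One way to observe the difference, assuming a GHC build with -threaded, is to run the program with +RTS -N -s and compare the SPARKS line of the runtime statistics, which reports how many sparks were converted and how many fizzled.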

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow