質問

I've written a parallel algorithm in C# to partition an array into two lists, one that contains elements which satisfies a given predicate and the other list contains the elements that fails to satisfy the predicate. It is an order preserving algorithm.

I have written it as follows, but I want to know how to maximize the opportunity to profit from hardware concurrency.

    static void TestPLinqPartition(int cnt = 1000000)
    {
        Console.WriteLine("PLINQ Partition");
        var a = RandomSequenceOfValuesLessThan100(cnt).ToArray();
        var sw = new Stopwatch();
        sw.Start();
        var ap = a.AsParallel();
        List<int> partA = null;
        List<int> partB = null;
        Action actionA = () => { partA = (from x in ap where x < 25 select x).ToList(); };
        Action actionB = () => { partB = (from x in ap where !(x < 25) select x).ToList(); };
        Parallel.Invoke(actionA, actionB);
        sw.Stop();

        Console.WriteLine("Partion sizes = {0} and {1}", partA.Count, partB.Count);
        Console.WriteLine("Time elapsed = {0} msec", sw.ElapsedMilliseconds);
    }
役に立ちましたか?

解決

I'd partition the data into small segments (e.g. using the Partitioner class), and assign an index to each partition in relation to its position. For each numbered partition I'd create a Task that splits the partition into two groups, one that matches the predicate and one that doesn't and return the two groups, together with the index of the partition that they originated from, as the Task's return value. I'd then wait for all the tasks to complete, and then .Concat() (to prevent wasting time on actually merging all the data) the matching groups according to their index, and the same for the non-matching groups. You should be able to achieve an arbitrary degree of parallelism this way, while preserving relative item order.

他のヒント

If your lists are very long you will not get much parallelism out of it (2x). Instead, I'd recommend using a Parallel.For and use a thread-local Tuple<List<int>, List<int>> as the parallel loop state. The Parallel.For API allows you to do this easily. You can merge the individual sublists at the end.

This version is embarrassingly parallel and causes almost no coherency traffic on the CPU-bus because there is no synchronization.

Edit: I want to emphasize that you cannot just use two List's shared by all threads because that is going to cause synchronization overhead like crazy. You need to use thread-local lists. Not even a ConcurrentQueue is suitable for this scenario because it uses Interlocked operations which cause CPU coherency traffic which is limited.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top