Why lambdas seem broken in multithreaded environments (or how closures in C# works).

I’ve been playing around with some code for.NET 3.5 to enable us to split big operations into small parallel tasks. During this work I was reminded why Resharper has the “Access to modified closure” warning. This warning tells us about a inconsistency in handling the ”Immutable” loop variable created in a foreach loop when lambdas or anonymous delegates are involved.

Take this code:

public static void ForEach<T>(IEnumerable<T> list)
{
    foreach(var item in list)
    {
        Action action = () => Console.WriteLine(item);
        action.Invoke();
    }
}

This will yield the expected result. When we call WriteLine the current item will be displayed:

image

A naïve and not so interesting example, but creates a baseline. Now let’s look at the real code where the reminder was at, a mimic of .NET 4’s parallel foreach:

public static void ForEach<T>(IEnumerable<T> list, Action<T> task)
{
    var waitHandles = new ManualResetEvent[list.Count()];

    var index = 0;
    foreach(var item in list)
    {
        var handleForCurrentTask = new ManualResetEvent(false);
        waitHandles[index] = handleForCurrentTask;
        ThreadPool.QueueUserWorkItem(delegate
         {
             task(item);
             handleForCurrentTask.Set();
         });

        index++;
    }

    WaitHandle.WaitAll(waitHandles);
}

Calling this with the following line:

ForEach(items, Console.WriteLine);

Will give a more unexpected result (if you don’t know what’s going on that is).

image

So what is going on, or how closures are implemented in C#.

To understand the issue we must examine how the C# compiler handles anonymous delegates and lambdas (called lambda from now on). Since they are really a compiler trick and not a real feature of the Common Language Runtime.

For every lambda we define; the compiler will generate a new method (there are some nice tricks to avoid duplication but for the sake of this post let’s go with this simple definition). As illustrated in this screenshot from Reflector:

image

As you can see this is encapsulated inside an auto generated class. That class also contains a couple of instances variables that we’ll discuss.

We can see that the method encapsulated the two lines of code we had inside our lambda and has a dependency on “task” and “item” from instance variables that are also auto generated types.

To be able to execute the Foreach method in the auto generated class, we’ll need to initialize the instance variables of the lambda class. The compiler neatly picks this up and creates a couple lines of code in the beginning of the method to encapsulate Task:

public static void ForEach<T>(IEnumerable<T> list, Action<T> task)
{
    <>c__DisplayClass1<T> CS$<>8__locals2 = new <>c__DisplayClass1<T>();
    CS$<>8__locals2.task = task;

 

A bit further down in the method we can see how the the item variable is encapsulated inside another auto generated instance variable.

  <>c__DisplayClass3<T> CS$<>8__locals4 = new <>c__DisplayClass3<T>();
        while (CS$5$0000.MoveNext())
        {
            CS$<>8__locals4.item = CS$5$0000.Current;

Notice how the creation of the object that encapsulates the item is declared outside of the iteration and the item is then passed into the object inside of the iteration. Here is the heart of the problem. In the interest of optimization, the compiler encapsulates parameters and variables as few times as possible. It seems like the optimizer do not recognize the immutable nature of the loop variable, nor the perceived local scope of it.

Further down this is passed into the ThreadPool as a delegate:

<>c__DisplayClass5<T> CS$<>8__locals6 = new <>c__DisplayClass5<T>();
CS$<>8__locals6.CS$<>8__locals4 = CS$<>8__locals4;
CS$<>8__locals6.CS$<>8__locals2 = CS$<>8__locals2;
CS$<>8__locals6.handleForCurrentTask = new ManualResetEvent(false);
waitHandles[index] = CS$<>8__locals6.handleForCurrentTask;
ThreadPool.QueueUserWorkItem(new WaitCallback(CS$<>8__locals6.<ForEach>b__0));

So while the delegate that is passed to the thread pool is instantiated for each iteration, the variables and parameters used by the closure isn’t.

In a scenario where the lambda is executed inside the loop this will not be a problem, the next time the item variable is assigned, the previous execution will already be done. In a multithreaded, or a deferred execution of the lambda the story is quite different.

The assignment of the Item variable is far from safe here. Since the reference to the item encapsulation will be shared between all of the instances of the delegate passed to the ThreadPool. Thus they will share the last (or the last at the time of execution) assignment.

To solve this we need to tell the compiler that the Item variable is locally scoped to the loop and not a method-wide variable.

A simple change to:

foreach(var item in list)
{
    var handleForCurrentTask = new ManualResetEvent(false);
    waitHandles[index] = handleForCurrentTask;
    var currentItem = item;
    ThreadPool.QueueUserWorkItem(delegate
     {
         task(currentItem);
         handleForCurrentTask.Set();
     });

Will be enough for the compiler do do the right thing. It will now move Item inside the closure class for the lambda and not create a separate class:

 

image

Since this class had an instance created for each iteration, we will now have a separate copy of the item value in each delegate passed to the ThreadPool:

<>c__DisplayClass3<T> CS$<>8__locals4 = new <>c__DisplayClass3<T>();
CS$<>8__locals4.CS$<>8__locals2 = CS$<>8__locals2;
CS$<>8__locals4.handleForCurrentTask = new ManualResetEvent(false);
waitHandles[index] = CS$<>8__locals4.handleForCurrentTask;
CS$<>8__locals4.currentItem = item;
ThreadPool.QueueUserWorkItem(new WaitCallback(CS$<>8__locals4.<ForEach>b__0)); 

And we’ll get the expected result again:

image

In conclusion

Most people will not run into this problem, but it is an interesting one never the less. I would argue that for deferred executed lambdas, the semantics of foreach is not the same as it is for the rest of the language. I’m sure the C# team has a really good reason for ignoring that Item really should be handled as a locally scoped variable, but my brain isn’t big enough to figure that out at the moment.

Maybe some of you know?

3 thoughts on “Why lambdas seem broken in multithreaded environments (or how closures in C# works).

  1. Great article. After reading it I though it would be interesting to test this with the new Parallell functions in .net 4 and it gives a correct result regarding the value of the currentitem.

    Parallel.ForEach(list, Console.WriteLine);

    0
    1
    3
    5
    7
    9
    6
    2
    8
    4

  2. Pingback: Tweets that mention Why lambdas seem broken in multithreaded environments (or how closures in C# works). | Lowendahl's Shout -- Topsy.com

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>