Why lambdas seem broken in multithreaded environments (or how closures in C# works).

I’ve been playing around with some code for.NET 3.5 to enable us to split big operations into small parallel tasks. During this work I was reminded why Resharper has the “Access to modified closure” warning. This warning tells us about a inconsistency in handling the ”Immutable” loop variable created in a foreach loop when lambdas or anonymous delegates are involved.

Take this code:

public static void ForEach<T>(IEnumerable<T> list)
{
    foreach(var item in list)
    {
        Action action = () => Console.WriteLine(item);
        action.Invoke();
    }
}

This will yield the expected result. When we call WriteLine the current item will be displayed:

image

A naïve and not so interesting example, but creates a baseline. Now let’s look at the real code where the reminder was at, a mimic of .NET 4’s parallel foreach:

public static void ForEach<T>(IEnumerable<T> list, Action<T> task)
{
    var waitHandles = new ManualResetEvent[list.Count()];

    var index = 0;
    foreach(var item in list)
    {
        var handleForCurrentTask = new ManualResetEvent(false);
        waitHandles[index] = handleForCurrentTask;
        ThreadPool.QueueUserWorkItem(delegate
         {
             task(item);
             handleForCurrentTask.Set();
         });

        index++;
    }

    WaitHandle.WaitAll(waitHandles);
}

Calling this with the following line:

ForEach(items, Console.WriteLine);

Will give a more unexpected result (if you don’t know what’s going on that is).

image

So what is going on, or how closures are implemented in C#.

To understand the issue we must examine how the C# compiler handles anonymous delegates and lambdas (called lambda from now on). Since they are really a compiler trick and not a real feature of the Common Language Runtime.

For every lambda we define; the compiler will generate a new method (there are some nice tricks to avoid duplication but for the sake of this post let’s go with this simple definition). As illustrated in this screenshot from Reflector:

image

As you can see this is encapsulated inside an auto generated class. That class also contains a couple of instances variables that we’ll discuss.

We can see that the method encapsulated the two lines of code we had inside our lambda and has a dependency on “task” and “item” from instance variables that are also auto generated types.

To be able to execute the Foreach method in the auto generated class, we’ll need to initialize the instance variables of the lambda class. The compiler neatly picks this up and creates a couple lines of code in the beginning of the method to encapsulate Task:

public static void ForEach<T>(IEnumerable<T> list, Action<T> task)
{
    <>c__DisplayClass1<T> CS$<>8__locals2 = new <>c__DisplayClass1<T>();
    CS$<>8__locals2.task = task;

 

A bit further down in the method we can see how the the item variable is encapsulated inside another auto generated instance variable.

  <>c__DisplayClass3<T> CS$<>8__locals4 = new <>c__DisplayClass3<T>();
        while (CS$5$0000.MoveNext())
        {
            CS$<>8__locals4.item = CS$5$0000.Current;

Notice how the creation of the object that encapsulates the item is declared outside of the iteration and the item is then passed into the object inside of the iteration. Here is the heart of the problem. In the interest of optimization, the compiler encapsulates parameters and variables as few times as possible. It seems like the optimizer do not recognize the immutable nature of the loop variable, nor the perceived local scope of it.

Further down this is passed into the ThreadPool as a delegate:

<>c__DisplayClass5<T> CS$<>8__locals6 = new <>c__DisplayClass5<T>();
CS$<>8__locals6.CS$<>8__locals4 = CS$<>8__locals4;
CS$<>8__locals6.CS$<>8__locals2 = CS$<>8__locals2;
CS$<>8__locals6.handleForCurrentTask = new ManualResetEvent(false);
waitHandles[index] = CS$<>8__locals6.handleForCurrentTask;
ThreadPool.QueueUserWorkItem(new WaitCallback(CS$<>8__locals6.<ForEach>b__0));

So while the delegate that is passed to the thread pool is instantiated for each iteration, the variables and parameters used by the closure isn’t.

In a scenario where the lambda is executed inside the loop this will not be a problem, the next time the item variable is assigned, the previous execution will already be done. In a multithreaded, or a deferred execution of the lambda the story is quite different.

The assignment of the Item variable is far from safe here. Since the reference to the item encapsulation will be shared between all of the instances of the delegate passed to the ThreadPool. Thus they will share the last (or the last at the time of execution) assignment.

To solve this we need to tell the compiler that the Item variable is locally scoped to the loop and not a method-wide variable.

A simple change to:

foreach(var item in list)
{
    var handleForCurrentTask = new ManualResetEvent(false);
    waitHandles[index] = handleForCurrentTask;
    var currentItem = item;
    ThreadPool.QueueUserWorkItem(delegate
     {
         task(currentItem);
         handleForCurrentTask.Set();
     });

Will be enough for the compiler do do the right thing. It will now move Item inside the closure class for the lambda and not create a separate class:

 

image

Since this class had an instance created for each iteration, we will now have a separate copy of the item value in each delegate passed to the ThreadPool:

<>c__DisplayClass3<T> CS$<>8__locals4 = new <>c__DisplayClass3<T>();
CS$<>8__locals4.CS$<>8__locals2 = CS$<>8__locals2;
CS$<>8__locals4.handleForCurrentTask = new ManualResetEvent(false);
waitHandles[index] = CS$<>8__locals4.handleForCurrentTask;
CS$<>8__locals4.currentItem = item;
ThreadPool.QueueUserWorkItem(new WaitCallback(CS$<>8__locals4.<ForEach>b__0)); 

And we’ll get the expected result again:

image

In conclusion

Most people will not run into this problem, but it is an interesting one never the less. I would argue that for deferred executed lambdas, the semantics of foreach is not the same as it is for the rest of the language. I’m sure the C# team has a really good reason for ignoring that Item really should be handled as a locally scoped variable, but my brain isn’t big enough to figure that out at the moment.

Maybe some of you know?

Executing lambdas on transaction commit – or Transactional delegates.

One great thing with the TransactionScope class is that it makes it possible to include multiple levels in your object hierarchy.  Everything that is done from the point that the scope was created until it is disposed will be in the exact same transaction. This makes it easy to control business transactions from high up in the call stack and allow for everything to participate.

Everything but calls to delegates, until now. I checked in code in our project today to allow for this:

   1: using(var scope = new TransactionScope)

   2: {

   3:     repository.Save(order);

   4:     bus.SendMessage(new OrderStatusChanged {OrderId = order.Id, Status = order.Status});

   5:     scope.Complete();

   6: }

   7:  

   8: // In some message listener somwhere that listens to the Message sent

   9: // But still in the transaction scope.

  10:  

  11: public void HandleMessage(OrderStatusChanged message)

  12: {

  13:     new OnTransactionComplete

  14:     (

  15:         () => publisher.NotifyClientsStatusChange(message.OrderId, message.NewStatus);

  16:     ) 

  17: }

  18:  

  19:  

 

Why is this interesting?

Why would anyone want a delegate to be called on commit and only then? The scenarios where I find this and other similar scenarios (the code allows for OnTransactionRollback and AfterTransactionComplete as well) is where you have events or message sent to other subsystems asking them react. It can be more then one that listens and often you might not want to carry out the action if the transaction started high above in the business layer isn’t commited (like telling all clients that the status has changed and then the transaction rolls back).

How can I implement this?

The code to achieve this was fairly simple. Transactions in System.Transaction has a EnlistVoilatile method where you can send in anything that implements IEnlistmentNotification. This handy little interface defines four methods:

   1:  

   2: public interface IEnlistmentNotification

   3: {

   4:     void Prepare(PreparingEnlistment preparingEnlistment);

   5:     void Commit(Enlistment enlistment);

   6:     void Rollback(Enlistment enlistment);

   7:     void InDoubt(Enlistment enlistment);

   8: }

Any object enlisting in a transaction with this interface will be told what’s happening and can participate in the voting. This particular implementation don’t vote it just reacts on Commit or Rollbacks.

The implementation spins around the class TransactionalAction that in it’s constructor accepts an Action delegate as a target and then enlists the object in the current transaction:

 

   1: public TransactionalAction(Action action)

   2: {

   3:      Action = action;

   4:      Enlist();

   5: }

   6:  

   7: public void Enlist()

   8: {

   9:      var currentTransaction = Transaction.Current;

  10:      currentTransaction.EnlistVolatile(this, EnlistmentOptions.None);

  11: }

Enlisting the object into the transaction will effectively add it to the transactions graph. As long as the transaction scope have a root reference, the object will stay alive. That is why this works by only stating new OnTransactionComplete( () => ) and you don’t have to save the reference.

The specific OnTransactionComplete, OnTransactionRollback  and WhenTransactionInDoubt then inherits from TransactionalAction and overrides the appropriate method (the base commit ensures that we behave well in a transactional scenario):

   1: public class OnTransactionCommit : TransactionalAction

   2: {

   3:     public OnTransactionCommit(Action action)

   4:         : base(action)

   5:     { }

   6:  

   7:     public override void Commit(System.Transactions.Enlistment enlistment)

   8:     {

   9:         Action.Invoke();

  10:         base.Commit(enlistment);

  11:     }

  12: }

Hey, what about AfterTransactionCommit?

The IEnlistmentNotification isn’t an interceptor model. It just allows you to participate in the transactional work. To be able to make After calls we need to use another mechanism. The TransactionCompleted event. As the previous code, this is fairly simple;

   1: public class AfterTransactionComplete

   2: {

   3:     private Action action;

   4:  

   5:     public AfterTransactionComplete(Action action)

   6:     {

   7:         this.action = action;

   8:         Enlist();

   9:     }

  10:  

  11:     private void Enlist()

  12:     {

  13:         var currentTransaction = Transaction.Current;

  14:  

  15:         if(NoTransactionInScope(currentTransaction))

  16:             throw new InvalidOperationException("No active transaction in scope");

  17:  

  18:         currentTransaction.TransactionCompleted += TransactionCompleted;

  19:     }

  20:  

  21:     private void TransactionCompleted(object sender, TransactionEventArgs e)

  22:     {

  23:         if(TransactionHasBeenCommited(e))

  24:             action.Invoke();

  25:     }

  26: }

Summary

With some simple tricks using Transaction.Current and lambdas we can now participate in our transactions with our delegates. Download the code to get to do this:

   1: new OnTransactionComplete

   2: (

   3:   () => DoStuff()

   4: )

   5:  

   6: new OnTransactionRollback

   7: (

   8:     () => DoStuff()

   9: )

  10:  

  11: new AfterTransactionCommit

  12: (

  13:     () => DoStuff();

  14: )

  15:  

  16: // Also includes a more "Standard" api:

  17: Call.OnCommit( () => DoStuff );

  18:  

  19: // Since Call has extension methods, this is also valid:

  20: ((Action)(()=> firstActionWasCalled=true)).OnTransactionCommit();

I hope this helps you build more robust solutions.

Code download (project is VS2010 but it works on .NET 2.0) [306kb]

More information on IEnlistmentNotification

More information on EnlistVolatile

Pattern focus: Decorator pattern

The decorator pattern is a pretty straightforward pattern that utilizes wrapping to extend existing classes with new functionality. It’s commonly used to apply cross-cutting concerns on top of business classes, which avoids mixing that up with the business logic. Frameworks like Spring.Net uses this pattern to apply it’s functionality to your classes.

Implementing decorator pattern

The pattern is pretty simple to implement; extract public methods from the class to wrap to an interface and implement that on the decorator. Let the decorator receive an object of the original type and delegate all calls to that object. Let’s look at an example that uses caching (inspired from a discussion with Anders Bratland today on the Swenug MSN group).

The UserService we’ll extend is pretty straight forward; 

   1: public interface IUserService

   2: {

   3:     User GetUserById(int id);

   4: }

   5: 

   6: public class UserService : IUserService

   7: {

   8:     public User GetUserById(int id)

   9:     {

  10:         return new User { Id = id };

  11:     }

  12: }

The idea is that we want to cache the user and avoid mixing that logic with the code to retrieve the user. For this we create the decorator wrapper;

   1: public class CachedUserService : IUserService

   2: {

   3:     private readonly IUserService _userService;

   4: 

   5:     public CachedUserService(IUserService service)

   6:     {

   7:         _userService = service;

   8:     }

   9: }

The wrapper implements the same interface as the wrapped class and takes an instance of the wrapped object (in this case the interface to ensure we can add more then one decorator) and store it as an instance variable for later use.

To add our caching, we’ll implement the GetUserById method to first check our cache if the user is there and if not just delegate the call to the wrapped object;

   1: private static readonly Dictionary<int, User> UserCache = new Dictionary<int, User>();

   2: 

   3: public User GetUserById(int id)

   4: {

   5:     if (!UserCache.ContainsKey(id))

   6:         UserCache[id] = _userService.GetUserById(id);

   7: 

   8:     return UserCache[id];

   9: }

Anyone consuming this service will never have to worry about the caching we just applied, additionally our original code never have to know that it might be cached and each concern can be maintained separately. A win-win on both parts, concerns successfully separated.

 

Using decorators

Since we need to compose the objects to get all the decorators we want wrapping our object. It’s usually advisable to hide the creation of the object inside a factory. A typical usage could be;

   1: public static class ServiceFactory

   2: {

   3:     public static IUserService Create()

   4:     {

   5:         var baseObject = new UserService();

   6:

   7:         return new AuditableUserService(new CachedUserService(baseObject))

   8:     }

   9: }

  10: .

  11: .

  12: .

  13: private void SetupContext()

  14: {

  15:     _service = ServiceFactory.Create();

  16: }

Using this approach we are free to add and subtract as many decorators to the original object as we wish and the consumers never have to know.

Usage scenarios for the decorator pattern

There is several usages for this pattern for a lot of different scenarios not only adding caching to a service. Let’s examine the categories and what they can be used for.

Applying cross cutting concerns

We’ve just seen how we can apply a cross cutting concerns, like caching, to an existing class. The factory hints about Auditing and there is several other you could imagine being interesting. One could be the notorious INotifyPropertyChanged interface that’s needed for data binding. Wrapping the bound object into a decorator that automatically calls the event will keep the domain objects free from infrastructure concerns. Something like;

   1: public class NotifyingUser : INotifyPropertyChanged, IUser

   2: {

   3:     private IUser _user;

   4:     public NotifyingUser(IUser user)

   5:     {

   6:         _user = user;

   7:     }

   8: 

   9:     public int Id

  10:     {

  11:         get { return _user.Id; }

  12:         set

  13:         {

  14:             _user.Id = value;

  15:             RaisePropertyChanged("Id");

  16:         }

  17:     }

This will keep the user entity clean from data binding concerns and we’ll be able to provide the functionality. Win-Win.

Creating proxies

A lot of frameworks use decorators to create runtime proxies that applies the framework functionality. Spring.Net is an example of that, it uses decorators to apply Mixins and Advices.

Strangle wine refactoring

Michael Feathers talks a lot about how to handle legacy code. One of the concepts I’ve heard him talk about is the “Strangle wine”. A plant that starts out small but if you let it grow it will soon take over and strangle everything else. Refactoring using strangle wine strategies is about replacing legacy code with new code. A variant of decorator pattern can be very useful to accomplish that. In a recent project we refactored from Linq To Sql to NHibernate, not all at once but step by step. To do this successfully we used the ideas behind the decorator (with a touch of the Adapter pattern thrown in there);

   1: public class OrderRepository : IOrderRepository

   2: {

   3:     private readonly IOrderRepository _ltsRepository;

   4:     private readonly IOrderRepository _nhRepository;

   5: 

   6:     public OrderRepository(IOrderRepository ltsRepository, IOrderRepository nhRepository)

   7:     {

   8:         _ltsRepository = ltsRepository;

   9:         _nhRepository = nhRepository;

  10:     }

  11: 

  12:     public Order GetOrderById(int id)

  13:     {

  14:         var ltsObject = _ltsRepository.GetOrderById(id);

  15:

  16:         return Translator.ToEntityFromLts(ltsObject);

  17:     }

  18: 

  19:     public IList<Order> ListOrders()

  20:     {

  21:         return _nhRepository.ListOrders();

  22:     }

  23: }

This allowed us to move over from Linq to Sql over to NHibernate method by method. Slowly strangling the old repository to death so it could be cut out.

Summary

Decorator pattern is a very versatile pattern with a lot of usage scenarios. A great tool to have in your toolbox. Either as a stand alone pattern or, as we’ve seen in some of the examples, in cooperation with others. It is simple but powerful and I hope you’ll find usages for it in your applications.

Creating a dynamic state machine with C# and NHibernate

In my last post (An architecture dilemma: Workflow Foundation or a hand rolled state machine?) I talked about the discussion around an architectural choice. The conclusion of that discussion was to hand-roll a dynamic state machine. This post is the first part of 3 explaining the solution we used. In this part we’ll focus on the state machine, in the following two parts we’ll be adding business rules to each state and utilizing a couple of tricks from NHibernate to make those rules rich.

If you are uncertain what the state machine pattern looks like, there is some information here: http://msdn.microsoft.com/en-us/magazine/cc301852.aspx . For the rest of this post I will assume that you got a basic understanding of it.

The requirements for our state machine was that users should be able to add their own states and tight them into the state machine. They should also be able to define the flow, designing the structure of which states can transition to each other.

The basic state machine looks something like this:

Classic State Machine

Since we need to be more dynamic then this our model turned out more like this instead:

image

In this model we are using composition instead of inheritance to build our state machine. The list “AllowedTransitions” contains a list of states that is allowed to transition to from the current one.

The method “CanBeChangedInto” takes a state object and compares it to the list of states and decides if the transition is allowed. For our scenario this comparison is done by implementing IEquatable and overriding the appropriate operators (==, !=).

This is defined on a template kind of entity, there will be “instances” made out of this template where all attributes are copied onto the instance.

The implementation of the ChangeStateTo method on the template is fairly simple:

Template:
public void ChangeStateTo(State newState)
{
    if (State.CanBeChangedInTo(newState))
        State = newState;
    else
        throw new InvalidStateTransitionException();
}

State:
public bool CanBeChangedInTo(State state)
{
    return AllowedTransitions.Contains(state);
}

Simple but yet very powerful and allows for the kind of dynamic state transitions our scenario requires.

Adding state transitions is easy as well, since the AllowedTransitions list holds

Our project uses NHibernate as a persistence engine and to persist our dynamic state we use a many to many bag mapped in xml:

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" assembly="" namespace="">
  <class name="State" table="States">
    <id name="Id" type="int">
      <generator class="native" />
    </id>

    <property name="Name" length="255" not-null="true" />

    <bag name="AllowedTransitions" table="AllowedStateTransitions">
      <key column="FromState_Id" />
      <many-to-many column="ToState_Id"  class="State" />
    </bag>
  </class>
</hibernate-mapping>

Between NHibernate many-to-many mapping and the design of the state transition list it’s easy to start building an application upon this. We are not done yet though. Every state needs more transition rules then just “AllowedTransitions”. The next post will tackle that requirement.