NDC2010: Hadi Hariri – CouchDB For .NET Developers

Hadi speaks in one of the scariest rooms ever. It's built on top of the sports arena, way, way up, with a ridiculously steep angle down to the stage. Butterflies in my stomach.

Prelude
Specifications today are written in the language of the client, but most of the data modeling is done in a relational model. Hadi talks about foreign keys and the idea that an invoice has to have a customer because the relational model requires a foreign key and a bunch of other things. This makes some changes inflexible, like a client wanting to invoice an anonymous customer.

We have an impedance mismatch between how we look at the data and how the client looks at the data. When the client talks about invoices they see the actual invoice; when we see invoices we see the relational model.

In Domain Driven Design, et al, we model the client's perspective using objects and then use an ORM to map between the business perspective and the relational model. In Hadi's opinion this brings a heap of other problems into the flexibility of responding to change and implementing features.

Hadi's session will show how to work with document databases using CouchDB.

The Tool
CouchDB is a document database that stores complete documents in JSON. It is ACID compliant for each document and uses REST to access the stored documents. The initial engine was built in C++ but was ported to Erlang because of its strength in concurrent operations, which is what accessing and writing data is really about. It works on a lot of platforms: Windows, Linux, Android, Maemo and Browser Couch (HTML 5 storage), and CouchDB is able to synchronize data between any number of instances on any type of platform.

CouchDB doesn't have a schema, well, there is a small schema with an ID (_id) and a revision (_rev), which allows you to store documents in any format you want and change that format during the document's lifetime.

Since CouchDB is built as a REST-based database, everything is done through HTTP (GET, POST, PUT and DELETE) and JSON, which makes it very easy to access, either by running your own HTTP requests or by using the browser-based admin app Futon.
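To give a feel for how little ceremony is involved, here is a minimal sketch using WebClient; the port is CouchDB's default, while the database name, document id and fields are made up for illustration:

using System;
using System.Net;

var client = new WebClient();

// GET returns the raw JSON document, something like:
// { "_id": "customer-42", "_rev": "1-946b7d1c", "name": "Acme", "email": "info@acme.example" }
string json = client.DownloadString("http://localhost:5984/crm/customer-42");
Console.WriteLine(json);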

Writing and versions
CouchDB uses an insert-only approach, so each write to a document creates a new version of that document. This means you can track changes through the document's lifetime, but it also means that there are no locks and data consistency follows the Eventual Consistency paradigm.

CouchDB enforces optimistic locking by requiring you to pass a revision number when updating.
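An update is just a PUT of the new document with the revision from the previous read included. A rough sketch, with the same made-up database and values as above; if the _rev is stale, CouchDB answers 409 Conflict and you re-read and retry:

using System.Net;

var client = new WebClient();
client.Headers[HttpRequestHeader.ContentType] = "application/json";

// The _rev must match the latest revision, otherwise the PUT fails with 409 Conflict.
string updated = @"{ ""_id"": ""customer-42"", ""_rev"": ""1-946b7d1c"",
                     ""name"": ""Acme"", ""email"": ""sales@acme.example"" }";
client.UploadString("http://localhost:5984/crm/customer-42", "PUT", updated);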

Since the nature of CouchDB is to be distributed, this means that you can have a lot of versions spread out in your ecosystem up until synchronization occurs. This makes it scalable but changes how we have to think about consistency.

When creating new documents, the engine wants you to specify an ID. Anything can be an ID, but the DB uses UUIDs by default and lets you generate them easily.

Queries
Since the REST API only allows you to query on IDs, you can't do this:

select * from customers where email =’’

This means that you have to use Map/Reduce. That is, you create a map (a view, stored as another document), which is a JavaScript function that executes against the documents and returns the data that you need.

This means that all queries have to be pre-created. There are no ad-hoc queries in CouchDB; you have to think about all of your queries in advance.
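As a hedged sketch of what that looks like in practice (database, view and field names are made up): the view lives in a design document, the map function is plain JavaScript run by CouchDB, and querying the pre-created view is again just a GET.

using System;
using System.Net;

var client = new WebClient();
client.Headers[HttpRequestHeader.ContentType] = "application/json";

// The view is stored as a document of its own (a "design document").
string designDoc = @"{ ""views"": {
                         ""by_email"": {
                           ""map"": ""function(doc) { if (doc.email) emit(doc.email, doc); }""
                         } } }";
client.UploadString("http://localhost:5984/crm/_design/customers", "PUT", designDoc);

// Querying the view (the key is JSON, hence the encoded quotes).
string result = client.DownloadString(
    "http://localhost:5984/crm/_design/customers/_view/by_email?key=%22info@acme.example%22");
Console.WriteLine(result);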

The perspective

NoSQL is a curious thing and CouchDB falls into that category. There are great advantages in using document databases if documents are all you want. It won't support things like reporting or BI.

CouchDB will probably find documents by ID really quickly, but I'm still skeptical about performance when it comes to more advanced queries, especially if you need queries based on cross references.

The architecture of CouchDB allows you to simply replicate and scale out instances across the internet, but I'd love to see some numbers on hardware vs performance. JavaScript and JSON are not quick, so maybe it needs more hardware to achieve the same performance as other options.

All in all, I’ll probably pick up the tech and play around with it a bit to find good scenarios where it is a perfect match.


NDC2010: Chris Sells on Data

First session ended: Chris Sells on data. He kicked in a lot of open doors and tried to sell the idea that reports of M and Oslo's death are exaggerated and that they will be the next big thing. It's over, let it go.

Chris' position on data
Chris started off by stating his position on data: it can be saved in many forms, graphs, trees or tables, but as it seems we as an industry more often than not revert back to tables, since they bring the most utility of the three for multiple purposes. Later in the talk he spoke about NoSQL and how a lot of these technologies solve interesting problems, often around scale, but warned the audience (as the engineers they are) not to assume that the new shiny toy comes without flaws or drawbacks. Every tool has its place in the ecosystem.

Data for everyone, really everyone
An interesting point he made, though, one that sends chills down my spine, is that the availability and accessibility of data is changing. It's not the change itself that gives me the creeps, it's how he and the team he works for envision that change. Chris drew a parallel with Excel and how good it was, how it allowed everyone to be a "programmer"; his vision was that with Microsoft's OData and things like Excel PowerPivot, everyone will be able to query data and put it in their programs. As if there isn't enough Excel mess to clean up in the world already?! But hey, at least it's consultant friendly.

Perspective
Chris concluded his talk by saying that how we think about data changes, how we expose and get exposed to data changes, and that no matter what we do, data will be what's important (he also said that behavior was "the 90's", meh?). I'd agree that data is important, how we store data is important and how we access data is important. But data isn't just there to be entered, read or to draw diagrams from. There is a huge portion of data that's used to support and simplify business processes. Excel doesn't help with that, neither does OData (and certainly not M). So even with all these new shiny toys Microsoft will be putting out, we'll still build our software as we used to. Just with more options.


NDC 2010 – Day 0

Today I arrived in Oslo, Norway with a colleague for the Norwegian Developer Conference. This is my first visit and the agenda looks very good. I'd hoped for a little more diversity and some more local Norwegian heroes, but I'll chat with them during the breaks instead (that's where the interesting stuff happens, isn't it?). Day 0 was a nice meet-up with NNUG and some of the speakers for a casual beer; tomorrow the cramming starts.

I'll do some summarizing here over the next couple of days, so watch this space.

Sessions I'm planning to see and summarize include:

  • Data for developers, Chris Sells
  • CouchDB for .NET Developers
  • Strategic Design, Eric Evans
  • What I learned about DDD since I wrote the book, Eric Evans
  • Unleash Your Domain, Greg Young
  • Domain Driven Entity Framework, Julie Lerman
  • 5 reasons why projects using DDD fail, Greg Young

… and lots I can't decide between.

If you are here, poke me and we’ll have a beer.


Common Service Host for Windows Communication Foundation

After having to write new instance providers, host factories and service behaviors for almost every new WCF project, I decided to write a simple reusable component for WCF and dependency injection and put it on CodePlex so that I never have to write that again.

The idea is simple: when creating service hosts you more often than not want a centralized controlling factory that handles your dependency wiring and the lifetime management of those dependencies. WCF requires you to add custom code to its initialization extension points.
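To give a feel for the kind of plumbing that keeps getting rewritten, the sketch below shows roughly what a container-backed instance provider tends to look like. It is an illustration of the boilerplate, not the actual code in the library, and the class name is made up:

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;
using Microsoft.Practices.ServiceLocation;

// The kind of per-project plumbing the Common Service Host is meant to replace:
// an instance provider that asks the container for the service object.
public class ContainerInstanceProvider : IInstanceProvider
{
    private readonly Type serviceType;

    public ContainerInstanceProvider(Type serviceType)
    {
        this.serviceType = serviceType;
    }

    public object GetInstance(InstanceContext instanceContext, Message message)
    {
        return GetInstance(instanceContext);
    }

    public object GetInstance(InstanceContext instanceContext)
    {
        // Resolve the service and its dependencies through the Common Service Locator.
        return ServiceLocator.Current.GetInstance(serviceType);
    }

    public void ReleaseInstance(InstanceContext instanceContext, object instance)
    {
        var disposable = instance as IDisposable;
        if (disposable != null)
            disposable.Dispose();
    }
}

On top of this you typically also need a service behavior that applies the provider and a ServiceHostFactory that wires it all up, which is exactly the code that tends to get copied from project to project.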

Enter the Common Service Host. Based on the Common Service Locator interface and model, it allows a single library with classes for any DI framework to automatically wire up your service instances. From the CodePlex documentation:

Example host configuration:

public class AppContainer : UnityContainerProvider
{
    public void EnsureConfiguration()
    {
        // Register your services and dependencies here
        // (IService1/Service1 shown only as an example).
        UnityContainer.RegisterType<IService1, Service1>();
    }
}

Example self-hosted usage:

using(var host = new CommonServiceHost())
{
    host.Open();
}

Example usage for a .svc file:

<%@ ServiceHost Language="C#"
      Service="Sogeti.Guidelines.Examples.Service1"
      Factory="Sogeti.Guidelines.WCF.Hosting.CommonServiceHostFactory`1[[Sogeti.Guidelines.Examples.AppContainer, Sogeti.Guidelines.Examples.Service]], Sogeti.Guidelines.WCF.Hosting" %>

 

Providers for Unity and Spring.NET are included in the current release.

Get your copy here: http://commonservicehost.codeplex.com


The repository pattern explained and implemented

The pattern documented and named “Repository” is one of the most misunderstood and misused. In this post we’ll implement the pattern in C# to achieve this simple line of code:

var premiumCustomers = customers.Matching(new PremiumCustomersFilter());

as well as discuss the origins of the pattern and the original definitions to clear up some of the misrepresentations.

The formal description

My first contact with the repository pattern was through the study of Domain Driven Design. In his book [DDD] (p. 147), Eric Evans simply states that:

Associations allow us to find an object based on its relationship to another. But we must have a starting point for a traversal to an ENTITY or VALUE in the middle of its life cycle.

My interpretation of the section on repositories is simple: the domain does not care where the objects are stored in the middle of their life cycle, but we still need a place in our code to get (and often store) them.

In Patterns of Enterprise Application Architecture [PoEAA] (p. 322), the repository is described as:

Mediates between the domain and the data mapping layers using a collection-like interface for accessing domain objects.

Examining both chapters, we quickly come to understand that the original idea of the repository is to honor and gain the benefits of Separation of Concerns and to hide infrastructure plumbing.

With these principles and descriptions this simple rule emerges:

Repositories are the single point where we hand off and fetch objects. It is also the boundary where communication with the storage starts and ends.

A rule that should guide any and all attempts at following the Repository pattern.

Implementations

There are several ways of implementing a repository. Some honor the original definitions more than others (and some are just plain confusing). A classic implementation looks a lot like a DAL class:

public class CustomerRepository
{
    public IEnumerable<Customer> FindCustomersByCountry(string country) { … }
}

Using this implementation strategy, the result is often several repository classes with a lot of methods and code duplicated across them. It misses the beauty and simplicity of the original definitions. Both [PoEAA] and [DDD] use a form of the Specification pattern (implemented as Query Object in PoEAA) and ask the repository for objects that match it, instead of using named methods.

In code this gives the effect of having several small classes instead of a couple of huge ones. Here is a typical example:

public IEnumerable<Customer> Matching(IQuery query) { … }
var premiumCustomers = customers.Matching(new PremiumCustomersFilter());

The above code is a great improvement over the named methods strategy. But let’s take it a little further.

The Generic Repository

The key to a generic repository is to think about what can be shared across different entities and what needs to be separate. Usually the initialization of infrastructure and the commands to materialize are sharable, while the queries are specific. Let's create a simple implementation for Entity Framework:

public class Repository<T> : IRepository<T>
    where T : class
{
    protected ObjectContext Context;
    protected ObjectSet<T> QueryBase;

    public Repository(ObjectContext context)
    {
        Context = context;
        QueryBase = context.CreateObjectSet<T>();
    }

    public IEnumerable<T> Matching(ICriteria<T> criteria)
    {
        var query = criteria.BuildQueryFrom(QueryBase);
        return query.ToList();
    }

    public void Save(T entity)
    {
        QueryBase.AddObject(entity);
    }
}

Using generics in the definition of the repository allows us to reuse the basics while still being specific through the criteria. In this naïve implementation there is not much that would be shared, but add things like logging, exception handling and validation and there are lines of code to be saved here. Notice that the repository executes the query and returns an IEnumerable<T> with the result, as we expect all communication with the store to go through the repository.
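For completeness, the small contracts assumed by the class above could look something like this; a sketch, the exact shape is not the important part:

using System.Collections.Generic;
using System.Data.Objects;
using System.Linq;

public interface IRepository<T> where T : class
{
    IEnumerable<T> Matching(ICriteria<T> criteria);
    void Save(T entity);
}

public interface ICriteria<T> where T : class
{
    // Builds a query on top of the base set; execution stays inside the repository.
    IQueryable<T> BuildQueryFrom(ObjectSet<T> queryBase);
}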

The query objects then implement the ICriteria interface and add any filtering needed. An example query can look like this:

public class WarehousesWithReservableQuantitiesFor : ICriteria<Warehouse>
{
    private readonly string _itemCode;
    private readonly int _minimumRemainingQuantity;

    public WarehousesWithReservableQuantitiesFor(string itemCode,
                                                 int minimumRemainingQuantity)
    {
        _itemCode = itemCode;
        _minimumRemainingQuantity = minimumRemainingQuantity;
    }

    IQueryable<Warehouse>
        ICriteria<Warehouse>.BuildQueryFrom(ObjectSet<Warehouse> queryBase)
    {
        return (from warehouse in queryBase
                from stock in warehouse.ItemsInStock
                where stock.Item.Code == _itemCode
                        && (stock.Quantity - stock.ReservedQuantity)
                                > _minimumRemainingQuantity
                select warehouse)
                .AsQueryable();
    }
}

There are a couple of things to notice here. First of all, the interface is implemented explicitly; this "hides" the method from any code that isn't, and shouldn't be, aware that there is the possibility to create a query here. Remember: …It is also the boundary where communication with the storage starts and ends….

Another thing to note is that it only handles the query creation, not the execution of it. That is still handled by the Generic repository. For me, using the above type of repository / query separation achieves several goals.

There is high reuse of the plumbing. We write it once and use it everywhere:

var customers = new Repository<Customer>(context);
var warehouses = new Repository<Warehouse>(context);

This makes it fairly quick to start working with new entities or change plumbing strategies.

Usage creates clean code that clearly communicates intent:

var reservables = warehouses.Matching(
        new WarehousesWithReservableQuantitiesFor(code, 100));

Several small classes, each with one specific purpose, instead of a couple of huge classes with loads of methods.

It might seem like a small difference. But the ability to focus on just the code for a single query on one page, and the ease of navigating to queries (especially if you use R#'s Find Type), make this an enormous boost in maintainability.

The above example is based on Entity Framework, but I’ve successfully used the same kind of implementation with NHibernate and Linq To SQL as well.

Composing Criteria

By utilizing Decorators or similar composition patterns to build the criteria, it's possible to compose queries for each scenario. Something like:

var premiumCustomers = customers.Matching(
        new PremiumCustomersFilter(
                new PagedResult(page, pageSize))
);

Or:

var premiumCustomers = customers.Matching(
        new PremiumCustomersFilter(),
        new FromCountry("SE")
);

The implementation of the above examples is outside the scope of this post and is left as an exercise to the reader for now.

Repositories and testing

In my experience there is little point in unit testing a repository. It exists as a bridge to communicate with the store and therein lies its value. Trying to unit test a repository and/or its queries often turns out to test how they use the infrastructure, which has little value.

That said, you might find it useful to ensure that logging and exception handling work properly. This turns out to be a limited set of tests, especially if you follow the implementation above.

Integration tests are another story. Validating that queries and communication with the database act as expected is extremely important. How to be effective in testing against a store is another subject, which we won't cover here.

Making repositories available for unit testing of other parts of the system is fairly simple. As long as you honor the boundary mentioned earlier and only return well-known interfaces or entities (like the IEnumerable<T>), mocking or faking repositories will be easy and technology agnostic (e.g. using Rhino Mocks):

ProductListRepository =
        MockRepository.GenerateMock<IRepository<Rule>>();

ProductListRepository.Expect(
        repository => repository.Matching(new RuleFilter("PR1")))
        .Return(productListRules);

 

In summary

The repository pattern in conjunction with others is a powerful tool that lowers friction in development. Used correctly, and honoring the pattern definitions, you gain a lot of flexibility even when you have testing in the mix.

To read more about repositories I suggest picking up a copy of [PoEAA] or [DDD].


Is there any value in certifications?

Over the last few months there has been a massive amount of voices speaking out against any form of certification. Today this was actualized in a Twitter discussion between Cornerstone (a training company) and Emil Cardell. Emil argued that certifications are basically worthless and that all you need is a little problem solving and Google.

I've spent several years helping everyone from individuals to large enterprises maximize and capitalize on the competency they have and are reaching for. In this work I've come to realize that I do not agree with Emil's simplistic view of the value of certifications; I find the issue more complex.

These are my thoughts on the subject.

Certifications outside our little sandbox

There are several areas where certifications are used outside of our industry; we are not unique. There are major ones like doctors and electricians and minor ones like "transporting dangerous goods". As with the certifications for devs, these will never single-handedly guarantee that the person holding them is competent, or even the right person for a specific task. But on the other hand, you would probably avoid letting a plumber without the right certifications rebuild your bathroom.

The classic example is people with driver's licenses: how many times have you wondered how they passed the test? But they still did, and you can't drive a car without one.

So why do we rely on these certifications here but disregard them for software development?

State of today’s certifications

There are several types of certifications for software developers today, and the major ones can be categorized into three types that each have questions of validity raised against them:

  • Knowledge tests (aka multiple questions/answers) – can be crammed.
  • Attendance (e.g. Scrum Master) – only measures attendance, not knowledge.
  • Reviews (arguing for a solution in front of a board, like the MCA) – risk being based on friendship or contacts.

Do these doubts make the certifications worthless? They certainly make it hard to guarantee competency, but that's not unique to "our" certifications.

So why the outcry against certifications in our industry?

Reasons of outcry

I've come across several, more or less well-founded, opinions on why certifications fail us. Some more admirable than others.

The "I'm better than you" mentality
This is based on the same feelings people get from others driving their cars like madmen, the simple rhetorical question of "how did this guy get certified?". The interesting thing is that there are millions of people that never judge others outside of their car; likewise, it seems that when we start writing code, some start to judge others.

Simple criteria
Some criteria can be very simple to fulfill. So simple that the value of the certification itself can be questioned. "Should it really be enough to attend a two-day course to get certified?"

"Everybody got one"
This ties in a little with the two earlier outcries: if everyone in the industry has one, what are they worth?

This is an objection I find very interesting, since no electrician can work without a certification. It's not the fact that you hold one that's the value; it's that if you don't, you shouldn't be working.

They are only on X, not on Y
This is a very popular argument and it usually comes from people not agreeing that X is the technology to use; anyone competent uses Y! This argument springs from the view that it's "my way or the highway", a view that almost brought down the ALT.NET movement completely. Is a certification on X worthless if it is X you will work with and not Y?

So is the value only in how "hard" they are to attain, how many people have got them, or what technology they are based on?

This is the real value.

The value of a certification always lies in the perception of the people it's presented to. In our industry, MCP certifications held a huge value when they were first introduced. But as more people attained them, the perception of their value faded. There are still some that are well respected; most of them have few people that have passed them. So it seems that we only value certifications as proof of "superiority", not as a measurement of basic knowledge in a job role.

I’d like to argue that there is value, even though it doesn’t prove your “superiority”. There is value from a lot of different perspectives.

Value for individuals

The Dreyfus model of skill acquisition describes five levels of competency. Going from one level to another requires different tools depending on where you are on the scale (Johan Normén will speak about this at DevSum10). Today's certifications are one such tool.

They will not take you from competent to proficient or from proficient to expert, but they can take you from beginner to advanced beginner. It's often a measurement that you have grasped the basics and are ready to move on. It forms a baseline for your claims to be competent.

And most importantly, it tells people that you put in the dedication and effort that those without the certification didn't. It doesn't say that you are an expert, but it tells a lot about who you are and what areas you have studied.

Value for organizations

There is value beyond the individual in certifications. Organizations can use them as a tool in many aspects of their business model. They can be used to inventory competency, as a measurable goal for bonuses or as an argument for higher prices. For businesses, the two strongest areas of use are, first, as a means of steering the business direction and, second, in relationships with their vendors.

Steering

If I as a corporate leader wanted to move my business in a certain direction I would have two challenges to handle. First, I need to know where I am; I can't plot a course without a starting point. Second, I need to know that I got there. Depending on what my ambitions are, certifications can help with either one.

Relationship

It's easy to claim that you have 100 developers who know a certain vendor's technology and that you are a strong candidate for partnership. But how do you prove it? A showcase isn't enough; in theory everything in your showcase could've been produced by one person. Certifications are a great tool here. Since they're neutral and not based on your own perception, vendors tend to trust certification numbers as an indication. It's usually not all that's needed, but it's one component in validating your organization's worth to the vendor.

This need for a "neutral" verification is clearly visible in many organizations' job ads as well. They ask for an "engineering degree" from their applicants, not because engineers by default make better developers, but because a third party has been involved in validating your credentials against a baseline that can be compared to others.

Do I think that certifications are ideal for achieving these two values? No, but they're one tool, one step towards something that will be.

What does the future hold?

I don't think we'll see fewer certifications. I don't think we'll see "certification" as a concept decline in value. I think we'll see different forms based on work and not solely on theoretical knowledge. I also think we'll see better tools for organizations to get a solid overview of where they are. Does that mean that certifications as they are today will disappear? I don't think so. They do fill a basic need, much like the theoretical test for your driver's license tells whether or not you are ready for the "real test".

I’ll tell you what I would like to see happening.

  • I hope that basic certifications get a little more unpredictable, so that a pure cram session isn't enough and you actually have to learn the material (much like university tests).
  • I would like to see an apprenticeship form that has basic competency requirements and certifications and makes you a craftsman.
  • I would want people to stop calling themselves "seniors" and instead work with their peers and earn the title "senior craftsman", not just print it on their business cards. This could be similar to how you earn your doctor's hat at a university.

We are working in a knowledge industry; we use our brains every day to deliver our work. There will be a need to measure what we can do, and there will be times when your individual competency has to be judged alongside others'. For us to be able to call ourselves a "profession" we don't need fewer certifications, tests or titles. We need the ones we have, and then some. We need to make a strong statement that software development is not for everyone: you need skills to call yourself a Developer. You need to understand that it's serious work, not toying around with computers. Raising the bar with certifications, tests and titles will do that.


Unity LifeTimeManager for WCF

I really love DI frameworks. One reason is that they allow me to centralize lifetime management of the objects and services needed by others. Most frameworks have options to control the lifetime and allow objects to be created as Singletons, ThreadStatics, Transients (a new object every time), etc.

I'm currently doing some work on a project where Unity is the preferred DI framework and it's hooked up to resolve all service objects for us.

The challenge

WCF has three options for instancing: PerCall, PerSession and Single (http://msdn.microsoft.com/en-us/library/system.servicemodel.instancecontextmode.aspx). In essence this has the following effect on injected dependencies:

PerCall – All dependencies will be resolved once per message sent to the service.
PerSession – All dependencies will be resolved once per WCF service session.
Single – All dependencies will be resolved exactly once (Singleton).
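For reference, the mode is chosen per service implementation with the ServiceBehavior attribute, along these lines:

using System.ServiceModel;

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class CalculationService : Contract
{
    // ...
}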

Now why is this a challenge? It's expected that constructor parameters are resolved once per instantiation, isn't it?

The basis of the challenge is when you want to share instances of dependencies one or more levels down. Consider this code:


                        
public class CalculationService : Contract
{
    public CalculationService(IRepository ruleRepository,
                              IRepository productRepository) { .. }
}

This would work perfectly fine with the InstanceContextModes explained above, but what if we want to share instances of NHibernate sessions or Entity Framework contexts between the repositories?

The default setting for most DI frameworks is to always resolve objects as "transients", which means once for each object that depends on them.

This is where lifetime management comes into play, by changing how the DI framework shares instances between dependent objects.

Unity has six "sharing options" (from http://msdn.microsoft.com/en-us/library/ff660872(PandP.20).aspx):

TransientLifetimeManager – A new object per dependency.
ContainerControlledLifetimeManager – One object per Unity container instance (including children).
HierarchicalLifetimeManager – One object per Unity child container.
PerResolveLifetimeManager – A new object per call to Resolve; recursive dependencies will reuse objects.
PerThreadLifetimeManager – One new object per thread.
ExternallyControlledLifetimeManager – Moves the control outside of Unity.

As you can see, our scenario is missing. We'd like to share all dependencies of some objects across a single service instance, no matter which InstanceContextMode we choose.

The Solution

For Unity there is a good extension point that can help us. Combine that with WCF's ability to add extensions to instances and the problem is solved.

First we extend WCF instance context so it can hold objects created by Unity:


                        
public class WcfServiceInstanceExtension : IExtension<InstanceContext>
{
    public static WcfServiceInstanceExtension Current
    {
        get
        {
            if (OperationContext.Current == null)
                return null;

            var instanceContext = OperationContext.Current.InstanceContext;
            return GetExtensionFrom(instanceContext);
        }
    }

    public static WcfServiceInstanceExtension GetExtensionFrom(
                                              InstanceContext instanceContext)
    {
        lock (instanceContext)
        {
            var extension = instanceContext.Extensions
                                           .Find<WcfServiceInstanceExtension>();

            if (extension == null)
            {
                extension = new WcfServiceInstanceExtension();
                extension.Items.Hook(instanceContext);

                instanceContext.Extensions.Add(extension);
            }

            return extension;
        }
    }

    public InstanceItems Items = new InstanceItems();

    public void Attach(InstanceContext owner)
    { }

    public void Detach(InstanceContext owner)
    { }
}

InstanceContextExtension, which gets applied on each WCF service instance


                        
public class InstanceItems
{
    public object Find(object key)
    {
        if (Items.ContainsKey(key))
            return Items[key];

        return null;
    }

    public void Set(object key, object value)
    {
        Items[key] = value;
    }

    public void Remove(object key)
    {
        Items.Remove(key);
    }

    private Dictionary<object, object> Items
                         = new Dictionary<object, object>();

    public void CleanUp(object sender, EventArgs e)
    {
        foreach (var item in Items.Select(item => item.Value))
        {
            if (item is IDisposable)
                ((IDisposable)item).Dispose();
        }
    }

    internal void Hook(InstanceContext instanceContext)
    {
        instanceContext.Closed += CleanUp;
        instanceContext.Faulted += CleanUp;
    }
}

InstanceItems, used by the extension to hold objects created by Unity

This gives us a nice place to put created objects; it will also call Dispose on any objects when the instance is closing down.

Now we need to tell Unity to use our shiny new class. This is done by first extending LifetimeManager:


                        
public class WcfServiceInstanceLifeTimeManager : LifetimeManager
{
    private readonly Guid key;

    public WcfServiceInstanceLifeTimeManager()
    {
        key = Guid.NewGuid();
    }

    public override object GetValue()
    {
        return WcfServiceInstanceExtension.Current.Items.Find(key);
    }

    public override void SetValue(object newValue)
    {
        WcfServiceInstanceExtension.Current.Items.Set(key, newValue);
    }

    public override void RemoveValue()
    {
        WcfServiceInstanceExtension.Current.Items.Remove(key);
    }
}

The LifeTimeManager that uses our WCF Extension

All that’s left now is to tell Unity when to use this LifeTimeManager instead of the default. That is done when we register the type:

// e.g. sharing an NHibernate ISession per service instance
container.RegisterType<ISession>(new WcfServiceInstanceLifeTimeManager(),
                                 new InjectionFactory(
                                     c => SessionFactory.CreateSession()));

In conclusion

So, DI frameworks are powerful for handling dependencies, but sometimes they need a little nudge in the right direction. Custom lifetime management is one of those nudges, and both Unity and WCF help you do that.


Why lambdas seem broken in multithreaded environments (or how closures in C# works).

I've been playing around with some code for .NET 3.5 to enable us to split big operations into small parallel tasks. During this work I was reminded why ReSharper has the "Access to modified closure" warning. This warning tells us about an inconsistency in how the "immutable" loop variable created in a foreach loop is handled when lambdas or anonymous delegates are involved.

Take this code:

public static void ForEach<T>(IEnumerable<T> list)
{
    foreach(var item in list)
    {
        Action action = () => Console.WriteLine(item);
        action.Invoke();
    }
}

This will yield the expected result: when we call WriteLine, the current item is displayed.

A naïve and not very interesting example, but it creates a baseline. Now let's look at the real code where the reminder hit me, a mimic of .NET 4's parallel foreach:

public static void ForEach<T>(IEnumerable<T> list, Action<T> task)
{
    var waitHandles = new ManualResetEvent[list.Count()];

    var index = 0;
    foreach(var item in list)
    {
        var handleForCurrentTask = new ManualResetEvent(false);
        waitHandles[index] = handleForCurrentTask;
        ThreadPool.QueueUserWorkItem(delegate
         {
             task(item);
             handleForCurrentTask.Set();
         });

        index++;
    }

    WaitHandle.WaitAll(waitHandles);
}

Calling this with the following line:

ForEach(items, Console.WriteLine);

will give a more unexpected result (if you don't know what's going on, that is).
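To make it concrete, a run with a handful of items typically looks something like this; the exact output varies with thread timing:

var items = new[] { 1, 2, 3, 4, 5 };
ForEach(items, Console.WriteLine);

// Hoped for: 1 2 3 4 5 (one line per item, in some order)
// Typical actual output: 5 5 5 5 5 - every queued task ends up reading
// the last value assigned to the loop variable.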

So what is going on? Or: how closures are implemented in C#.

To understand the issue we must examine how the C# compiler handles anonymous delegates and lambdas (both called lambdas from now on), since they are really a compiler trick and not a real feature of the Common Language Runtime.

For every lambda we define, the compiler will generate a new method (there are some nice tricks to avoid duplication, but for the sake of this post let's go with this simple definition), as illustrated in this screenshot from Reflector:

As you can see, this is encapsulated inside an auto-generated class. That class also contains a couple of instance variables that we'll discuss.

We can see that the method encapsulates the two lines of code we had inside our lambda and depends on "task" and "item" through instance variables that are also of auto-generated types.

To be able to execute the ForEach method in the auto-generated class, we need to initialize the instance variables of the lambda class. The compiler neatly picks this up and creates a couple of lines of code at the beginning of the method to encapsulate task:

public static void ForEach(IEnumerable list, Action task)
{
    <>c__DisplayClass1 CS$<>8__locals2 = new <>c__DisplayClass1();
    CS$<>8__locals2.task = task;

A bit further down in the method we can see how the item variable is encapsulated inside another auto-generated instance variable.

  <>c__DisplayClass3 CS$<>8__locals4 = new <>c__DisplayClass3();
        while (CS$5$0000.MoveNext())
        {
            CS$<>8__locals4.item = CS$5$0000.Current;

Notice how the creation of the object that encapsulates the item is declared outside of the iteration, while the item is passed into the object inside the iteration. Here is the heart of the problem. In the interest of optimization, the compiler encapsulates parameters and variables as few times as possible. It seems like the optimizer does not recognize the immutable nature of the loop variable, nor its perceived local scope.

Further down this is passed into the ThreadPool as a delegate:

<>c__DisplayClass5 CS$<>8__locals6 = new <>c__DisplayClass5();
CS$<>8__locals6.CS$<>8__locals4 = CS$<>8__locals4;
CS$<>8__locals6.CS$<>8__locals2 = CS$<>8__locals2;
CS$<>8__locals6.handleForCurrentTask = new ManualResetEvent(false);
waitHandles[index] = CS$<>8__locals6.handleForCurrentTask;
ThreadPool.QueueUserWorkItem(new WaitCallback(CS$<>8__locals6.b__0));

So while the delegate that is passed to the thread pool is instantiated for each iteration, the variables and parameters used by the closure aren't.

In a scenario where the lambda is executed inside the loop this will not be a problem; the next time the item variable is assigned, the previous execution will already be done. In a multithreaded scenario, or with deferred execution of the lambda, the story is quite different.

The assignment of the item variable is far from safe here, since the reference to the item encapsulation is shared between all of the instances of the delegate passed to the ThreadPool. Thus they will all share the last assignment (or the last one at the time of execution).

To solve this we need to tell the compiler that the item variable is locally scoped to the loop and not a method-wide variable.

A simple change to:

foreach(var item in list)
{
    var handleForCurrentTask = new ManualResetEvent(false);
    waitHandles[index] = handleForCurrentTask;
    var currentItem = item;
    ThreadPool.QueueUserWorkItem(delegate
     {
         task(currentItem);
         handleForCurrentTask.Set();
     });

will be enough for the compiler to do the right thing. It will now move the item inside the closure class for the lambda instead of creating a separate class:

 

Since this class has an instance created for each iteration, we will now have a separate copy of the item value in each delegate passed to the ThreadPool:

<>c__DisplayClass3 CS$<>8__locals4 = new <>c__DisplayClass3();
CS$<>8__locals4.CS$<>8__locals2 = CS$<>8__locals2;
CS$<>8__locals4.handleForCurrentTask = new ManualResetEvent(false);
waitHandles[index] = CS$<>8__locals4.handleForCurrentTask;
CS$<>8__locals4.currentItem = item;
ThreadPool.QueueUserWorkItem(new WaitCallback(CS$<>8__locals4.b__0)); 

And we'll get the expected result again.

In conclusion

Most people will not run into this problem, but it is an interesting one nevertheless. I would argue that for lambdas with deferred execution, the semantics of foreach are not the same as for the rest of the language. I'm sure the C# team has a really good reason for ignoring that the item really should be handled as a locally scoped variable, but my brain isn't big enough to figure it out at the moment.

Maybe some of you know?


Slice up your business logic using C# Extension methods to honor the context

One of my favorite features of C# 3.0 is extension methods. They're an excellent way to apply some cross-cutting concerns and a great tool for attaching utility functions. They're heavily used by the LINQ frameworks and in most utility classes I've seen around .NET 3.5 projects. Some common example usages I've come across include:

var name = "OlympicGames2010".Wordify();
var attributeValue = typeof(Customer).Attribute(o => o.Status)
                                     .Value();

Lately I've started to expand on the modeling ideas I tried to explain during my presentation at Öredev 2008. It became more of a discussion with James O. Coplien than a presentation, and I was far from done with my own understanding of the ideas and issues I'd identified (there are things to improve in this content). The core idea is pretty simple though:

Not all consumers of an object are interested in the same aspects of that object; it will take on different roles in different contexts.

Let me explain with the simplest example: when building an order system, the aspects of a product that the order system thinks are important are usually not exactly the same aspects that the inventory system values.

Contexts in object models

Eric Evans touches on this in his description of "bounded contexts" (Domain Driven Design, p. 335), where he stresses the importance of defining contexts where a model is valid and not mixing it into another context. In essence, the model of a product should be duplicated: once in the order context and once in the inventory context.

This is a great principle, but at times it can be too coarse-grained. James Coplien and Trygve Reenskaug have identified this in their work around what they call "DCI architecture". Rickard Öberg et al have done some work in what they call Qi4j, where they compose objects out of bits and pieces instead of creating full-blown models for each context.

Slicing logic and models using Extension Methods

Let's get back to the extension methods and see how they can help us slice business logic into bits and pieces and "compose" what we need for different contexts.

In the code base I'm writing for my TechDays presentation I have a warehouse class that holds stock of items. These items are common to different contexts; they will surface in orders and PLM. One of the features in this application is to find a stock to reserve, given an item. The following code is used to find that stock:

 

return Stock.First(stock => stock.Item == item);

 

Although trivial, this is a business rule for the warehouse. As the warehouse class evolves, this piece of rule would be duplicated in methods like Reserve, Release and Delete. A classic refactoring would be to use Extract Method to move it out and reuse that piece, something like:

private bool Match(Stock stock, ItemInfo item)
{
    return stock.Item == item;
}
...
return Stock.First(stock => Match(stock, item));

 

This is a completely valid refactoring, but honestly we lose some intent; the immediate connection between stock and item is not as explicit, and the lambda expression wasn't simplified all that much.

So let's Refactor to Slice instead:

public static class ItemInAWarehouseSlices
{
    public static bool Match(this ItemInfo @this, Stock stock)
    {
        return stock.Item == @this;
    }
}

Adding this business rule as an extension method gives us a natural place for the code to live and a mechanism for composing objects differently in different contexts. By importing this extension method class into the Warehouse C# file, ItemInfo will provide the logic needed in that context:

return Stock.First(item.Match);

Adding the rule this way also sprinkles a touch of DSL on it and gives it a semantic meaning, which makes the code make more sense.

Why don't you just put that method on ItemInfo, you might ask. Well, the answer is simple: ItemInfo is a concept that might be shared across contexts, contexts that have no grasp of what a Stock is, nor should they. If I added everything I needed to the ItemInfo class for every context that uses Item, I would be in bad shape. Hence the ideas behind contextual domain models, Bounded Contexts, DCI and composite objects.

Extend away …

So extension methods have other uses than just extending existing classes with some utility. They're also a powerful tool for slicing your logic into composable pieces, which helps you focus on the aspects you think are important in the context you are currently working in.

 

So what do you think? Will this help you create clear separations?


Performance differences between LINQ To SQL and NHibernate

In my current project, one of the actions I've taken is to have the project and team move away from Linq To Sql to NHibernate. There were a multitude of issues behind this move; some of the main reasons are outlined in my post "Top 10 reasons to think twice about using Linq To Sql in your project", but there were others too, like the inability to tweak Linq To Sql to perform in different scenarios, which often led to stored procedures. NHibernate has a multitude of buttons to push and lets you tweak almost every aspect of the data access, which gives us more options.

This post is not about the lack of optimizing options in LTS, nor about the options in NHibernate. It's just to illustrate a simple truth: NHibernate is more mature and has had time to optimize core functionality like object creation (materialization). Compare these two images:

 

Linq To Sql materialization 703 ms – DB access 7 ms

 

NHibernate materialization 159 ms – DB access 7 ms

Unfortunately I can't show the real model that we loaded, but it was an aggregate with a couple of lists that went 3 levels deep.

Another thing to note here: Linq To Sql needed a stored procedure to load this (due to the fact that load spans work really, really badly at depths beyond 2 levels, read more here: LINQ To SQL: Support for Eager Loading, Really?), while NHibernate uses a couple of batched select queries built using NHibernate Criteria and the Future feature.
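The real model can't be shown, but the batching looks roughly like the sketch below. The entities are made up, it assumes an open session and a customerId, and it requires an NHibernate version where Future<T>() is available (2.1 or later):

using NHibernate;
using NHibernate.Criterion;

// Both criteria are registered as Futures, so NHibernate batches them into a
// single round-trip when the first result set is enumerated.
var orders = session.CreateCriteria<Order>()
    .Add(Restrictions.Eq("CustomerId", customerId))
    .Future<Order>();

var orderLines = session.CreateCriteria<OrderLine>()
    .CreateAlias("Order", "o")
    .Add(Restrictions.Eq("o.CustomerId", customerId))
    .Future<OrderLine>();

foreach (var order in orders)   // triggers the batched execution
{
    // ...
}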

Look at the database execution times: both are 7 ms. So can we please kill the "stored procedures are faster than ORM-generated queries" debate now? Each of these scenarios was run on the same database with the same indexes, and both had a database that was equally warmed up before the runs.
