The repository pattern explained and implemented

The pattern documented and named “Repository” is one of the most misunderstood and misused. In this post we’ll implement the pattern in C# to achieve this simple line of code:

var customers = customers.Matching(new PremiumCustomersFilter())

as well as discuss the origins of the pattern and the original definitions to clear out some of the misrepresentations.

The formal description

My first contact with the repository pattern was through the study of Domain Driven Design. In his book[DDD] (p. 147), Eric Evans simply states that;

Associations allow us to find an object based on it’s relationships to another. But we must have a starting point for a traversal to an ENTITY or VALUE in the middle of it’s life cycle.

My interpretation of the section on repositories is simple, the Domain do not care about where the objects are stored in the middle of it’s life cycle but we still need a place in our code to get (and often store) them.

In Patterns of Enterprise Application Architecture[PoEAA] (p. 322), repositories is described as:

Mediates between the domain and the data mapping layers using a collection-like interface for accessing domain objects.

Examining the both chapters; we’ll quickly come to understand that the original ideas of the repository is to honor and gain the benefits of Separations of Concerns and hide infrastructure plumbing.

With these principles and descriptions this simple rule emerges:

Repositories are the single point where we hand off and fetch objects. It is also the boundary where communication with the storage starts and ends.

A rule that should guide any and all attempts of following the Repository pattern.

Implementations

There are several ways of implementing a repository. Some honor the original definitions more then others (and some are just blunt confusing). A classic implementation looks a lot like a DAL class:

public class CustomerRepository
{
       public IEnumerable<Customer> FindCustomersByCountry(string country) {…}
}

Using this implementation strategy; the result is often several repository classes with a lot of methods and code duplicated across them. It misses the beauty and simplicity in the original definitions. Both [PoEAA] and [DDD] uses a form of the Specification Pattern (implemented as Query Object in PoEAA) and asks the repository for objects that matches that, instead of named methods.

In code this gives the effect of having several small classes instead of a couple of huge ones. Here is a typical example:

public IEnumerable<Customer> Matches(IQuery query) { …. }
var premiumCustomers = customers.Matches(new PremiumCustomersFilter())

The above code is a great improvement over the named methods strategy. But let’s take it a little further.

The Generic Repository

The key to a generic repository is to think about what can be shared across different entities and what needs to be separate. Usually the initialization of infrastructure and the commands to materialize is sharable while the queries are specific. Let’s create a simple implementation for Entity Framework:

public class Repository<T> : IRepository<T> 
    where T : class
{
    protected ObjectContext Context;
    protected ObjectSet<T> QueryBase;

    public Repository(ObjectContext context)
    {
        Context = context;
        QueryBase = context.CreateObjectSet<T>();
    }

    public IEnumerable<T> Matches(ICriteria<T> criteria)
    {
        var query = criteria.BuildQueryFrom(QueryBase);
        return query.ToList();
    }

    public void Save(T entity)
    {
        QueryBase.AddObject(entity);
    }
}

Using Generics in the definition of the repository, allows us to reuse the basics while still allowing us to be specific using the criteria. In this naïve implementation there is not much that would be shared, but add things like logging, exception handling and validation and there is LoC to be saved here. Notice that the repository is executed and returns an IEnumerable<T> with the result as we expect all communication with the store to go through the repository.

The query objects then implement the ICriteria<T> interface and adds any filtering needed. An example query can look like this:

public class WarehousesWithReservableQuantitiesFor : ICriteria<Warehouse>
{
    private readonly string _itemCode;
    private readonly int _minimumRemaingQuantity;

    public WarehousesWithReservableQuantitiesFor(string itemCode, 
                                                int minimumRemaingQuantity)
    {
        _itemCode = itemCode;
        _minimumRemaingQuantity = minimumRemaingQuantity;
    }

    IQueryable<Warehouse> 
        ICriteria<Warehouse>.BuildQueryFrom(ObjectSet<Warehouse> queryBase)
    {
        return (from warehouse in queryBase
                from stock in warehouse.ItemsInStock
                where stock.Item.Code == _itemCode 
                        && (stock.Quantity - stock.ReservedQuantity) 
                                > _minimumRemaingQuantity
                select warehouse)
                .AsQueryable();
    }
}

There is a couple of things to notice here. First of all the interface is implemented explicit, this “hides” the method from any code that isn’t, and shouldn’t be, aware that there is the possibility to create a query here. Remember: …It is also the boundary where communication with the storage starts and ends….

Another thing to note is that it only handles the query creation, not the execution of it. That is still handled by the Generic repository. For me, using the above type of repository / query separation achieves several goals.

There is high reuse of the plumbing. We write it once and use it everywhere:

var customers = new Repository<Customer>();
var warehouses = new Repository<Warehouse>();

This makes it fairly quick to start working with new entities or change plumbing strategies.

Usage creates clean code that clearly communicates intent:

 

var reservables = warehouses.Matching
	(new WarehousesWithReservableQtyFor(code, 100));

 

Several small classes with one specific purpose instead of a couple of huge classes with a loads of methods.

image

It might seem like a small difference. But the ability to focus on just the code for a single query in one page and the ease of navigating to queries (especially if you use R#’s Find type) makes this an enormous boost in maintainability.

The above example is based on Entity Framework, but I’ve successfully used the same kind of implementation with NHibernate and Linq To SQL as well.

Composing Critierias

By utilizing Decorators or similar composition patterns to build the criteria’s, it’s possible to compose queries for each scenario. Something like:

var premiumCustomers = customers.Matching(
	new PremiumCustomersFilter( 
		new PagedResult(page, pageSize) 
);

Or:

var premiumCustomers = customers.Matching(
	new PremiumCustomersFilter(),  
	new FromCountry("SE")
);

The implementation of the above examples is outside the scope of this post and is left as an exercise to the reader for now.

Repositories and testing

In my experience there is little point of Unit testing a repository. It exists as a bridge to communicate with the store and therein lies it value. Trying to unit test a repository and/or it’s query often turns out to test how they use the infrastructure, which has little value.

That said, you might find it useful to ensure that logging and exception handling works properly. This turns out to be a limited set of tests, especially if you follow the implementation above.

Integration tests is another story. Validating that queries and communication with the database acts as expected is extremely important. How to be effective in testing against a store is another subject which we won’t be covering here.

Making repositories available for unit testing to other parts of the system is fairly simple. As long as you honor the boundary mentioned earlier and only return well known interfaces or entities (like the IEnumerable<T>), mocking or faking repositories will be easy and technology agnostic (ex. using rhino mocks):

ProductListRepository = 
	MockRepository.GenerateMock<IRepository<ProductListRules>>();

 ProductListRepository.Expect(
		repository => repository.Matching(new RuleFilter("PR1"))
                                 .Return(productListRules);

 

In summary

The repository pattern in conjunction with others is a powerful tool that lowers friction in development. Used correctly, and honoring the pattern definitions, you gain a lot of flexibility even when you have testing in the mix.

To read more about repositories I suggest picking up a copy of [PoEAA] or [DDD].

Read more patterns explained and exemplified here

 

Slice up your business logic using C# Extension methods to honor the context

One of my favorite features with C# 3.0 is the extension methods. An excellent way to apply some cross cutting concerns and great tool for attaching utility functions. Heavily used by the LINQ frameworks and in most utility classes I’ve seen around .NET 3.5 projects. Some common example usages I’ve come across include:

   1: var name = "OlympicGames2010".Wordify();

   2: var attributeValue = typeof(Customer).Attribute<Description>(o => o.Status)

   3:                                      .Value();

Lately I’ve started to expand the the modeling ideas I tried to explain during my presentation at Öredev 2008. It became more of a discussion with James O. Coplien then a presentation and I was far from done with my own understanding of the ideas and issues I’d identified (there are things improve in this content). The core idea is pretty simple though:

Not all consumers of an object are interested of the same aspects of that object, it will take on different roles in different contexts

Let me explain with the simplest example; when building an order system, the aspects of a product that the order think is important are usually not exactly the same aspects that the inventory system value.

Contexts in object models

Eric Evans touches this in his description of “bounded contexts” (Domain Driven Design p335) where he stresses the importance of defining contexts where a model is valid and not mix it into another context.  In essence the model of a product should be duplicated, once in the order context and once in the inventory context.

This is a great principle but at times it be too coarse-grained. James Coplien and Trygve Reenskaug have identified this in their work around, what they call, “DCI architecture”. Richard Öberg et al have done some work in what they call qi4j where they are composing objects with bits and pieces instead of creating full blown models for each context.

Slicing logic and models using Extension Methods

Let’s get back to the extension methods and see how they can help us slice business logic up in bits and pieces and “compose” what we need for different contexts.

In the code base I’m writing for my TechDays presentation I have a warehouse class that holds stock of items. These Items are common for different contexts, they will surface in orders and PLM. One of the features in this application is to find a stock to reserve a given an item. The following code is used to find that stock:

 

   1: return Stock.First(stock => stock.Item == item);

 

Although trivial, this is a business rule for the warehouse. When the warehouse class evolved this piece of rule would be duplicated in methods like Reserve, Releaes and Delete. A classic refactoring would be to use Extract Method to move it out and reuse that piece, something like:

   1: private bool Match(Stock stock, ItemInfo item)

   2: {

   3:    return stock.Item == item

   4: }

   5: ...

   6: return Stock.First(stock => Match(stock, item));

 

This is a completely valid refactoring but honestly we loose some intent, the immediate connection with stock and item are not as explicit and the lambda expression wasn’t simplified that much.

So let’s Refator to Slice instead:

   1: public static class ItemInAWarehouseSlices

   2: {

   3:     public static bool Match(this ItemInfo @this, Stock stock)

   4:     {

   5:         return stock.Item == @this;

   6:     }

   7: }

Adding this business rules as an extension method gives us a natural place for the code to live and a mechanism to use to compose objects differently in different contexts. Importing this extension method class into the Warehouse C#-file, ItemInfo will provide the logic needed in that context;

   1: return Stock.First(item.Match);

Adding the rule this way also sprinkles a touch of DSL on it and gives it a semantic meaning which makes the code make more sense.

Why don’t you just put that method on the ItemInfo, you migh ask. Well the answer is simple. ItemInfo is a concept that might be shared across contexts. Contexts that have no grasp of what a Stock is, nor should it. If I’d add everything I needed to the ItemInfo class for all contexts that uses Item. I would be in a bad shape. Thus the ideas behind Contextual Domain models, Bounded Context, DCI and Composite objects.

Extend away …

So extension methods have other usages then just extending existing classes with some utility. It’s also a powerful tool to slice your logic in composable pieces which helps you focus on the aspects you think is important in the current context you are working.

 

So what do you think? Will this help you create clear separtions?