The repository pattern explained and implemented

The pattern documented and named “Repository” is one of the most misunderstood and misused. In this post we’ll implement the pattern in C# to achieve this simple line of code:

var customers = customers.Matching(new PremiumCustomersFilter())

as well as discuss the origins of the pattern and the original definitions to clear out some of the misrepresentations.

The formal description

My first contact with the repository pattern was through the study of Domain Driven Design. In his book[DDD] (p. 147), Eric Evans simply states that;

Associations allow us to find an object based on it’s relationships to another. But we must have a starting point for a traversal to an ENTITY or VALUE in the middle of it’s life cycle.

My interpretation of the section on repositories is simple, the Domain do not care about where the objects are stored in the middle of it’s life cycle but we still need a place in our code to get (and often store) them.

In Patterns of Enterprise Application Architecture[PoEAA] (p. 322), repositories is described as:

Mediates between the domain and the data mapping layers using a collection-like interface for accessing domain objects.

Examining the both chapters; we’ll quickly come to understand that the original ideas of the repository is to honor and gain the benefits of Separations of Concerns and hide infrastructure plumbing.

With these principles and descriptions this simple rule emerges:

Repositories are the single point where we hand off and fetch objects. It is also the boundary where communication with the storage starts and ends.

A rule that should guide any and all attempts of following the Repository pattern.

Implementations

There are several ways of implementing a repository. Some honor the original definitions more then others (and some are just blunt confusing). A classic implementation looks a lot like a DAL class:

public class CustomerRepository
{
       public IEnumerable<Customer> FindCustomersByCountry(string country) {…}
}

Using this implementation strategy; the result is often several repository classes with a lot of methods and code duplicated across them. It misses the beauty and simplicity in the original definitions. Both [PoEAA] and [DDD] uses a form of the Specification Pattern (implemented as Query Object in PoEAA) and asks the repository for objects that matches that, instead of named methods.

In code this gives the effect of having several small classes instead of a couple of huge ones. Here is a typical example:

public IEnumerable<Customer> Matches(IQuery query) { …. }
var premiumCustomers = customers.Matches(new PremiumCustomersFilter())

The above code is a great improvement over the named methods strategy. But let’s take it a little further.

The Generic Repository

The key to a generic repository is to think about what can be shared across different entities and what needs to be separate. Usually the initialization of infrastructure and the commands to materialize is sharable while the queries are specific. Let’s create a simple implementation for Entity Framework:

public class Repository<T> : IRepository<T> 
    where T : class
{
    protected ObjectContext Context;
    protected ObjectSet<T> QueryBase;

    public Repository(ObjectContext context)
    {
        Context = context;
        QueryBase = context.CreateObjectSet<T>();
    }

    public IEnumerable<T> Matches(ICriteria<T> criteria)
    {
        var query = criteria.BuildQueryFrom(QueryBase);
        return query.ToList();
    }

    public void Save(T entity)
    {
        QueryBase.AddObject(entity);
    }
}

Using Generics in the definition of the repository, allows us to reuse the basics while still allowing us to be specific using the criteria. In this naïve implementation there is not much that would be shared, but add things like logging, exception handling and validation and there is LoC to be saved here. Notice that the repository is executed and returns an IEnumerable<T> with the result as we expect all communication with the store to go through the repository.

The query objects then implement the ICriteria<T> interface and adds any filtering needed. An example query can look like this:

public class WarehousesWithReservableQuantitiesFor : ICriteria<Warehouse>
{
    private readonly string _itemCode;
    private readonly int _minimumRemaingQuantity;

    public WarehousesWithReservableQuantitiesFor(string itemCode, 
                                                int minimumRemaingQuantity)
    {
        _itemCode = itemCode;
        _minimumRemaingQuantity = minimumRemaingQuantity;
    }

    IQueryable<Warehouse> 
        ICriteria<Warehouse>.BuildQueryFrom(ObjectSet<Warehouse> queryBase)
    {
        return (from warehouse in queryBase
                from stock in warehouse.ItemsInStock
                where stock.Item.Code == _itemCode 
                        && (stock.Quantity - stock.ReservedQuantity) 
                                > _minimumRemaingQuantity
                select warehouse)
                .AsQueryable();
    }
}

There is a couple of things to notice here. First of all the interface is implemented explicit, this “hides” the method from any code that isn’t, and shouldn’t be, aware that there is the possibility to create a query here. Remember: …It is also the boundary where communication with the storage starts and ends….

Another thing to note is that it only handles the query creation, not the execution of it. That is still handled by the Generic repository. For me, using the above type of repository / query separation achieves several goals.

There is high reuse of the plumbing. We write it once and use it everywhere:

var customers = new Repository<Customer>();
var warehouses = new Repository<Warehouse>();

This makes it fairly quick to start working with new entities or change plumbing strategies.

Usage creates clean code that clearly communicates intent:

 

var reservables = warehouses.Matching
	(new WarehousesWithReservableQtyFor(code, 100));

 

Several small classes with one specific purpose instead of a couple of huge classes with a loads of methods.

image

It might seem like a small difference. But the ability to focus on just the code for a single query in one page and the ease of navigating to queries (especially if you use R#’s Find type) makes this an enormous boost in maintainability.

The above example is based on Entity Framework, but I’ve successfully used the same kind of implementation with NHibernate and Linq To SQL as well.

Composing Critierias

By utilizing Decorators or similar composition patterns to build the criteria’s, it’s possible to compose queries for each scenario. Something like:

var premiumCustomers = customers.Matching(
	new PremiumCustomersFilter( 
		new PagedResult(page, pageSize) 
);

Or:

var premiumCustomers = customers.Matching(
	new PremiumCustomersFilter(),  
	new FromCountry("SE")
);

The implementation of the above examples is outside the scope of this post and is left as an exercise to the reader for now.

Repositories and testing

In my experience there is little point of Unit testing a repository. It exists as a bridge to communicate with the store and therein lies it value. Trying to unit test a repository and/or it’s query often turns out to test how they use the infrastructure, which has little value.

That said, you might find it useful to ensure that logging and exception handling works properly. This turns out to be a limited set of tests, especially if you follow the implementation above.

Integration tests is another story. Validating that queries and communication with the database acts as expected is extremely important. How to be effective in testing against a store is another subject which we won’t be covering here.

Making repositories available for unit testing to other parts of the system is fairly simple. As long as you honor the boundary mentioned earlier and only return well known interfaces or entities (like the IEnumerable<T>), mocking or faking repositories will be easy and technology agnostic (ex. using rhino mocks):

ProductListRepository = 
	MockRepository.GenerateMock<IRepository<ProductListRules>>();

 ProductListRepository.Expect(
		repository => repository.Matching(new RuleFilter("PR1"))
                                 .Return(productListRules);

 

In summary

The repository pattern in conjunction with others is a powerful tool that lowers friction in development. Used correctly, and honoring the pattern definitions, you gain a lot of flexibility even when you have testing in the mix.

To read more about repositories I suggest picking up a copy of [PoEAA] or [DDD].

Read more patterns explained and exemplified here