The repository pattern explained and implemented

The pattern documented and named “Repository” is one of the most misunderstood and misused. In this post we’ll implement the pattern in C# to achieve this simple line of code:

var premiumCustomers = customers.Matching(new PremiumCustomersFilter());

as well as discuss the origins of the pattern and the original definitions to clear up some of the misrepresentations.

The formal description

My first contact with the repository pattern was through the study of Domain Driven Design. In his book [DDD] (p. 147), Eric Evans simply states:

Associations allow us to find an object based on its relationships to another. But we must have a starting point for a traversal to an ENTITY or VALUE in the middle of its life cycle.

My interpretation of the section on repositories is simple: the domain does not care where the objects are stored in the middle of their life cycle, but we still need a place in our code to get (and often store) them.

In Patterns of Enterprise Application Architecture [PoEAA] (p. 322), a repository is described as:

Mediates between the domain and the data mapping layers using a collection-like interface for accessing domain objects.

Examining both chapters, we quickly come to understand that the original idea of the repository is to honor and gain the benefits of Separation of Concerns and to hide infrastructure plumbing.

With these principles and descriptions this simple rule emerges:

Repositories are the single point where we hand off and fetch objects. It is also the boundary where communication with the storage starts and ends.

This rule should guide any and all attempts at following the Repository pattern.

Implementations

There are several ways of implementing a repository. Some honor the original definitions more than others (and some are just plain confusing). A classic implementation looks a lot like a DAL class:

public class CustomerRepository
{
    public IEnumerable<Customer> FindCustomersByCountry(string country) {…}
}

With this implementation strategy, the result is often several repository classes with a lot of methods and code duplicated across them. It misses the beauty and simplicity of the original definitions. Both [PoEAA] and [DDD] use a form of the Specification pattern (implemented as Query Object in PoEAA) and ask the repository for objects that match it, instead of calling named methods.

In code this gives the effect of having several small classes instead of a couple of huge ones. Here is a typical example:

public IEnumerable<Customer> Matching(IQuery query) { …. }
var premiumCustomers = customers.Matching(new PremiumCustomersFilter());

The above code is a great improvement over the named methods strategy. But let’s take it a little further.

The Generic Repository

The key to a generic repository is to think about what can be shared across different entities and what needs to be separate. Usually the initialization of infrastructure and the commands to materialize is sharable while the queries are specific. Let’s create a simple implementation for Entity Framework:

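using System.Collections.Generic;
using System.Data.Objects; // EF 4; in EF 6 these types live in System.Data.Entity.Core.Objects
using System.Linq;

public class Repository<T> : IRepository<T>
    where T : class
{
    protected ObjectContext Context;
    protected ObjectSet<T> QueryBase;

    public Repository(ObjectContext context)
    {
        Context = context;
        QueryBase = context.CreateObjectSet<T>();
    }

    // The criteria builds the query; it is executed here so that no
    // deferred query leaks past the repository boundary.
    public IEnumerable<T> Matching(ICriteria<T> criteria)
    {
        var query = criteria.BuildQueryFrom(QueryBase);
        return query.ToList();
    }

    // Registers the entity with the context. Nothing is written to the
    // store until SaveChanges() is called on the context.
    public void Save(T entity)
    {
        QueryBase.AddObject(entity);
    }
}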

Using generics in the definition of the repository allows us to reuse the basics while still being specific through the criteria. In this naïve implementation there is not much to share, but add things like logging, exception handling and validation and there are lines of code to be saved. Notice that the query is executed inside the repository, which returns an IEnumerable<T> with the result, as we expect all communication with the store to go through the repository.
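
The contracts behind the generic repository are not shown in this post, so here is a minimal sketch of what they could look like. It is an illustration under assumptions, not the exact code I use: ICriteria<T> takes an IQueryable<T> instead of an ObjectSet<T> so that it runs without Entity Framework, and InMemoryRepository<T>, the Customer properties and RepositoryDemo are hypothetical names made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the criteria contract. The post passes an ObjectSet<T>;
// IQueryable<T> is used here so the sketch has no Entity Framework dependency.
public interface ICriteria<T>
{
    IQueryable<T> BuildQueryFrom(IQueryable<T> queryBase);
}

// Assumed shape of the repository contract.
public interface IRepository<T>
{
    IEnumerable<T> Matching(ICriteria<T> criteria);
    void Save(T entity);
}

// Hypothetical in-memory stand-in for the EF-backed Repository<T>.
public class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> _items = new List<T>();

    public IEnumerable<T> Matching(ICriteria<T> criteria)
    {
        // Materialize before returning so callers never see a live query.
        return criteria.BuildQueryFrom(_items.AsQueryable()).ToList();
    }

    public void Save(T entity)
    {
        _items.Add(entity);
    }
}

// Hypothetical entity; the properties are assumptions for the example.
public class Customer
{
    public string Name { get; set; }
    public bool IsPremium { get; set; }
}

public class PremiumCustomersFilter : ICriteria<Customer>
{
    // Explicit implementation: only reachable through ICriteria<Customer>.
    IQueryable<Customer> ICriteria<Customer>.BuildQueryFrom(IQueryable<Customer> queryBase)
    {
        return queryBase.Where(c => c.IsPremium);
    }
}

public static class RepositoryDemo
{
    // Saves two customers and returns the names of the premium ones.
    public static List<string> PremiumNames()
    {
        var customers = new InMemoryRepository<Customer>();
        customers.Save(new Customer { Name = "Ada", IsPremium = true });
        customers.Save(new Customer { Name = "Bob", IsPremium = false });

        return customers.Matching(new PremiumCustomersFilter())
                        .Select(c => c.Name)
                        .ToList();
    }
}
```

The calling code only ever sees IRepository<T> and a materialized result, which is the boundary rule from earlier in the post.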

The query objects then implement the ICriteria<T> interface and add any filtering needed. An example query can look like this:

public class WarehousesWithReservableQuantitiesFor : ICriteria<Warehouse>
{
    private readonly string _itemCode;
    private readonly int _minimumRemainingQuantity;

    public WarehousesWithReservableQuantitiesFor(string itemCode,
                                                int minimumRemainingQuantity)
    {
        _itemCode = itemCode;
        _minimumRemainingQuantity = minimumRemainingQuantity;
    }

    // Only builds the query; execution is left to the repository.
    IQueryable<Warehouse>
        ICriteria<Warehouse>.BuildQueryFrom(ObjectSet<Warehouse> queryBase)
    {
        return (from warehouse in queryBase
                from stock in warehouse.ItemsInStock
                where stock.Item.Code == _itemCode
                        && (stock.Quantity - stock.ReservedQuantity)
                                > _minimumRemainingQuantity
                select warehouse)
                .AsQueryable();
    }
}

There are a couple of things to notice here. First of all, the interface is implemented explicitly. This “hides” the method from any code that isn’t, and shouldn’t be, aware that it is possible to create a query here. Remember: “…It is also the boundary where communication with the storage starts and ends.”

Another thing to note is that it only handles the query creation, not the execution of it. That is still handled by the Generic repository. For me, using the above type of repository / query separation achieves several goals.

There is high reuse of the plumbing. We write it once and use it everywhere:

var customers = new Repository<Customer>();
var warehouses = new Repository<Warehouse>();

This makes it fairly quick to start working with new entities or change plumbing strategies.

Usage creates clean code that clearly communicates intent:

var reservables = warehouses.Matching
	(new WarehousesWithReservableQuantitiesFor(code, 100));

Several small classes with one specific purpose instead of a couple of huge classes with loads of methods.

It might seem like a small difference. But the ability to focus on just the code for a single query on one page, and the ease of navigating to queries (especially if you use R#’s Find type), makes this an enormous boost in maintainability.

The above example is based on Entity Framework, but I’ve successfully used the same kind of implementation with NHibernate and LINQ to SQL as well.

Composing Criteria

By utilizing Decorators or similar composition patterns to build the criteria, it’s possible to compose queries for each scenario. Something like:

var premiumCustomers = customers.Matching(
	new PremiumCustomersFilter(
		new PagedResult(page, pageSize))
);

Or:

var premiumCustomers = customers.Matching(
	new PremiumCustomersFilter(),  
	new FromCountry("SE")
);

The implementation of the above examples is outside the scope of this post and is left as an exercise to the reader for now.
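
As a starting point for that exercise, here is one possible sketch of the two composition styles. It is an assumption about how it could be done, not the implementation I use: AllOf<T> and Paged<T> are hypothetical helper names, ICriteria<T> is assumed to work on IQueryable<T>, and everything runs against an in-memory list rather than Entity Framework.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed contract, simplified to IQueryable<T> so the sketch runs in memory.
public interface ICriteria<T>
{
    IQueryable<T> BuildQueryFrom(IQueryable<T> queryBase);
}

// Hypothetical entity used only for this example.
public class Customer
{
    public string Name { get; set; }
    public string Country { get; set; }
    public bool IsPremium { get; set; }
}

public class PremiumCustomersFilter : ICriteria<Customer>
{
    public IQueryable<Customer> BuildQueryFrom(IQueryable<Customer> queryBase)
    {
        return queryBase.Where(c => c.IsPremium);
    }
}

public class FromCountry : ICriteria<Customer>
{
    private readonly string _country;

    public FromCountry(string country) { _country = country; }

    public IQueryable<Customer> BuildQueryFrom(IQueryable<Customer> queryBase)
    {
        return queryBase.Where(c => c.Country == _country);
    }
}

// Composite: feeds the query through every part in order.
public class AllOf<T> : ICriteria<T>
{
    private readonly ICriteria<T>[] _parts;

    public AllOf(params ICriteria<T>[] parts) { _parts = parts; }

    public IQueryable<T> BuildQueryFrom(IQueryable<T> queryBase)
    {
        return _parts.Aggregate(queryBase, (query, part) => part.BuildQueryFrom(query));
    }
}

// Decorator: applies the wrapped criteria, then cuts out one page (1-based).
// Against LINQ to Entities the inner query would need an ordering before Skip.
public class Paged<T> : ICriteria<T>
{
    private readonly ICriteria<T> _inner;
    private readonly int _page;
    private readonly int _pageSize;

    public Paged(ICriteria<T> inner, int page, int pageSize)
    {
        _inner = inner;
        _page = page;
        _pageSize = pageSize;
    }

    public IQueryable<T> BuildQueryFrom(IQueryable<T> queryBase)
    {
        return _inner.BuildQueryFrom(queryBase)
                     .Skip((_page - 1) * _pageSize)
                     .Take(_pageSize);
    }
}

public static class CompositionDemo
{
    private static IQueryable<Customer> Sample()
    {
        return new List<Customer>
        {
            new Customer { Name = "Ada", Country = "SE", IsPremium = true },
            new Customer { Name = "Bob", Country = "SE", IsPremium = false },
            new Customer { Name = "Cy",  Country = "NO", IsPremium = true },
            new Customer { Name = "Dee", Country = "SE", IsPremium = true },
        }.AsQueryable();
    }

    public static List<string> PremiumFrom(string country)
    {
        var criteria = new AllOf<Customer>(new PremiumCustomersFilter(), new FromCountry(country));
        return criteria.BuildQueryFrom(Sample()).Select(c => c.Name).ToList();
    }

    public static List<string> PremiumPage(int page, int pageSize)
    {
        var criteria = new Paged<Customer>(new PremiumCustomersFilter(), page, pageSize);
        return criteria.BuildQueryFrom(Sample()).Select(c => c.Name).ToList();
    }
}
```

Because a composite is itself an ICriteria<T>, the repository would not need to change; a Matching overload taking params ICriteria<T>[] could simply wrap its arguments in AllOf<T>.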

Repositories and testing

In my experience there is little point in unit testing a repository. It exists as a bridge to communicate with the store, and therein lies its value. Trying to unit test a repository and/or its queries often turns into testing how they use the infrastructure, which has little value.

That said, you might find it useful to ensure that logging and exception handling work properly. This turns out to be a limited set of tests, especially if you follow the implementation above.

Integration tests are another story. Validating that queries and communication with the database act as expected is extremely important. How to be effective in testing against a store is another subject which we won’t be covering here.

Making repositories available to unit tests in other parts of the system is fairly simple. As long as you honor the boundary mentioned earlier and only return well-known interfaces or entities (like IEnumerable<T>), mocking or faking repositories will be easy and technology agnostic (e.g. using Rhino Mocks):

ProductListRepository =
	MockRepository.GenerateMock<IRepository<ProductListRules>>();

ProductListRepository.Expect(
		repository => repository.Matching(new RuleFilter("PR1")))
	.Return(productListRules);
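
If you would rather not depend on a mocking framework, a hand-written fake does the same job. The following is only a sketch under assumptions: FakeRepository<T> and ProductListService are hypothetical names, the ICriteria<T> contract is reduced to a marker interface because the fake never runs a query, and the Code property on ProductListRules and RuleFilter is made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Query-building member omitted on purpose: the fake never executes a query.
public interface ICriteria<T> { }

public interface IRepository<T>
{
    IEnumerable<T> Matching(ICriteria<T> criteria);
}

// Hypothetical entity and query object, named after the mock example above.
public class ProductListRules
{
    public string Code { get; set; }
}

public class RuleFilter : ICriteria<ProductListRules>
{
    public RuleFilter(string code) { Code = code; }

    public string Code { get; private set; }
}

// Fake repository: returns a canned result and records what it was asked for.
public class FakeRepository<T> : IRepository<T>
{
    private readonly List<T> _cannedResult;

    public readonly List<ICriteria<T>> Received = new List<ICriteria<T>>();

    public FakeRepository(IEnumerable<T> cannedResult)
    {
        _cannedResult = cannedResult.ToList();
    }

    public IEnumerable<T> Matching(ICriteria<T> criteria)
    {
        Received.Add(criteria);
        return _cannedResult;
    }
}

// Hypothetical consumer that only knows about IRepository<T>.
public class ProductListService
{
    private readonly IRepository<ProductListRules> _rules;

    public ProductListService(IRepository<ProductListRules> rules)
    {
        _rules = rules;
    }

    public int CountRulesFor(string code)
    {
        return _rules.Matching(new RuleFilter(code)).Count();
    }
}
```

A test can hand the service a FakeRepository<ProductListRules> with known rules, call CountRulesFor("PR1"), and then inspect Received to check which criteria were sent, without the query objects having to implement Equals.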

 

In summary

The repository pattern in conjunction with others is a powerful tool that lowers friction in development. Used correctly, and honoring the pattern definitions, you gain a lot of flexibility even when you have testing in the mix.

To read more about repositories I suggest picking up a copy of [PoEAA] or [DDD].

29 thoughts on “The repository pattern explained and implemented”

  1. Nice. I just implemented this in EF Code Only CTP4.

    I used to create one repository per aggregate, with a lot of query functions. Now I have a lot of reusable specifications instead… nice.

    I did use an abstract base class for my Specification/Criteria (instead of an ICriteria). By doing that I’m able to create a generic IsSatisfiedBy that uses the exact same Linq expression:

    public abstract class Specification<T> where T : IAggregate
    {
        internal abstract IQueryable<T> BuildQueryFrom(IQueryable<T> queryBase);

        protected bool IsSatisfiedBy(T aggregate)
        {
            IQueryable<T> queryBase = (new List<T> {aggregate}).AsQueryable();

            return BuildQueryFrom(queryBase).Count() == 1;
        }
    }

    : Thomas

  2. Hi, this is a very clear and pattern-adherent approach.
    I’m having some indecisions about how to implement the navigation of collection properties inside an aggregate entity.
    Is it possible to have a look inside the source code of your repository implementation with Linq to Sql version?
    Thanks a lot.
    Massimo

  3. Very nice explanation, but I’m confused and fail to see how to make it work with NHibernate. Would you have a working example for download somewhere?

    Thanks!

    Seb :)

  4. Any suggestions or references where a Generic Repository implementation handles stored procedures?
    I have a problem where we decided to use ObjectSet for simple queries and stored procedures for complex data operations.

    Thanks.

  5. The handling of stored procs for reading would be handled by putting the setup inside the query but still return an IEnumerable.

    For updating it’s a specific case and I’d probably create an overload with a storage specification.

  8. When using Query objects sent to your repository, unit testing in higher layers demands (depending on mock framework) impl of Equals/GetHashCode in your query objects. Right?

    In your example…
    ProductListRepository.Expect(
    repository => repository.Matching(new RuleFilter("PR1")))
    .Return(productListRules);
    …assumes that the “RuleFilter” QO has implemented Equals, being another instance than the one run in production code.

    • Or that you just put ignore on the argument. I usually tend not to rely too much on what arguments are sent in. In the case you are describing, it is really not valuable for the test to know internal implementation details.

  9. Hi, I have to say great explanation.

    If you do consider my request, I would also like you to write articles on the following topics, since I strongly feel that a thorough understanding of the following terminology will greatly improve the learning experience of this pattern.

    1) Lambda Expression
    2) Expression
    3) DataContext Class
    4) IEnumerable
    5) IEntity
    6) IQueryable
    7) LINQ

    Thank you

  11. ObjectContext Here:

    public class Repository<T> : IRepository<T>
        where T : class
    {
        protected ObjectContext Context;
        protected ObjectSet<T> QueryBase;

        public Repository(ObjectContext context)
        {
            Context = context;
            QueryBase = context.CreateObjectSet<T>();
        }

  12. right, but in your statements below, there’s no object context being passed:

    var customers = new Repository<Customer>();
    var warehouses = new Repository<Warehouse>();

  13. Ah yes. I was thinking IoC as well :)

    • Thanks. I’m just trying this out right now BTW. This seems much different in a good way from the one repository per aggregate entity method. Good work. Also, thanks very much for your prompt response.

  15. Hans, all respect to Ayende, and his work is excellent. But first of all, in his session he talks about the repository as it’s not supposed to be built or used. This post is about how you should do it to avoid most of what he talks about.

    Secondly, there is no one tool, pattern or architecture that works or does not work for every scenario.

  16. The only reason why you try to avoid something is to test your Repository which he also addresses with a simpler model.

    Repositories are useless wrappers. What is the definition of a Repository? What do today’s ORMs do? They do the same.

    The time frame in which this pattern came up coincides exactly with the appearance of powerful ORMs like Hibernate. Nobody needs Repositories unless you’re writing directly against the database.

  17. So, to have a common Repository interface because some of your data is in NoSQL and some is in a database is useless? What about in a brownfield application where you want to abstract out all sorts of different data access methods and wrap around existing implementation so you can gradually swap out custom SQL or stored procs with ORMs? What about adapters to Repositories, like a CachingRepository that can wrap around another repository. Most people don’t get to write code in a perfect scenario where every line of code looks exactly how it would if that person had written it.

  18. Hans, it seems we have different understandings of what the repository is and solves. You seem to be talking about a DAL; that seems to be the common perception of what a repository does, and Ayende’s arguments are really directed at that, not at the real repository pattern.

    Neither NHibernate nor EF can be as good at encapsulating and ensuring Open/Closed and DRY as the Repository can. Rich mentions caching and brownfield. I’d like to add named queries (which the specifications are), plumbing (like error handling / logging) and lifetime management, just to name a few.

    Not all solutions benefit from this, just like not all solutions benefit from having a NoSQL database with the queries specified in the UI/Service operations (like Ayende suggests).

    In the cases where you do want to abide by Open/Closed and DRY, and ensure you have easy to read and understand concepts in your code, the (real) repository pattern is a tool for that and will most probably continue to be so as long as we do data access.

  19. I would like to see a sample project of this setup. I am especially curious on how you created this.

    var premiumCustomers = customers.Matching(
    new PremiumCustomersFilter(),
    new FromCountry("SE")
    );
