Okay, time for design pattern thoughts. This time it’s the Repository pattern. The Repository pattern is a key pattern in Domain-Driven Design; it’s specifically mentioned in Evans’s book, and its importance in facilitating DDD cannot be overstated. Since Rhinestone uses DDD practices, it only makes sense to use the Repository pattern. But even if you are not using DDD, the Repository pattern provides so many advantages that it’s worth taking a look and perhaps implementing it in non-DDD projects too.
So what is a Repository…
The Repository Pattern as a Data Access pattern:
The Repository pattern is defined by P of EAA as:
Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.
Okay, it’s a data access component. I get that. But what’s so special about a Repository and how is it any different from any other data access component? The first difference that I see is that a Repository exclusively deals in domain objects. The repository is not a generic data access component that uses Data Access Objects (DAO) or Data Transfer Objects (DTO) patterns. So the repository only accepts domain objects and returns domain objects.
The other difference I can spot from the definition is that a Repository provides an in-memory, collection-like interface for accessing domain objects. So as far as the consuming component is concerned, it uses the repository just like a collection when working with domain objects. The repository then neatly abstracts the internal mechanics of how the Add / Remove calls to the repository translate to the actual data access calls to the data store.
So with the repository we get a nice abstraction that provides us with persistence ignorance and a nice separation of concerns where the responsibility of persisting domain objects is encapsulated by the Repository while leaving the domain objects to deal entirely with the domain model and domain logic.
Below is a mock standard Repository interface:
public interface IProjectsRepository
{
    Project Get(int projectID);
    Project Load(int projectID);
    void Save(Project project);
    void Delete(Project project);
}
That’s simple enough: a nice clean interface that provides a way to get, save and delete Project instances. But I still don’t see the value add in using the Repository. Or rather, I don’t see the BIG advantage of the Repository pattern. Here I have simple Get and Load functions to get Project instances (the difference between the two is that Get will throw an exception if a project with that ID is not found), but what if I want to find projects based on some criteria? Let’s say I want to find the projects that a member belongs to; I’d end up adding this additional method to the Repository:
public interface IProjectsRepository
{
    Project Get(int projectID);
    Project Load(int projectID);
    void Save(Project project);
    void Delete(Project project);
    // Find the projects that have the specified user as a member
    IEnumerable<Project> FindProjectsContainingMember(IUser member);
}
So as it stands, the only advantage a Repository gives me is the encapsulation of the logic that handles domain object persistence. The FindProjectsContainingMember method above would encapsulate how to translate that business language into a query the data store understands, run it, and finally construct the domain objects from the returned results.
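To make that encapsulation concrete, here is a minimal sketch of a repository backed by a plain dictionary instead of a real data store. Everything beyond the interface members is an assumption made for illustration: the User class, the ID and Members properties on Project, the finder returning a sequence, and Load returning null for an unknown ID (the post only says that Get throws).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types, reduced to what the example needs.
public interface IUser { string Name { get; } }

public class User : IUser
{
    public string Name { get; }
    public User(string name) { Name = name; }
}

public class Project
{
    public int ID { get; }
    public List<IUser> Members { get; } = new List<IUser>();
    public Project(int id) { ID = id; }
}

// Assumed variant of the interface above; the finder returns a sequence here.
public interface IProjectsRepository
{
    Project Get(int projectID);
    Project Load(int projectID);
    void Save(Project project);
    void Delete(Project project);
    IEnumerable<Project> FindProjectsContainingMember(IUser member);
}

// Dictionary-backed stand-in for a real data store.
public class InMemoryProjectsRepository : IProjectsRepository
{
    private readonly Dictionary<int, Project> _store = new Dictionary<int, Project>();

    // Get throws when the ID is unknown, as described in the post.
    public Project Get(int projectID)
    {
        if (_store.TryGetValue(projectID, out var project)) return project;
        throw new KeyNotFoundException($"No project with ID {projectID}.");
    }

    // Assumption: Load returns null when the ID is unknown.
    public Project Load(int projectID)
    {
        return _store.TryGetValue(projectID, out var project) ? project : null;
    }

    public void Save(Project project) { _store[project.ID] = project; }

    public void Delete(Project project) { _store.Remove(project.ID); }

    // The "business language" query, answered here by scanning the dictionary.
    public IEnumerable<Project> FindProjectsContainingMember(IUser member)
    {
        return _store.Values.Where(p => p.Members.Contains(member)).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var repository = new InMemoryProjectsRepository();
        var member = new User("Pam");
        var project = new Project(1);
        project.Members.Add(member);
        repository.Save(project);

        Console.WriteLine(repository.FindProjectsContainingMember(member).Count()); // prints 1
        Console.WriteLine(repository.Load(42) == null);                             // prints True
    }
}
```

The consuming code only ever sees Project instances and collection-style calls; swapping the dictionary for a database would change nothing on the caller’s side.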
Back to the definition:
So going back to the definition and write up of the Repository pattern in P of EAA, I see the following explanation of Repository:
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer. Repository also supports the objective of achieving a clean separation and one-way dependency between the domain and data mapping layers.
[Emphasis mine]
Source: http://martinfowler.com/eaaCatalog/repository.html
Something that piqued my interest is the highlighted line above. Clients construct query specifications and submit them to the Repository for satisfaction… hmmm… so I can declaratively query the repository and have it return business object instances that satisfy that query? Now that is a huge advantage over traditional data access patterns. So does that mean that I don’t have to add additional methods every time there’s a feature change and I need to query for domain objects using new criteria? I’M SOLD. Now, the only question is… just how do I do that, huh?
Specification and Repository patterns… two peas in a pod:
If you have some time and have an urge to submerge yourself in the dark world of programming patterns, I would highly recommend reading the white paper by Martin Fowler and Eric Evans on the Specification pattern. Otherwise read on…
The Specification pattern, in my opinion, helps formalize and declare criteria as a set of specifications that encapsulate business logic. Let’s take an example to make things clearer: a Customer and its Preferred property. Say we are building a CRM solution for a famous paper company down in Scranton, PA, where preferred customers are defined as customers who buy more than 1000 reams of paper on average in a month and whose monthly invoiced amount averages over $100,000.
So looking at the hypothetical business requirement, the first instinct is to add a property to the Customer business object named IsPreferred which might look something like this:
public class Customer
{
    public string Name { get; set; }
    public int AverageInvoiced { get; set; }
    public int AverageOrderQty { get; set; }
    public bool IsPreferred
    {
        get
        {
            // Preferred: invoiced over $100,000 and more than 1000 reams ordered, on average per month.
            return AverageInvoiced > 100000 && AverageOrderQty > 1000;
        }
    }
}
That seems okay; we have encapsulated the logic of determining a Preferred customer in the business entity. Assume that AverageInvoiced and AverageOrderQty themselves have business rules governing how they are calculated. Now we just need to query based on that logic… uh oh… how do I do that? I’ve basically got two options: one is to add an IsPreferred column to the Customers table in the database and store the IsPreferred flag; the other is to create a specialized stored proc that queries the database using logic similar to that in the Customer class.
Both options suck! The first one especially sucks because if the logic changes, say so that preferred customers are now customers that buy 2000 reams of paper on average and are invoiced $200,000 on average per month, a query for preferred customers based on the stored IsPreferred flag would yield invalid results: customers that still average 1000 reams and $100,000 in invoices remain marked as preferred until every row is recalculated. The second one is the lesser of two evils, but it forces the domain logic to live in both the Customer entity and the stored proc that is responsible for querying customers by preferred status. Now I have my logic spread all over instead of just in the domain model… again, something that is not acceptable.
The solution… Specifications
Following the Specification pattern, we can implement a re-usable business logic component that can be passed around to check whether an object satisfies certain business criteria. Like all good patterns, the Specification pattern starts by defining an interface:
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T entity);
}
So the ISpecification interface defines one method, IsSatisfiedBy, which takes an instance of the entity type and checks whether that entity satisfies the specification. So now let’s put our two criteria, average invoiced greater than $100,000 and average ordered quantity greater than 1000, into a specification:
public class CustomerIsPrefferedSpec : ISpecification<Customer>
{
    #region Implementation of ISpecification<Customer>
    public bool IsSatisfiedBy(Customer entity)
    {
        return entity.AverageInvoiced > 100000 && entity.AverageOrderQty > 1000;
    }
    #endregion
}
Now this specification can be used in the Customer entity like so:
public bool IsPreferred
{
    get
    {
        return new CustomerIsPrefferedSpec().IsSatisfiedBy(this);
    }
}
Simple enough… but what we can now do is also use this spec and pass it to the repository as a re-usable piece of logic to get Customer instances that match the IsPreferred specification:
public class CustomerRepository
{
    public IEnumerable<Customer> FindCustomersBySpec(ISpecification<Customer> specification)
    {
        // Logic here to query based on the provided specification.
        throw new System.NotImplementedException();
    }
}
So now we have a re-usable business logic component that we can use to satisfy a criterion. What if we need to chain criteria? For example, if we need to get all preferred customers that are also buying Grade A category paper, how do we chain the two criteria together? Here we use the Composite Specification pattern. Below is an implementation of the Composite Specification pattern:
/// <summary>
/// The generic ISpecification interface.
/// </summary>
/// <typeparam name="T"></typeparam>
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T entity);
}

/// <summary>
/// Base implementation of ISpecification that supports chaining specifications
/// using And and Or chaining.
/// </summary>
/// <typeparam name="T"></typeparam>
public abstract class SpecificationBase<T> : ISpecification<T>
{
    #region Implementation of ISpecification<T>
    public abstract bool IsSatisfiedBy(T entity);
    #endregion

    public SpecificationBase<T> And(ISpecification<T> rightHand)
    {
        return new AndSpecification<T>(this, rightHand);
    }

    public SpecificationBase<T> Or(ISpecification<T> rightHand)
    {
        return new OrSpecification<T>(this, rightHand);
    }
}

/// <summary>
/// Or binary expression specification implementation.
/// </summary>
/// <typeparam name="T"></typeparam>
internal class OrSpecification<T> : SpecificationBase<T>
{
    private readonly ISpecification<T> _leftHand;
    private readonly ISpecification<T> _rightHand;

    public OrSpecification(SpecificationBase<T> leftHand, ISpecification<T> rightHand)
    {
        _leftHand = leftHand;
        _rightHand = rightHand;
    }

    #region Overrides of SpecificationBase<T>
    public override bool IsSatisfiedBy(T entity)
    {
        return _leftHand.IsSatisfiedBy(entity) || _rightHand.IsSatisfiedBy(entity);
    }
    #endregion
}

/// <summary>
/// And binary expression specification implementation.
/// </summary>
/// <typeparam name="T"></typeparam>
internal class AndSpecification<T> : SpecificationBase<T>
{
    private readonly ISpecification<T> _leftHand;
    private readonly ISpecification<T> _rightHand;

    public AndSpecification(SpecificationBase<T> leftHand, ISpecification<T> rightHand)
    {
        _leftHand = leftHand;
        _rightHand = rightHand;
    }

    #region Overrides of SpecificationBase<T>
    public override bool IsSatisfiedBy(T entity)
    {
        return _leftHand.IsSatisfiedBy(entity) && _rightHand.IsSatisfiedBy(entity);
    }
    #endregion
}
What the above implementation does is this: by creating an abstract SpecificationBase, we are able to add Or and And methods that return OrSpecification and AndSpecification instances. Both the OrSpecification and AndSpecification classes inherit from the SpecificationBase abstract class and implement the IsSatisfiedBy method, where all they do is combine the result of the left-hand specification with the result of the right-hand specification using || or && respectively.
The end result is we have a nice way to chain and compose specifications together and then also have a Fluent Interface that provides a nice way of specifying the composition.
With the above implementation I can now provide the following hypothetical specifications to a repository and have it return the Customer entities that satisfy all the specified criteria:
IEnumerable<Customer> results = customerRepository.FindBySpec
(
    new IsPreferredSpec()
        .And(new OrdersPaperCategorySpec("Category A"))
        .And(new HasOverduePayments())
);
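To see the chaining evaluate end to end without a repository, here is a small self-contained sketch. It uses condensed copies of the specification types from this post; the PaperCategory property, the two concrete specification classes and the sample customers are hypothetical and exist only for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Condensed copies of the types from this post, so the sketch compiles on its own.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T entity);
}

public abstract class SpecificationBase<T> : ISpecification<T>
{
    public abstract bool IsSatisfiedBy(T entity);

    public SpecificationBase<T> And(ISpecification<T> rightHand)
    {
        return new AndSpecification<T>(this, rightHand);
    }

    public SpecificationBase<T> Or(ISpecification<T> rightHand)
    {
        return new OrSpecification<T>(this, rightHand);
    }
}

internal class AndSpecification<T> : SpecificationBase<T>
{
    private readonly ISpecification<T> _leftHand;
    private readonly ISpecification<T> _rightHand;

    public AndSpecification(ISpecification<T> leftHand, ISpecification<T> rightHand)
    {
        _leftHand = leftHand;
        _rightHand = rightHand;
    }

    public override bool IsSatisfiedBy(T entity)
    {
        return _leftHand.IsSatisfiedBy(entity) && _rightHand.IsSatisfiedBy(entity);
    }
}

internal class OrSpecification<T> : SpecificationBase<T>
{
    private readonly ISpecification<T> _leftHand;
    private readonly ISpecification<T> _rightHand;

    public OrSpecification(ISpecification<T> leftHand, ISpecification<T> rightHand)
    {
        _leftHand = leftHand;
        _rightHand = rightHand;
    }

    public override bool IsSatisfiedBy(T entity)
    {
        return _leftHand.IsSatisfiedBy(entity) || _rightHand.IsSatisfiedBy(entity);
    }
}

public class Customer
{
    public string Name { get; set; }
    public int AverageInvoiced { get; set; }
    public int AverageOrderQty { get; set; }
    public string PaperCategory { get; set; } // hypothetical property
}

// Hypothetical concrete specifications.
public class IsPreferredSpec : SpecificationBase<Customer>
{
    public override bool IsSatisfiedBy(Customer entity)
    {
        return entity.AverageInvoiced > 100000 && entity.AverageOrderQty > 1000;
    }
}

public class OrdersPaperCategorySpec : SpecificationBase<Customer>
{
    private readonly string _category;
    public OrdersPaperCategorySpec(string category) { _category = category; }

    public override bool IsSatisfiedBy(Customer entity)
    {
        return entity.PaperCategory == _category;
    }
}

public static class Demo
{
    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Acme", AverageInvoiced = 150000, AverageOrderQty = 1500, PaperCategory = "Category A" },
            new Customer { Name = "Globex", AverageInvoiced = 150000, AverageOrderQty = 1500, PaperCategory = "Category B" },
            new Customer { Name = "Initech", AverageInvoiced = 500, AverageOrderQty = 10, PaperCategory = "Category A" }
        };

        var preferredAndA = new IsPreferredSpec().And(new OrdersPaperCategorySpec("Category A"));
        var preferredOrA = new IsPreferredSpec().Or(new OrdersPaperCategorySpec("Category A"));

        foreach (var c in customers.Where(c => preferredAndA.IsSatisfiedBy(c)))
            Console.WriteLine(c.Name); // prints Acme

        Console.WriteLine(customers.Count(c => preferredOrA.IsSatisfiedBy(c))); // prints 3
    }
}
```

Because And and Or each return a SpecificationBase, the chain can keep growing, and the composed object is still just an ISpecification as far as any consumer is concerned.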
Right now I’m sure you are wondering how the specifications translate into query filtering. One way is to use the yield keyword in a tight loop to return only those results that match the specifications:
public IEnumerable<Customer> FindBySpec(ISpecification<Customer> specs)
{
    foreach (var customer in FindAll())
    {
        if (specs.IsSatisfiedBy(customer))
            yield return customer;
    }
}
The downside to the above approach is that the filtering and matching of the criteria is done on the entire list of customers, all of which must be loaded into memory. The other way is to translate the specifications into the data store’s query language. This is slightly trickier, as you will need to implement some type of Query pattern that is capable of translating specifications into queries that the repository can execute.
A much saner solution, if you are using an ORM such as NHibernate or LLBLGen Pro, is to use the ways those tools provide to specify criteria that implement the Specification pattern, which makes the job of integrating with the repository a breeze. Heck, you can even implement the Specification pattern using lambdas and LINQ and use LINQ to SQL / Entity Framework.
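As a rough idea of what the lambda route could look like, here is a minimal sketch in which a specification carries its criterion as an expression tree. The ExpressionSpecification class and the repository constructor taking an IQueryable are assumptions made for this example, not an API from any of the ORMs mentioned; the in-memory list merely stands in for a LINQ-enabled data source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Customer
{
    public string Name { get; set; }
    public int AverageInvoiced { get; set; }
    public int AverageOrderQty { get; set; }
}

public interface ISpecification<T>
{
    bool IsSatisfiedBy(T entity);
}

// Hypothetical specification that keeps the criterion as an expression tree,
// so a LINQ provider can inspect it instead of running opaque compiled code.
public class ExpressionSpecification<T> : ISpecification<T>
{
    private readonly Func<T, bool> _compiled;

    public Expression<Func<T, bool>> Predicate { get; }

    public ExpressionSpecification(Expression<Func<T, bool>> predicate)
    {
        Predicate = predicate;
        _compiled = predicate.Compile(); // used for in-memory checks
    }

    public bool IsSatisfiedBy(T entity)
    {
        return _compiled(entity);
    }
}

// Hypothetical repository over any IQueryable source.
public class CustomerRepository
{
    private readonly IQueryable<Customer> _customers;

    public CustomerRepository(IQueryable<Customer> customers)
    {
        _customers = customers;
    }

    public IEnumerable<Customer> FindBySpec(ExpressionSpecification<Customer> spec)
    {
        // Queryable.Where receives the expression tree; a LINQ provider can
        // translate it into its own query language rather than filtering in memory.
        return _customers.Where(spec.Predicate).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var source = new List<Customer>
        {
            new Customer { Name = "Acme", AverageInvoiced = 150000, AverageOrderQty = 1500 },
            new Customer { Name = "Initech", AverageInvoiced = 500, AverageOrderQty = 10 }
        }.AsQueryable();

        var preferred = new ExpressionSpecification<Customer>(
            c => c.AverageInvoiced > 100000 && c.AverageOrderQty > 1000);

        var repository = new CustomerRepository(source);
        foreach (var customer in repository.FindBySpec(preferred))
            Console.WriteLine(customer.Name); // prints Acme
    }
}
```

The same specification still answers IsSatisfiedBy for a single entity, so the business rule lives in one place. Combining two such specifications with And / Or needs extra work to merge the expression trees, which this sketch leaves out.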
That said, by and large, the repositories of today ignore the Specification pattern because of the difficulty of translating specifications, in a truly persistence-ignorant way, into queries that repositories can just take and run. If you are not using an ORM such as NHibernate or LLBLGen Pro, or one of the numerous others out there that provide built-in support for Specifications, it gets hard.
But with Microsoft giving us LINQ to SQL and Entity Framework, coupled with lambdas and LINQ in general, I think we have come to a point where Specifications can be more widely adopted. And with third-party ORM providers now showing major love for LINQ, we may even arrive at a completely persistence-ignorant solution for defining specifications.
In any case, I believe it’s time to take a long hard look at some of the current patterns and at how advances such as LINQ and lambdas can be leveraged. I think the Repository and Specification patterns are two such patterns that can definitely be overhauled to take advantage of LINQ… and that’s what I’ll deal with in my next post. Stay tuned!