Wednesday, September 28, 2011

Domain Objects And Many To Many Relations

Today I want to discuss the different kinds of many to many relations and the cases where it makes sense to transform them into different structures when they are loaded from a database into the object graph of a business layer. To keep things straightforward, this post focuses on Microsoft Entity Framework and O/R-Mappers in general. Let me know if you are interested in how to handle many to many relations in native ADO.NET.

This is the third post of a blog series about designing domain objects in a business layer and the second part that deals with transforming table structures into object structures. The first post "Tables Are No Domain Objects" gave an introduction to this series and showed some reasons why it can make sense to abstract O/R-Mappers in layers above the data access layer (DAL). In the second part "Tables Are No Domain Objects: Table Relation Transformations Part 1" we discussed foreign key fields, aggregations and status objects.

Basics

A many to many relation is given when two objects are related to each other and each of them can reference more than one object (row in the database) on the other side.

An example of a many to many relation is the relation between articles and their categories. Each category can be related to many articles, like a category "food" that references apples, pies and meat. On the other hand, an article "apple" can be categorized as both "food" and "healthy".

In an object model a many to many relation is exposed by two domain objects where each contains a collection of objects of the other type. Since relational databases usually don't provide complex column types like lists, many to many relations are realized by putting an intermediate link table between the two tables.
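To make the link table idea a bit more concrete, here is a minimal in-memory sketch. The row classes, their field names and the CategoriesOf helper are hypothetical and only mimic what the three tables could look like; no real database or mapper is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows of the two base tables
public class ArticleRow { public int Id; public string Name; }
public class CategoryRow { public int Id; public string Name; }

// Hypothetical row of the link table: nothing but two foreign keys
public class ArticleCategoryRow { public int ArticleId; public int CategoryId; }

public static class LinkTableDemo {
   // Resolves the category names of one article by joining
   // the link rows with the category rows
   public static List<string> CategoriesOf(
         int articleId,
         IEnumerable<ArticleCategoryRow> links,
         IEnumerable<CategoryRow> categories) {
      return (from link in links
              where link.ArticleId == articleId
              join category in categories
                 on link.CategoryId equals category.Id
              select category.Name).ToList();
   }
}
```

Each row of the link table stands for exactly one pair of article and category, which is what allows both sides to reference many rows of the other side.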

Simple Many To Many Relations

A simple many to many relation is given whenever the link table consists of nothing but the foreign keys which point to the rows of the two tables to be related to each other.


When working with native ADO.NET, many to many relations are always a bit tricky, but they can certainly be handled. I'll focus on O/R-Mappers for now.

When working with a common O/R-Mapper, simple many to many relations are usually transformed automatically by the mapper. The link table stays hidden inside the mapper, and each of our two objects can provide a list of objects of the other type.

public partial class Article {
   public IList<Category> Categories { get; set; }
}

public partial class Category {
   public IList<Article> Articles { get; set; }
}
The ORM knows all values to be inserted into our link table and doesn't need to bother clients of our business layer with this table. If you are at the beginning of a project and your O/R-Mapper does not support simple many to many relations, I'd suggest considering another mapper.

Complex Many To Many Relations

A complex many to many relation is given when the link table contains any additional columns which are not the foreign keys of our domain objects' base tables.


With such a link table, an O/R-Mapper like Entity Framework runs into trouble. It cannot fill the additional column (in our example, a creation date) without an intermediate domain object that does nothing but hold that column. Our two domain objects will then look like this.

public partial class Article {
   public List<ArticleCategory> ArticleCategories { get; set; }
}

public partial class Category {
   public List<ArticleCategory> ArticleCategories { get; set; }
}
This might be fine for EF, but usually that's not how we want to work with our objects in the main part of our system. Columns like a creation date are often only used for support or reporting purposes, and we don't want to think about the odd ArticleCategory object when adding new operational features.
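The ArticleCategory class itself is not shown in this post. A minimal sketch of how the three objects could fit together, assuming the member names Article, Category and CreationDate that the later snippets use (the Name properties are hypothetical and only there for illustration):

```csharp
using System;
using System.Collections.Generic;

public class Article {
   public string Name { get; set; }   // hypothetical
   public List<ArticleCategory> ArticleCategories { get; set; }
      = new List<ArticleCategory>();
}

public class Category {
   public string Name { get; set; }   // hypothetical
   public List<ArticleCategory> ArticleCategories { get; set; }
      = new List<ArticleCategory>();
}

// Intermediate object: one instance per row of the link table.
// It references both sides and carries the additional column.
public class ArticleCategory {
   public Article Article { get; set; }
   public Category Category { get; set; }
   public DateTime CreationDate { get; set; }
}
```

Neither Article nor Category references the other one directly any more; every path between them leads through an ArticleCategory instance.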

Without some refinement of our domain objects we will be forced to implement every access to an article's categories like this.
Article article = GetArticle();
var categories = from acl in article.ArticleCategories
                 select acl.Category;

// process the article and its categories
Always having to go through the intermediate object to get what we are really looking for is not only unnatural, it also causes a tight coupling between our domain objects and the underlying database table structure. The worst case would be if we started out with a simple many to many relation between articles and categories and a new requirement then called for the creation date column - and with it the ArticleCategory object. Without some architectural effort we might have to refactor larger parts of our existing source code. Luckily, there are a few things we can do.

The easiest way to hide the relation object is to define the ArticleCategories property as private and provide a few methods that give us the opportunity to directly work with the referenced entities.
public partial class Article {
   public IEnumerable<Category> GetCategories() {
      return ArticleCategories.Select(acl => acl.Category);
   }

   public void AddCategory(Category category) {
      ArticleCategory categoryLink = new ArticleCategory();
      categoryLink.CreationDate = DateTime.Now;
      categoryLink.Article = this;
      categoryLink.Category = category;
      ArticleCategories.Add(categoryLink);
   }

   public void RemoveCategory(Category category) {
      var categoryLink = ArticleCategories.FirstOrDefault(
                           item => category.Equals(item.Category));
      if (categoryLink != null)
         ArticleCategories.Remove(categoryLink);
   }
}
// =========================================
// sample usage
Article article = GetArticle();

var categories = article.GetCategories();
// process categories

article.AddCategory(GetCategory());

Apart from providing more natural access to our categories, this also gives us an architecture that is more robust against possible future changes - like additional fields in our link table.

If we want to go one step further we can provide an even more sophisticated interface to access our (indirectly) referenced domain objects. Unfortunately we cannot use a simple List<T> and copy all categories into it, because our ArticleCategories list would not be affected by any add/remove calls on that copy. For the same reason, a simple LINQ query that transforms the ArticleCategory objects into categories is not enough either.
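The problem with a copied list can be shown in a few lines. This is only a sketch with stripped-down, hypothetical versions of the classes; the CopyPitfallDemo helper does not belong to the domain model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category {
   public string Name { get; set; }   // hypothetical
}

public class ArticleCategory {
   public Category Category { get; set; }
   public DateTime CreationDate { get; set; }
}

public static class CopyPitfallDemo {
   // Adds a category to a copied list and returns the number of
   // link objects that exist afterwards
   public static int LinkCountAfterAddingToCopy() {
      var links = new List<ArticleCategory> {
         new ArticleCategory { Category = new Category { Name = "food" } }
      };

      // The copy is detached from the list of link objects
      List<Category> copy = links.Select(l => l.Category).ToList();
      copy.Add(new Category { Name = "healthy" });

      // No ArticleCategory was created for "healthy", so the
      // mapper would have nothing to insert into the link table
      return links.Count;
   }
}
```

The copy now holds two categories while the underlying list still holds one link object, so the new relation would never reach the database.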

However, what we can do is implement a custom IList<T> that transforms a list of objects of one type into other objects by utilizing a provided delegate. In our case we need to transform a list of ArticleCategory objects into categories.

The following snippet shows how such a list could work.

using System;
using System.Collections.Generic;
using System.Linq;

public class TransformationList<T, TResult> : IList<TResult> {
   private IList<T> _list;
   private Func<T, TResult> _transform;
   private Func<TResult, T> _factory;

   // Constructor that creates a read-only version of the list
   public TransformationList(IList<T> list, 
                             Func<T, TResult> transformation)
      : this(list, transformation, null) {
   }
   // Constructor that creates a writable version of the list
   public TransformationList(IList<T> list, 
                             Func<T, TResult> transformation, 
                             Func<TResult, T> factory) {
      _list = list;
      _transform = transformation;
      _factory = factory;
   }

   // Indexer access
   public TResult this[int index] {
      get { return _transform(_list[index]); }
      set {
         EnsureWritable();
         _list[index] = _factory(value);
      }
   }

   // Count property works like a proxy
   public int Count { get { return _list.Count; } }

   // The list is read-only if no factory method is provided
   public bool IsReadOnly { get { return _factory == null; } }

   // Ensures that the list is writable and uses the factory method to create a new item
   public void Insert(int index, TResult item) {
      EnsureWritable();
      _list.Insert(index, _factory(item));
   }

   // Read-only method uses the transformation method;
   // the default EqualityComparer also copes with a null item
   public bool Contains(TResult item) {
      return _list.Any(
         i => EqualityComparer<TResult>.Default.Equals(item, _transform(i)));
   }

   // Ensures that the list is writable; NotSupportedException is what
   // the framework's own read-only collections throw
   private void EnsureWritable() {
      if (IsReadOnly)
         throw new NotSupportedException("List is read-only");
   }

   // and so forth...
}

The second constructor, which takes a second delegate as a factory method, makes the list writable. It enables us to add new objects from the outside without knowing that another, hidden object is materialized inside our transformation list.
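For reference, here is one way the members left out above ("and so forth...") could be filled in. This is a sketch rather than the one true implementation: the argument checks, the optional factory parameter and the choice to let Remove, RemoveAt and Clear also require a writable list are my assumptions. It reports itself as read-only when no factory is given and throws NotSupportedException on write attempts, as the read-only collections of the framework do.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class TransformationList<T, TResult> : IList<TResult> {
   private readonly IList<T> _list;
   private readonly Func<T, TResult> _transform;
   private readonly Func<TResult, T> _factory;   // null means read-only

   public TransformationList(IList<T> list,
                             Func<T, TResult> transformation,
                             Func<TResult, T> factory = null) {
      if (list == null) throw new ArgumentNullException("list");
      if (transformation == null)
         throw new ArgumentNullException("transformation");
      _list = list;
      _transform = transformation;
      _factory = factory;
   }

   public TResult this[int index] {
      get { return _transform(_list[index]); }
      set { EnsureWritable(); _list[index] = _factory(value); }
   }

   public int Count { get { return _list.Count; } }
   public bool IsReadOnly { get { return _factory == null; } }

   // Compares the transformed items, so callers never see a T
   public int IndexOf(TResult item) {
      var comparer = EqualityComparer<TResult>.Default;
      for (int i = 0; i < _list.Count; i++)
         if (comparer.Equals(item, _transform(_list[i])))
            return i;
      return -1;
   }

   public bool Contains(TResult item) { return IndexOf(item) >= 0; }

   // Write operations materialize the hidden object via the factory
   public void Add(TResult item) {
      EnsureWritable();
      _list.Add(_factory(item));
   }

   public void Insert(int index, TResult item) {
      EnsureWritable();
      _list.Insert(index, _factory(item));
   }

   // Removes the first hidden object that transforms into the item
   public bool Remove(TResult item) {
      EnsureWritable();
      int index = IndexOf(item);
      if (index < 0) return false;
      _list.RemoveAt(index);
      return true;
   }

   public void RemoveAt(int index) { EnsureWritable(); _list.RemoveAt(index); }
   public void Clear() { EnsureWritable(); _list.Clear(); }

   public void CopyTo(TResult[] array, int arrayIndex) {
      foreach (TResult item in this)
         array[arrayIndex++] = item;
   }

   // Enumeration transforms lazily, item by item
   public IEnumerator<TResult> GetEnumerator() {
      return _list.Select(_transform).GetEnumerator();
   }

   IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

   private void EnsureWritable() {
      if (IsReadOnly)
         throw new NotSupportedException("List is read-only");
   }
}
```

Note that the factory only has to create the hidden object. Putting it into the underlying list is the job of the transformation list, so a factory that adds the object on its own would store it twice.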

This (reusable!) class enables us to expose our article's categories through a nice IList<Category> property.
public partial class Article {
   private IList<Category> _categories;

   public IList<Category> Categories {
      get {
         if (_categories == null)
            _categories = 
               new TransformationList<ArticleCategory, Category>(
                     ArticleCategories, 
                     (acl) => acl.Category,
                     (c) => CreateCategoryLink(c));
         return _categories;
      }
      set { _categories = value; }
   }

   // Factory method: only creates the link object. The transformation
   // list inserts it into ArticleCategories; adding it here as well
   // would store every link twice.
   private ArticleCategory CreateCategoryLink(Category category) {
      ArticleCategory acl = new ArticleCategory();
      acl.CreationDate = DateTime.Now;
      acl.Article = this;
      acl.Category = category;
      return acl;
   }
}
// =========================================
// sample usage
Article article = GetArticle();

foreach (var category in article.Categories) {
   // process categories
}

article.Categories.Add(GetCategory());

Conclusion

Simple many to many relations are usually easy to work with, but even if an O/R-Mapper shows some weakness in its mapping features, we are still able to provide a reasonable interface to clients of our business layer and its domain objects.

Outlook

In the next part of this series we will look at version controlled data, the challenges it can cause and ways to handle them.
