Saturday, January 9, 2010

O/R-Mapper - Enemy Mine

These days I hear many discussions for and against O/R-Mappers (ORMs). Jack Corbett, a SQL professional I hold in high regard, is speaking at SQLSaturday #32 about "Why Should I Use Stored Procedures?" in relation to ORM tools. I recently had a nice discussion of my own with several people at:
Sind O/R-Mapper potentiell ein schleichender Tod? (a German C# developer forum)
An attempt at an English translation:
Are O/R-Mappers potentially a creeping death?

The quintessence of this discussion was: most experienced developers generally like the idea of ORMs, but currently they don't trust them in larger projects.

In this blog entry I'll try to address my main problems with O/R-Mappers.

The Holy Cow for Consumers

I have often seen developers forget the "R" (relational) in O/R-Mapper. It's a mapping tool that brings relational data into a domain object model. However, there is still a relational database system behind the curtain. It's very, very important to never forget this.

What does this mean?

The first thing is query support. ORMs usually provide an object-oriented way to build database queries. This is a nice feature (if correctly implemented by the ORM), but it's impossible for any tool to cover the whole power of RDBMS-specific SQL. ORM queries are usually fine for simple cases like "give me a customer by her code" or "give me the unfinished orders of a specific customer". That means queries on fields of a domain object's base table and queries over one or maybe two related tables. As soon as you hit this boundary, you should consider implementing a custom SQL query (or stored procedure). Especially things like "LEFT JOIN", "OR" or the deceptively innocent "IN" should always be written as custom SQL, because hand-written SQL exposes potential problems directly, for example in the execution plans of SQL Server.

The second feature of O/R-Mappers, and the one most often misused, is lazy loading. There are some very rare(!) business cases where lazy loading is a good thing and should be used. Whenever I use lazy loading I have to keep in mind that any access to a not-yet-loaded relation fires a separate SQL query at the database server. Using lazy loading in the style of "always load just the root object and let the mapper fetch everything else when I need it" quickly ends up in hundreds, thousands or more single queries stressing the server. Martin Fowler calls this "ripple loading".
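To make the cost of ripple loading concrete, here is a minimal sketch in Python using the standard sqlite3 module. It does not use any real ORM: the tables, the CountingConnection wrapper and both loader functions are hypothetical and only imitate what a lazy-loading mapper does behind the scenes, compared with one hand-written LEFT JOIN.

```python
import sqlite3


class CountingConnection:
    """Hypothetical thin wrapper that counts every query sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.query_count = 0

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def make_db():
    # Illustrative schema and data; not taken from any real system.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            order_number TEXT
        );
        INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        INSERT INTO orders VALUES (1, 1, 'A-1'), (2, 1, 'A-2'), (3, 2, 'B-1');
    """)
    return CountingConnection(conn)


def orders_per_customer_lazy(db):
    """Imitates lazy loading: one query for the roots, one more per root."""
    result = {}
    for customer_id, name in db.query("SELECT id, name FROM customers ORDER BY id"):
        rows = db.query(
            "SELECT order_number FROM orders WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        )
        result[name] = [order_number for (order_number,) in rows]
    return result


def orders_per_customer_joined(db):
    """Hand-written SQL: the same data in a single round trip."""
    rows = db.query("""
        SELECT c.name, o.order_number
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        ORDER BY c.id, o.id
    """)
    result = {}
    for name, order_number in rows:
        orders = result.setdefault(name, [])
        if order_number is not None:  # customers without orders yield NULL
            orders.append(order_number)
    return result
```

With three customers the lazy variant issues four queries (one for the customers plus one per customer), while the joined variant issues exactly one. The gap grows linearly with the number of root objects, which is where the "hundreds or thousands" of queries come from.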

It's important to always keep in mind that an O/R-Mapper is a tool and nothing more, really. It's a tool to automate boring work like the simple 1:1 mapping of relational data into domain objects, and to automate some basic queries. Measured against the technology it covers - relational databases - every ORM is always a weak tool. That doesn't mean that the developers of ORMs are doing a bad job! The weakness comes from the fact that an ORM can only ever cover a very small and simple part of the SQL capabilities of today's RDBMSs. In particular, ORMs cannot do professional query tuning such as analyzing execution plans, which may call for completely different queries to get the same result on different databases - sometimes even within the same RDBMS product.

The Holy Cow Misled by the Publishers

Sadly, some of the problems described above are caused by the way these tools are promoted (or implemented). Every common O/R-Mapper is promoted as the all-in-one solution. However, I think an ORM will never be able to handle every special case, neither on the database side nor on the domain model side.

While I named the "R" as the most common source of problems when consuming an ORM, the "M" (mapper) is my main issue with today's ORMs. Sure, the mapping is the main part of an O/R-Mapper. It maps relational data into a domain object model and vice versa. Unfortunately, all mappers I know encapsulate this part of the tool as a black box. My problem with this encapsulation is the built-in tight coupling between database and domain model, which is limited by the mapping capabilities of the tool in use. I think the only limit on modelling should be the experience of the developers, DBAs and architects.

What does this mean?

In the domain model, this tight coupling sometimes prevents the objects from being more normalized than the database. I'm not talking about a denormalized database; a domain object model often simply uses more than one object for one table. While an "Orders" table might contain customer information like "CustomerOrderRef" or "RequestedDeliveryDate", the "Order" domain object can hold a reference to a separate "CustomerInfo" object that carries all the customer-specific information. As long as those related objects are mutable (meaning they provide setter methods or properties), abstracting ORMs like NHibernate or Entity Framework are able to handle the mapping. However, if the mapping targets an immutable value object, they hit the wall. A good example (again from Martin Fowler) of mapping columns to an immutable object is a Money object (not to be confused with the SQL Server data type MONEY). Money usually carries two pieces of information: an Amount and a Currency. A table usually contains an "Amount" column and a "CurrencyId" column, but two separate fields like these are not the best design in a domain model, especially if I want to be able to calculate with monetary values. I'd be glad if I could extend the existing mapping with some custom work, without having to expose the database columns in my domain objects.
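As a rough illustration of the kind of custom mapping step I mean, here is a small Python sketch of an immutable Money value object and two hand-written mapping functions. Everything in it is an assumption made for the example: the OrderLine class, the CURRENCIES lookup for "CurrencyId" and the row layout (id, amount, currency id) are hypothetical and not taken from any particular mapper.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Immutable value object: no setters, arithmetic returns new instances."""
    amount: Decimal
    currency: str

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount + other.amount, self.currency)


@dataclass(frozen=True)
class OrderLine:
    """Hypothetical domain object; it knows Money, not database columns."""
    id: int
    price: Money


# Hypothetical stand-in for a currency lookup table keyed by "CurrencyId".
CURRENCIES = {1: "EUR", 2: "USD"}


def order_line_from_row(row):
    """Custom mapping: (id, amount, currency_id) columns -> domain object."""
    line_id, amount, currency_id = row
    return OrderLine(line_id, Money(Decimal(amount), CURRENCIES[currency_id]))


def order_line_to_row(line):
    """Reverse mapping: domain object -> (id, amount, currency_id) columns."""
    currency_ids = {code: cid for cid, code in CURRENCIES.items()}
    return (line.id, str(line.price.amount), currency_ids[line.price.currency])
```

The domain object never sees "Amount" or "CurrencyId"; only the two mapping functions do. A hook of this shape inside a mapper, next to its normal change tracking, is what the paragraph above asks for.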

On the other hand, the database can be more normalized than the domain model, which is the more commonly known situation. Again, ORMs like NHibernate and Entity Framework support simple denormalization of 1:1 table relations into a domain model, with some restrictions. A good example of a mapping that ORMs cannot handle is EAV/CR tables. EAV/CR is generally not a common database design today, but there are good reasons to use it in some special cases. Think of software for a sports club that holds descriptive information for each member, storing all playing skills of all members for any kind of sport. This seems almost impossible to model with a conventional relational database design. At this point I'd like to be able to map this very special form of data into my domain object model without taking my "Member" out of the mapper's universe entirely by defining it as a POCO (or POJO in the case of Java) that is not part of my mapper.
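The sports club case could look roughly like the following Python sketch, again with sqlite3 and without any real ORM. The schema (a "members" table plus a "member_skills" table with one row per sport/attribute/value), the Member class and the load_member function are all hypothetical; the point is only that a few lines of custom mapping turn the EAV rows into an ordinary domain object.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical domain object; skills maps sport -> {attribute: value}."""
    id: int
    name: str
    skills: dict = field(default_factory=dict)


def make_club_db():
    # Illustrative EAV layout: one row per (member, sport, attribute).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE member_skills (
            member_id INTEGER,
            sport TEXT,
            attribute TEXT,
            value TEXT
        );
        INSERT INTO members VALUES (1, 'Dana'), (2, 'Eli');
        INSERT INTO member_skills VALUES
            (1, 'tennis', 'backhand', 'two-handed'),
            (1, 'tennis', 'ranking', '4.5'),
            (1, 'golf', 'handicap', '12'),
            (2, 'golf', 'handicap', '20');
    """)
    return conn


def load_member(conn, member_id):
    """Custom mapping: pivot the EAV rows into one Member object."""
    row = conn.execute(
        "SELECT id, name FROM members WHERE id = ?", (member_id,)
    ).fetchone()
    if row is None:
        return None
    member = Member(*row)
    skill_rows = conn.execute(
        "SELECT sport, attribute, value FROM member_skills "
        "WHERE member_id = ? ORDER BY sport, attribute",
        (member_id,),
    )
    for sport, attribute, value in skill_rows:
        member.skills.setdefault(sport, {})[attribute] = value
    return member
```

Loading a member costs two queries regardless of how many skills are stored. What the sketch cannot give me is the rest of a mapper (change tracking, transactions), which is exactly the trade-off described above.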

Last but not least, many ORMs access way too much information, which causes huge, unnecessary I/O overhead. An order object and table might contain many pieces of information that are needed by very different parts of a system, like customer, shipping and billing information. ORMs either always load and write all of it, or they work with property/column-based lazy loading, which usually becomes the worst kind of ripple loading. Some core information, like the table identity and an "OrderNumber", is needed by almost every part of the system, but I don't need all the other information whenever I load an order from the database.
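One possible way to avoid both extremes is to load a small core object and fetch the remaining column groups explicitly, per use case. The following Python sketch assumes a hypothetical "orders" table and hypothetical OrderHeader and ShippingInfo classes; it is an illustration of the idea, not the behavior of any existing mapper.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderHeader:
    """Core information needed by almost every part of the system."""
    id: int
    order_number: str


@dataclass(frozen=True)
class ShippingInfo:
    """Column group that only the shipping part of the system needs."""
    order_id: int
    shipping_address: str
    requested_delivery_date: str


def make_orders_db():
    # Illustrative wide table; column names are assumptions for this example.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            order_number TEXT,
            customer_order_ref TEXT,
            requested_delivery_date TEXT,
            shipping_address TEXT,
            billing_address TEXT
        );
        INSERT INTO orders VALUES
            (1, 'ORD-1', 'REF-7', '2010-02-01', '1 Main St', '9 Bank Rd'),
            (2, 'ORD-2', 'REF-8', '2010-02-15', '2 Side St', '9 Bank Rd');
    """)
    return conn


def load_order_headers(conn):
    """Reads only the two core columns for all orders in one query."""
    rows = conn.execute("SELECT id, order_number FROM orders ORDER BY id")
    return [OrderHeader(*row) for row in rows]


def load_shipping_info(conn, order_ids):
    """Reads the shipping columns for a whole batch of orders in one query."""
    order_ids = list(order_ids)
    if not order_ids:
        return {}
    placeholders = ", ".join("?" for _ in order_ids)
    rows = conn.execute(
        "SELECT id, shipping_address, requested_delivery_date "
        f"FROM orders WHERE id IN ({placeholders})",
        order_ids,
    )
    return {row[0]: ShippingInfo(*row) for row in rows}
```

Fetching the shipping columns for a batch of orders stays at one query, so the extra column group does not turn into property-by-property ripple loading.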

My Wishes for the Future

From the consumer side, I hope developers come back to the mindset that an O/R-Mapper is a (weak) tool and nothing more. Whenever they create a new ORM query object, they should reconsider whether the query that will be generated is within the capabilities of the mapper.

From the publisher side, I'm not looking for a mapper that handles the mapping of value objects or EAV/CR tables. These were just examples of the boundaries of mappers; maybe there even is a mapper that supports one of those features. I'm looking for an O/R-Mapper that knows its limits and allows tailor-made mapping without forcing me to leave the mapper's universe and give up change tracking, transaction handling and other useful features.

A Current Spare Time Project

I have been thinking about O/R-Mappers, their strengths and their weaknesses, for several months. I tried several mappers, and after some short-lived euphoria I came back down to earth because of restrictions that could not be handled without ugly workarounds. About two months ago, I started to think about (and implement) my own version of an ORM. After more than 150 classes and the first tests it seems to fit my requirements, but who knows... Maybe it dies during the implementation phase. Maybe it works and I'll use it in one of my future projects. Maybe I'll publish it. If I publish it, maybe it becomes yet another stillborn O/R-Mapper on the internet, maybe there will be some guys and/or gals who try it and hate it, maybe there will be some guys and/or gals who try it and enjoy it.
