A few days ago, I was having some drinks in a bar with a friend (let’s call him Jack), discussing enterprise application development and the migration from DataSets in Visual Studio to Business Objects with NHibernate. I will try to reproduce the conversation here:
Jack: Oh man, NHibernate really sucks! It is very slow compared to DataSets. The things you were saying about Business Objects are all theoretical mumbo jumbo. DataSets rock.
Me: Why do you say that? What did you do?
Jack: Well, I found some time over the weekend and followed the DDD paradigm instead of using DataSets. I created a BO (a simple one, mapped to a single table in the DB) and a very simple data layer using NHibernate to retrieve a list of 1500 BOs and display them in a DataGrid. It took NHibernate 6 seconds to retrieve the data! Can you believe it? (…) I tried the same thing with a DataSet and it took only a few milliseconds (…) I searched the internet for a solution for 3 hours and found nothing. NHibernate is poorly documented too… Nothing beats DataSets.
There are a lot of things to discuss in Jack’s statement.
First of all, it took Jack just one hour of playing with NHibernate to claim that the technology sucks, based on an unexplained performance issue in a very small example. It took him another 3 hours to claim that the documentation for NHibernate is insufficient. And this whole experience was enough for him to judge... But since this issue has nothing to do with programming, we will leave it to the philosophy blog.
Another issue worth discussing is that Jack compares NHibernate with DataSets, equates the former with Business Objects, and then says that he followed the DDD programming paradigm. That is a lot of terms whose connection is not clear in his mind. How do they relate? This is about philosophy again, but this time it is the philosophy of designing programs, so I will discuss it here…
Enterprise applications share a general set of requirements. There are some data that need to be deleted, inserted, updated, selected and filtered in various ways. Those data are stored (persisted) in a database. There are several rules (depending on the application) that define how those data are entered, processed, validated and sometimes automatically generated. Finally there is the graphical user interface that provides the means of playing with those data.
The keyword here is “The Data”. Every kind of logic in the application (business, GUI, data access) needs “The Data”. “The Data” are retrieved from the database, processed through classes in the business logic and presented through events in the GUI. So how are “The Data” organized? And how are “The Data” persisted? It turns out that this has a lot to do with how we design the application in our minds in the first place. Let us think of “The Data” as a set of carefully designed classes, also known as “Business Objects”.
For a long time, the common ground for everyone was the database. The relational database, with the rules it implies (tables, views, primary and foreign keys), provided a starting point for organizing “The Data” and, as a consequence, our BOs.
To make this point clearer, imagine a simple scenario where we need to implement an application involving Clients, Products and Purchases: a client buys a number of products, which are written on an Invoice with a unique Invoice Number. Note that a lot of us, even before hearing anything about the application’s requirements, have already created a conceptual database diagram in our minds: a Clients table, a Products table, and a Purchases table that links the two.
So “The Data” are conceptually organized in three tables and the associations between them. My whole application will be using SELECT-like statements that operate on table rows and the relations between them. Therefore my Business Objects will be mainly sets of DataTable, DataRow and DataRelation classes mimicking the database. In other words, they will be DataSets. Yes, DataSets are Business Objects, since they are a way to organize “The Data” of the business.
But is this formulation of data convenient for my application (or Domain)? Note that up to now we have not said anything about the application’s requirements.
What if the main requirement for the application (i.e. the Domain) is to show the current Invoices or to operate on Invoices?
With the DataSet approach we would probably fill a Purchases DataTable with the DataRows that have a specific InvoiceNo and operate on it.
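To make the tabular approach concrete, here is a minimal sketch in Python (not the .NET DataSet API). The Purchases rows, their column names and the `rows_for_invoice` helper are all my own assumptions for illustration. The point is only that the invoice never exists as an object: it is whatever rows happen to share an invoice number.

```python
# Hypothetical rows of a Purchases table; the column names are assumptions.
purchases = [
    {"invoice_no": 1001, "client_id": 1, "product_id": 10, "quantity": 2},
    {"invoice_no": 1001, "client_id": 1, "product_id": 11, "quantity": 1},
    {"invoice_no": 1002, "client_id": 2, "product_id": 10, "quantity": 5},
]


def rows_for_invoice(rows, invoice_no):
    """Return the purchase rows that belong to one invoice number."""
    return [row for row in rows if row["invoice_no"] == invoice_no]


# "Operating on an invoice" means filtering rows first, then working on them.
invoice_rows = rows_for_invoice(purchases, 1001)
total_items = sum(row["quantity"] for row in invoice_rows)
print(len(invoice_rows), total_items)
```

Every piece of business logic that cares about invoices has to repeat this kind of filtering, because the table is the only structure it knows about.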
Wouldn’t it be more convenient if we had an “Invoice” as “The Data”? Do we need to change the database? The answer is no. The key point here is to decouple the application’s organization of “The Data” from the limitations the relational database imposes. We want to use Invoices in our business logic, therefore we will create an “Invoice” Business Object.
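A rough sketch of what such an Invoice Business Object could look like is shown here, in Python for brevity. The class and field names (`Client`, `Product`, `InvoiceLine`, `add_product`, `total`) and the choice of storing prices as integer cents are assumptions of mine, not a prescribed design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Client:
    client_id: int
    name: str


@dataclass(frozen=True)
class Product:
    product_id: int
    name: str
    price_cents: int  # assumption: prices kept as integer cents


@dataclass
class InvoiceLine:
    product: Product
    quantity: int


@dataclass
class Invoice:
    invoice_no: int
    client: Client
    lines: list = field(default_factory=list)

    def add_product(self, product, quantity=1):
        """Record that the client bought `quantity` units of `product`."""
        self.lines.append(InvoiceLine(product, quantity))

    def total(self):
        """Total value of the invoice, in cents."""
        return sum(line.product.price_cents * line.quantity for line in self.lines)


invoice = Invoice(invoice_no=1001, client=Client(1, "Acme"))
invoice.add_product(Product(10, "Widget", 250), quantity=2)
invoice.add_product(Product(11, "Gadget", 100))
print(invoice.total())
```

The business logic now talks about invoices, clients and products directly, and nothing in these classes knows how many tables sit behind them.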
There is no doubt that organizing our data like that has a lot of benefits. Actually, we have roughly followed the Domain Driven Design paradigm: we have implemented our Business Objects according to the entities of the actual domain and the application’s requirements. There was nothing evil about our DataSets and the table-like business objects. They just weren’t suited to the requirements of this particular application (the domain). In another application, or in another part of the domain, DataSets may be the most suitable choice (try accounting journal entries and you will see what I mean).
No matter what our BOs are (DataSets or something else), there is always the need to map them to the relational database. Note that the mapping is not a simple one-to-one connection of table fields to BO properties. Most of the time it requires more than that. In the previous example, creating a new Invoice BO and adding a Client and a Product to it is persisted by adding a new record to the Purchases table with the appropriate Client and Product keys.
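Written by hand, such a mapping might look roughly like the sketch below. It uses Python’s built-in `sqlite3` module with an in-memory database rather than NHibernate, and the simplified `Invoice` class, the Purchases columns and the `save_invoice` / `load_invoice` helpers are assumptions for illustration only.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Invoice:
    invoice_no: int
    client_id: int
    product_ids: list = field(default_factory=list)


def create_schema(conn):
    # Assumed, simplified Purchases table: one row per product on an invoice.
    conn.execute(
        "CREATE TABLE Purchases ("
        "invoice_no INTEGER, client_id INTEGER, product_id INTEGER)"
    )


def save_invoice(conn, invoice):
    """Persist one Invoice object as several Purchases rows."""
    conn.executemany(
        "INSERT INTO Purchases (invoice_no, client_id, product_id) "
        "VALUES (?, ?, ?)",
        [(invoice.invoice_no, invoice.client_id, pid)
         for pid in invoice.product_ids],
    )


def load_invoice(conn, invoice_no):
    """Rebuild an Invoice object from its Purchases rows, or return None."""
    rows = conn.execute(
        "SELECT client_id, product_id FROM Purchases "
        "WHERE invoice_no = ? ORDER BY product_id",
        (invoice_no,),
    ).fetchall()
    if not rows:
        return None
    return Invoice(invoice_no, rows[0][0], [pid for _, pid in rows])


conn = sqlite3.connect(":memory:")
create_schema(conn)
save_invoice(conn, Invoice(1001, client_id=1, product_ids=[10, 11]))
loaded = load_invoice(conn, 1001)
print(loaded)
```

One object fans out into several rows on the way in, and several rows collapse back into one object on the way out. This is the kind of plumbing that has to be written for every BO when there is no tool to generate it.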
This is where ORM tools such as NHibernate come in. Of course, along with this mapping functionality, those tools offer a lot more, such as session management, concurrency control etc. In short, ORM tools provide the means to implement the Data Access layer quickly. So why do people pose the question “DataSets vs Business Objects”? Or why did Jack say that he prefers DataSets over NHibernate? That is because Visual Studio provides an automated way of creating the DataSets for you if you have the database ready. It is as if Visual Studio is creating the ORM tool that will handle your DataSet-to-DB communication. And it happens in a few seconds.
ORM tools such as NHibernate favor building your application starting from the Business Objects that fit your domain. Then they let you define how your BOs will be persisted (define the DB schema) and ask you to provide a mapping between the two. They can then give you the goodies for manipulating the data. So for someone who already has the database ready, this seems like overkill. But it is not always such a big deal, since there are usually third-party tools for ORMs that can construct some BOs automatically from the tables of the database.
So here is the basis of the actual debate:
If the application is small scale and you don’t have the time, use DataSets and go have fun (unless your idea of fun is discussing programming).
If the application is not small scale, then see whether the DataSet tabular logic fits your Domain, and if it does, use it.
For the rest, go custom and use ORM tools (by the way, EDM is Microsoft’s suggestion, so if you are into Microsoft, go for it).
So we have actually questioned Jack’s personality and have shown that he used terms without really knowing what they meant. In the next post we will show that Jack was also wrong about the performance issues.