The history of data access technologies in .NET is not a pretty one. Microsoft told us to do DataSets, then SqlDataSource, then LINQ to SQL, then Entity Framework in one way, then Entity Framework in another way. In the meantime many people were using raw ADO.NET and wrote their SQL by hand and other people were using NHibernate. It seems that things have finally settled down. DataSets, DataSources and LINQ to SQL are pretty much dead and it seems NHibernate is about to join them. The people who love control over their SQL are still a minority but they are doing just fine and have added some tools like Dapper
to their belts. For the past four or five years most people have been doing (at least for new projects) what Microsoft calls Entity Framework Code First which is a name that in typical Microsoft fashion is very confusing and inaccurate. If you are in that camp you might find the knowledge I am about to share useful. It has come with a lot of battle scars.
First of all it should be noted that this list is for Entity Framework 6.2. I have not checked how much of this works with Entity Framework Core but I assume most of it will even if some of the APIs may change. If you happen to know which of the tricks in this list work in EF Core and which do not please leave a comment. I may or may not choose to update this article with EF Core code when it goes final but the general principles should still apply and I believe they apply to most ORMs.
I have ordered the list starting from the least expensive to implement and least dependent on opinion to the most controversial so the first items should be a no-brainer for any project while the end of the list may be more problematic and people may disagree depending on their personal experience. You will also notice that most points are a reflection of the idea that you should not trust the ORM to do the right thing. In my experience this is the correct way to deal with ORMs.
1. Dump your queries
Add the following lines in your DataContext constructor
this.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);
This will simply log every query Entity Framework executes to the Output window of Visual Studio when running in debug mode. The current way to do this in EF Core is more elaborate and involves injecting a logger via the Dependency Injection mechanisms but it is simple enough to make it worth it even if you need to look at the actual queries just once. I cannot think of any reason not to use this. In fact I advise beginners with EF to look at each and every query EF generates and non-beginners should look at more complex queries too to avoid surprises. Just last week I was looking at a performance problem that was a result of a bad EF query. I had seen the query but decided to leave it there because my SQL Server showed the query plans of the generated and hand-written query to be equivalent but it turns out that with the amount of data in production the query plans changed and the EF one was inefficient. Had I not looked at that query in advance I would not be able to fix the problem as fast as I did. The output window log can also show you when you are doing unexpected amount of queries or when a LINQ query is executed on the client rather than in the database.
2. Disable lazy loading
I used to be a big fan of lazy loading but not anymore. Simply disable lazy loading especially if you have less experienced programmers on the team. Add this line of code in your DataContext constructor
this.Configuration.LazyLoadingEnabled = false;
Load your related entities by adding calls to Include where needed. By doing this you will avoid unintentional problems with N+1 queries that happen behind your back. You may not even know you are doing this in your project especially if you are not following advice 1. Remember to use the statically typed version of Include that accepts a lambda rather than the one that accepts a string.
3. Use raw SQL when you must
Sometimes the query you want is too complex and EF cannot generate a reasonable query. Sometimes you want to use a feature of SQL Server that is not exposed in EF. Sometimes you simply wasted too much time trying to make EF do what you want. Well in these cases go write a stored procedure or raw SQL query. It is not that big of a problem. EF has good support for producing entities from a raw query result and a stored procedure can be mapped to a method. These features are there for a reason. Use them! Last time I used a stored procedure with EF was to do a Full Text Search in SQL Server. I hear people are writing magic query interceptors and what not just to avoid writing a couple of lines of SQL which is absurd because it is much harder than simply going one level down.
4. Do not use repositories and unit of work
Various courses, articles and even the official ASP.NET website
recommend implementing a repository and unit of work. There are multiple articles out there explaining why this is an anti-pattern with Entity Framework and this could be a separate post but I will give a short version.
Repositories are supposed to abstract your data source. They serve as a data access layer. This means that you should be able to change your data access strategy from raw sql, to an ORM to a REST web service without changes to the client code. If the repository is implemented correctly it does indeed serve this purpose. The problem is that most tutorials, courses and articles recommend practices that leak the underlying implementation in the form of IQueryable. Yes, IQueryable is an interface but implementing it is practically impossible without the help of an ORM. At best you can switch from one ORM to another but for this you don't need a repository - a simple wrapper would be enough. A proper repository would not return anything but IEnumerable/List and will not accept IQueryable related argument (this is where the example on the ASP.NET website fails). So why do I advise not using a repository instead of implementing it correctly? The answer is simply – YAGNI
. Implementing a proper repository is a lot of work and it makes your Service/Business Layer harder to implement due to the lack of convenient IQueryable. Unless you have specific requirements that point to changing the data source at some point you should not implement a repository. This specific abstraction costs a lot and you want to evaluate the risk. What if you need the abstraction later? Well, you man up and refactor. People have lived through serious refactoring before.
And what about the unit of work? Once you kicked out the repository you have a premade unit of work in the form of the DataContext (and a DbSet happens to be a repository). You do not need another. Also at least in web applications this unit of work has a logical lifetime - the web request. No need to manage the lifetime of the unit of work everywhere. Just create a data context when the request starts and dispose it at the end. All dependency injection frameworks do this out of the box. They call it Request Scope.
Get rid of your repositories and units of work and watch your code base become dramatically simpler with practically no downsides! You will thank me later.
5. Do not let Entity Framework create your database
In my experience EF sucks at generating a database. You miss a navigation property here and it does not generate a foreign key. You do not add a column name there and it generates it as User1. Yes you can make EF generate whatever you want but you have to check that it does. I find it less painful to simply create my database and then write a model that matches the database. I do not generate my model from database either because I have found that it suffers from the same issues but in reverse. In addition if you create your database and model manually you have the option to drop properties you do not need. For example if you have a Post entity that has a list of Comments the Comment entity may not need a navigation property Post so you can omit it and only include a PostID property. After all you do not query Comments by Post title so why add a Post property? Omitting redundant navigation properties makes the entities simpler and clearer. Of course if you happen to need the property at some point you just add it without changes to existing code or database.
The biggest downside of this approach is that you cannot use migrations. However you can add migrations later in the development lifecycle. Usually initial development involves a great number of tables and then the database is modified a couple of tables at a time. In this case you can create the initial database by hand, then enable migrations and make sure EF generates precisely the intended tables and keys from this point on via the migrations system. Since a few tables will be modified it is not so painful to keep track.
Here comes the obvious question. If I do not trust my ORM (and I do not) is it still worth it to use an ORM? Is it not faster to write plain SQL from the start instead of writing LINQ and check the SQL every time? In my experience it is. The type safety, the reduction in boilerplate code, the ability to avoid minor errors (like misspelling column names) makes me more productive with an ORM. This may not be true for everybody and I am not about to argue that you should use an ORM if you do not feel these benefits but if you do feel like ORM makes you more productive maybe these tricks can increase your productivity even more. And remember - an ORM is not an excuse to not know (or forget) SQL.