Freekshow

May 8, 2008

Slides for the Océ presentation

Filed under: .Net,Events,Linq,Speaking — Freek Leemhuis @ 12:55 pm

For those who attended my LINQ talk yesterday at the Océ headquarters, thanks for coming. Find below the slides used for this presentation. Included are some resources (links, book recommendations) that I did not get around to mention. I guess 2 hours was not enough…

ado-vnext LINQ presentatie Océ 

Advertisements

March 26, 2008

LINQ to SQL problems

Filed under: .Net,Linq — Freek Leemhuis @ 4:49 pm

Anko has written about what he perceives as problems with LINQ to SQL in this post. Bob has previously spoken in similar terms. I’m putting in my two cents below.

#1 LINQ to SQL is designed for a two tier scenario. LINQ to SQL does not fits natural with a multi layer architecture.

This statement does not really specify what part of the design would prevent fitting LINQ to SQL in with a multi layer architecture. It’s true that the party line from the Microsoft camp seems to be that a 2-layer design would fit best with LINQ to SQL: you roll the data access layer and the business logic layer into one, with LINQ to SQL functioning as a 1:1 abstraction layer between your model and the database. I would use this model only for small-scale, data-driven applications. Problem with these applications usually is that they tend to evolve, bits get added on left right and center, and soon a fairly large codebase sprouts without a well thought out design.

When you need an enterprise-level application, you usually do not want to rely on the 1:1 mapping that LINQ to SQL offers. The database is will be normalized (or, in fact, denormalized) in a way that will not be in line with your domain model. (I think this is what Paul is referring to). In these cases, using an OR Mapper that would allow you to map entities to tables using the data mapper pattern is the more obvious choice.

Using LINQ to SQL in this kind of scenario, you could write your business object layer to encapsulate your data classes. Using LINQ, this kind of transformation is very easy to do.

For example, suppose that you have a data table for customer and a separate one for addresses. The address includes a country code, which is a foreign key to yet another table Countries. Do you want to worry about all this normalization when building the UI? Me thinks not..

You could create a data transfer object using LINQ like so

public class CustomerDTO

{

int CustomerID { get; set; }

string Name { get; set; }

string Street { get; set; }

string ZipCode { get; set; }

string City { get; set; }

string Country { get; set; }

} 

 

You can then use LINQ to transform the data like so:

CustomerDTO result = from c in context.Customers

where c.CustomerId == customerId

select new CustomerDTO

{

CustomerID = c.CustomerId,

Name = c.Name,

Street = c.Address.Street,

Zipcode = c.Address.ZipCode,

City = c.Address.City,

Country = c.Address.Country.Name

};

 

Note the use of Object Initializing and Automatic Implemented Properties that make it so much easier to write this sort of stuff.

When using this approach you keep normalization out of the entities, and you don’t lose any functionality since the data classes are POCO. You can’t obviously use the Attach and replay functionality, but to be honest it’s not a great loss.

Used this way I don’t see why LINQ to SQL would not fit in an enterprise scenario. I don’t think it’s the best solution. Other OR Mappers offer more functionality, and personally I think you’re still writing too much code that the choice of the right framework could handle for you.

#2 It is not possible to tune the SQL Server queries generated by LINQ to SQL.

This in itself is true, in my experience 90% of the queries you write off the cuff are translated to adequeate  SQL, and for the remaining queries you can decide to use stored procedures if the problem really is in the SQL that’s being generated.

 #3 LINQ to SQL generates dynamic SQL. The technology strongly motivates to use dynamic SQL instead of stored procedures.

Why are people so fond of stored procedures? In most stored procedures that I review there’s way too much business logic, and no kids, that’s not the right place for it. The only reasons for introducing stored procedures are IMHO if performance is not otherwise achievable or if the dba demands it (gotta love ’em, the bastards).  The problem I do see is that if you do want to use a stored procedure, then you can’t use temp tables or dynamic SQL in the stored procedure, because LINQ to SQL does not allow you to map these into it’s model.

#4 LINQ to SQL encourages to a “SELECT *” query

Ehm, I don’t think LINQ to SQL handles in any other way then other OR-Mappers. So you could rephrase the statement as ‘using an OR-Mapper encourages “Select *” queries’. I don’t see it. With the introduction of anonymous types it’s now real easy to retrieve exactly the columns you’re after without having to write too much code.

#5 The cache mismatch problem. LINQ to SQL materialize your objects “once” in the cache, this can give unexpected query results.

 This is a feature, not a problem. The LINQ to SQL datacontext uses an identity map to keep track of instances of your data classes. This means that on any datacontext your data class will get instanciated only once. Where would this lead to unexpected results? Let’s take the following example

Country countryBefore = ctx.Countries.First(c => c.Iso2 == “NL”);

ctx.ExecuteCommand(“update countries set population = population + 1”);

Country countryAfter = ctx.Countries.First(c => c.Iso2 == “NL”);

Assert.IsTrue( countryAfter.Population == countryBefore.Population – 1);

 

LINQ to SQL does not support set-based operations, so for updating a whole set of records I’ve used the get-out-of-jail card here by using the ExecuteCommand method of the DataContext. (I’m hopeful that we will see some improvements in this area…).

The update statement successfully updates the population of all countries, but the Assert fails. This is because the country, as specified by the primary key ‘NL’ is already instantiated, and so countryAfter does not result in another query but gets a reference to countryBefore. This is an example where it’s maybe not immediately obvious when you’re new to the area, but the identity mapping pattern is well established and in use in similar frameworks.

So in all, if you’re a shop where it’s hard to introduce frameworks that are not labeled ‘made in Redmond’ then I’d wait for the Entity Framework: it will allow for flexible mapping of your tables onto entities. If you want to start with what’s out today, make sure you’re aware of the limited support LINQ to SQL has for eager loading  See my earlier post for more details on the differences between the Entity Framework and LINQ to SQL.

March 24, 2008

Change tracking in LINQ to SQL

Filed under: .Net,Linq,Microsoft — Freek Leemhuis @ 9:49 pm

I was catching up with the ASP.NET Dynamic Data controls that will ship with the upcoming ASP.Net extension pack, and it struck me that this is a framework that, much like LINQ to SQL, is excellent for writing demo applications. Without writing too much code, you can very quickly create something that looks and even acts like a ‘real’ application.

Here’s the thing: I don’t get paid to write demos. Any real merit I get from any framework is if it helps me build ENTERPRISE applications.

Don’t get me wrong, I absolutely love LINQ, and some of the new language enhancements that have come with it. These days when I’m working in a team using Visual Studio 2005  I hate not being able to use LINQ, object initializers, automatic implemented properties and the like.

Looking at LINQ to SQL though, it seems to have been designed without focus on how you would actually use this in a multi-layer environment. Sure, you can use it to replace your Data Access Layer, but the passing around of data classes using Attach leaves a lot of plumming to be done manually and the whole ‘replay your changes, then submit’ feels awkward at best and looks to have been put in almost as an afterthought.

I understand that the design goal of Persistence Ignorance prevent any persistence plumming in the objects themselves. I am probably a heritic in the eyes of OO-purist, but I care more about productivity then about my objects qualifiying as POCO. It would be nice if a framework like LINQ to SQL would offer you a choice to go the POCO route, or actually have some persistence coding that would take care of change tracking over data tiers. Datasets, spawn of the devil they may be, have a convenient DiffGram model that allows one to pass changes to data over tiers. If a persistence framework can solve this for you, it would make life much easier.  LLBL Gen has it. Can we not have something in LINQ to SQL (or the Entity Framework!) that would allow one to choose how to handle change tracking?

Dinesh has written about the design goals of LINQ to SQL. What’s interesting is that he says in this post:

we don’t yet have a simple roundtripping and change tracking on the client.

(Emphasis mine) So it would seem they have given it some consideration, but must have decided to keep it clean and simple, and left it out. Will we see some enhancements in this area for LINQ to SQL? Matt Warren has posted a reply to this forum post:

There is also a technology in the works that automatically serializes LINQ to SQL objects to other tiers, change tracks and data binds the objects for you on that tier and serializes the objects back, solving the cyclic reference serialization problem and the indescribable schema problem for objects with change sets

Further on Matt describes how the Attach method was enhanced so that you can include the old version and new version of an entity object. The ‘technology in the works’ however is more like the tidbits that have come out regarding a mini-connectionless DataContext . Matt has written here that it was not implemented …for lack of time.

I have found no information on when (or if) this will be available.

March 12, 2008

Deferred loading with LINQ to SQL

Filed under: .Net,Linq — Freek Leemhuis @ 10:06 pm

The term deferred loading, sometimes also described as lazy loading. is used for the deferred loading of child objects. Suppose you are loading a customer, and the customer has a navigational property Orders. LINQ to SQL applies deferred loading by default, which means that the orders are not retrieved immediately when the customer is retrieved, but rather when you actually request the orders by enumerating over them (or bind them to a grid). LINQ to SQL will then automatically fire of a query to load the orders. 

However, in most cases this is not what you want because you of the excessive round-tripping this can cause to and from the database server. In most cases, it would be far more efficient to execute a single query and eager-load the parent object and it’s child objects with the results or that one query. As in all OR/M frameworks, LINQ to SQL offers you the possibility to specify how to load the child objects. This is done by specifying the dataload options.

First of all, you can set the default lazy loading mechanism off by specifying

ctx.DeferredLoadingEnabled = false; 

 where ctx is the instance of your DataContext class. At this point, no loading of child objects takes place at all. Next, you can specify which child objects you want to eager load 
 

 

DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Customer>(c => c.Orders);
ctx.LoadOptions = dlo; 

This means that the customers and orders will be loaded by a single query, where the customers and orders table are queried using an inner join.

Note that this works well with the first level of child objects, but people have pointed out that LINQ to SQL will not join a third level of child objects in the same single query. Serious limitation, don’t you think?

Ofcourse, you can also execute the loading per object. Suppose you want to pass the object through a service, then it would make sense to make sure the child objects are loaded. If you have not set eager loading on the DataLoadOptions of the context, you can use the Customer.Orders.Load() method.

On a final note, you can also specify for specific attributes that you want to defer loading until such a time that you really use the value. Suppose for example you have a blob column for the customer with a picture of that customer. You will want to avoid retrieving the field unless it is really used in your UI. In the LINQ to SQL designer, specify ‘False’ for the ‘Delay Loaded’ property of that field. LINQ to SQL will then automatically fire off a separate query to retrieve the content at the last possible moment.

Pointers for starting with LINQ to SQL

Filed under: .Net,Linq — Freek Leemhuis @ 10:01 pm

Here’s a few pointers on how to start learning how to use LINQ and LINQ to SQL:

  • Read the relevant entries on Scott Guthries blog, right up to here.
  •  There are some good introductory videos on the asp.net website.
  • Download Linqpad. This is a great tool that enables you to start playing with the sample queries to familiarize yourself with the different keywords en operators.
  • If you have a preference for VB over C#, check out Beth Massi’s blog. 
  • If you have mastered the basics, make sure you understand how deferred execution works. Charlie Calvert has blogged about it here. 
  • Find out about deferred loading, which I’ll talk about next. 

March 10, 2008

LINQ to SQL vs the Entity Framework

Filed under: Entity Framework,Linq — Freek Leemhuis @ 10:16 pm

I have done a number of technical sessions on Linq, Linq to SQL and the ADO.Net entity framework recently. I will try to put together a number of posts here for those who want to get started on these technologies. One of the questions I get asked regularly concerns the difference between the different frameworks. I will try to highlight some of these in this post.

Since we had the lauch of Visual Studio 2008 last week, Linq to SQL is now widely available. Linq to SQL is a light-weight OR-Mapper. Is this the best framework for persisting your data? No, by no means. If you are currently using a feature-rich OR-Mapper like NHibernate or LLBLGen, you wouldn’t want to switch to Linq to SQL. However, if you’re new to the practice of OR mapping, Linq to SQl can be a good choice to get you started. 
Included in the ASP.Net 3.5 extension pack that’s going to ship sometime this year is the ADO.Net Entity Framework. This has all of the capabilities of Linq to SQL, and much more. To give you an idea, I’ve outlined the differences between these frameworks below:

Linq to SQL

Entity Framework
RAD scenarios Enterprise development
Data centric (1 table maps to 1 data class) Conceptual modeling
Inheritance modeling

  • Single table inheritance
Inheritance modeling:

  • Single table inheritance
  • Table per subclass
  • Table per concrete class
One API(Linq) Three API’s (linq, object services, entity client)
MS SQL Server only Provider model(SQL Server, Oracle, MySQL etc)
Out now. Somewhere in 2008

This lists the main differences, but there is much more to tell. For now, let’s focus on the different ways these frameworks model the data classes.  You can generate a model in Visual Studio from an existing database by dragging tables from server explorer onto the designer canvas. Using Linq to SQL on one of my demo databases, the following data classes are generated:

datamodel_linqts.jpg

Notice that we have a many-to-many relationship in the database: A person can be linked to one or more companies,  and a company can have a link with one or more persons.

Now if we import the same database using the Entity Framework we get a slightly different result:

datamodel_ef.jpg

First thing you will notice: the CompanyPerson table has no corresponding data class. This is what we want! This table is just a construct that is used in relational database modelling to enable the Many-to-Many relationship, but it is not an entity in our OO model, and should therefore be abstracted away by the framework of our choice. In the EF, you can see how the mapping is done through the Mapping Details:

fkmapping.jpg

We have a navigational property Companies for Person, and likewise a Persons property on Companies. Using Linq to SQL we would have to use the CompanyPersons data class (yuk) or we would need to write custom navigational properties to get the same result. This is just the sort of stuff we want a framework to do for us! It would seem that the ado team that is producing the EF has put considerable more effort into the translation of relational to OO concepts. Looking at the two different models, you might have notice another key difference(pardon the pun): the noticable absence of foreign keys in the dataclasses of the EF. We have a navigational property Country on the Person class, and from Country we can navigate to persons. Why would we want an attribute CountryID in our Person data class?

Next time I will dig a bit deeper into Linq to SQL.

Blog at WordPress.com.