Retrieving values from an IDataReader

September 4, 2008 15:57 by Andre Loker

Despite the undoubted advantages of ORM you probably need to fetch data directly from an IDataReader from time to time. Getting values out of the data reader is easy at first glance. However, there are some issues that you should be aware of.

Basic usage

Assume we have a simple table that holds comments for a web log or something similar:

image

(The UI is German, but I guess you know what the columns mean. If not: it's "Column name", "Data type" and "Allow Nulls")

Let's query all data in the table:

    using (var con = new SqlConnection(connectionString)) {
      con.Open();
      var cmd = new SqlCommand("SELECT * FROM Comment", con);
      using(var rdr = cmd.ExecuteReader()) {
        while(rdr.Read()) {
          var id = rdr.GetInt32(0);
          var post = rdr.GetInt32(1);
          var position = rdr.GetInt32(2);
          var text = rdr.GetString(3);
          Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
        }
      }
    }

Nothing fancy here, we simply query all rows and columns and print their values.

The index problem

Look at the way we retrieve data from the reader:

    var id = rdr.GetInt32(0);
    var post = rdr.GetInt32(1);
    var position = rdr.GetInt32(2);
    var text = rdr.GetString(3);

Using indices this way is probably not the best solution, especially because we use SELECT * to retrieve all columns. If the order of the columns changes in the database, your application either crashes or uses the wrong columns (the latter is arguably the worse situation). Errors of this kind won't be visible until runtime, which makes the code quite hard to maintain.

If we changed the query to something like

    SELECT ID, Post, Position, Text FROM Comment

we could at least prevent the index related issues because we would fetch the columns in an order that is defined by our application, not by the database.

As an alternative we could use IDataReader.GetOrdinal to determine the index of the columns at runtime:

    var cmd = new SqlCommand("SELECT * FROM Comment", con);

    using (var rdr = cmd.ExecuteReader()) {

      // determine the indices of the columns and cache them for efficiency
      var idIndex = rdr.GetOrdinal("ID");
      var postIndex = rdr.GetOrdinal("Post");
      var positionIndex = rdr.GetOrdinal("Position");
      var textIndex = rdr.GetOrdinal("Text");

      while (rdr.Read()) {
        var id = rdr.GetInt32(idIndex);
        var post = rdr.GetInt32(postIndex);
        var position = rdr.GetInt32(positionIndex);
        var text = rdr.GetString(textIndex);
        Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
      }
    }

Here I again used SELECT *, but it would work exactly the same way with explicit column selection.

There are more ways to achieve the same effect. You can use the indexer of the data reader to fetch the columns by name:

    using (var rdr = cmd.ExecuteReader()) {

      while (rdr.Read()) {
        var id = rdr["ID"];
        var post = rdr["Post"];
        var position = rdr["Position"];
        var text = rdr["Text"];
        Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
      }
    }

Be aware, though, that the indexer, unlike the explicit GetXYZ() methods, only returns objects. If you need the values to have the correct type, you have to cast:

    while (rdr.Read()) {
      int id = (int)rdr["ID"];
      int post = (int)rdr["Post"];
      int position = (int)rdr["Position"];
      string text = (string)rdr["Text"];
      Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
    }

(I didn't use variable type inference (var) in this example to stress the fact that we are using ints and strings instead of objects)

"OK", you say, "with this knowledge I can now master data readers easily". Maybe not yet. There are two more issues.

The data type problem

Assume that you decide that your app will never have more than a few hundred comments per post. To save space in the database you change the data type of the Position column (which describes the order of comments for a specific post) from int to smallint. If you run any of the code examples above you'll probably be surprised that they'll all (except the one that uses the string indexer without casting) fail with an InvalidCastException:

Unhandled Exception: System.InvalidCastException: Specified cast is not valid.

Why is that? In the case of GetInt32 let's look at the remarks in the documentation:

No conversions are performed; therefore, the data retrieved must already be a 32-bit signed integer.

If the data coming from the database is a 16-bit integer (aka smallint) this call will therefore fail. To make the code work again you'd need to use GetInt16 instead.

And what about the cast in (int)rdr["Position"]? After all, a 16-bit signed integer (short) should be castable to an int. While this is true, keep in mind that the indexer of the data reader returns a boxed version of the short value as an object. Unboxing a value must always be done using the type of the boxed value (or one of its interfaces). The conversion from short to int can only take place after the value has been unboxed. That is, use either this (for implicit conversion to int):

    int position = (short)rdr["Position"];

or this (for explicit conversion to int):

    var position = (int)(short)rdr["Position"];

The data type problem can be really annoying because data type mismatches just like the index problem will be visible at runtime only. It's tedious to hunt down these bugs and it makes modification to the database excessively expensive.

You can avoid this problem to great extent if you don't force a "hard" conversion of the column value with GetXYZ and casts. Instead use the "soft" conversion methods provided by the Convert class. Here's the example from above again, but this time it's more robust against data type changes.

    var cmd = new SqlCommand("SELECT * FROM Comment", con);

    using (var rdr = cmd.ExecuteReader()) {
      while (rdr.Read()) {
        var id = Convert.ToInt32(rdr["ID"]);
        var post = Convert.ToInt32(rdr["Post"]);
        var position = Convert.ToInt32(rdr["Position"]);
        var text = Convert.ToString(rdr["Text"]);
        Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
      }
    }

(Again, feel free to improve the code by replacing SELECT * with an explicit column list and/or use the index based indexer of the data reader instead)
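
The difference between a "hard" cast and a "soft" conversion can be tried out without a database, because it is purely a matter of boxing. The following is a minimal sketch; the class and method names (UnboxingDemo, CanUnboxAsInt, ViaShort, ViaConvert) are made up for this illustration:

```csharp
using System;

public static class UnboxingDemo {
  // Returns true if the boxed value can be unboxed directly as an int.
  public static bool CanUnboxAsInt(object boxed) {
    try {
      int ignored = (int)boxed;  // unboxing requires the exact boxed type
      return true;
    } catch (InvalidCastException) {
      return false;
    }
  }

  // Unbox as short first; the widening to int happens afterwards.
  public static int ViaShort(object boxed) {
    return (short)boxed;
  }

  // Convert inspects the runtime type and converts short, int, long etc.
  public static int ViaConvert(object boxed) {
    return Convert.ToInt32(boxed);
  }
}
```

With object boxed = (short)42, CanUnboxAsInt(boxed) is false, while both ViaShort(boxed) and ViaConvert(boxed) yield 42. Only ViaConvert keeps working if the boxed value later becomes an int or a long.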

The DBNull problem

There's one final issue I want to write about. In the examples above all columns are explicitly NOT NULL. What if the columns contain NULL?

Let's assume for now that the columns Post, Position and Text could be NULL (ignoring the fact that it wouldn't make much sense in that context). What would our code look like? Maybe like this:

    int id = Convert.ToInt32(rdr["ID"]);
    int? post =     rdr["Post"]     == null ? (int?)null : Convert.ToInt32(rdr["Post"]);
    int? position = rdr["Position"] == null ? (int?)null : Convert.ToInt32(rdr["Position"]);
    string text =   rdr["Text"]     == null ? null       : Convert.ToString(rdr["Text"]);

(I aligned the code a bit for readability in this example)

You might expect rdr["Position"] to return null if the column contains NULL. Run the example and you'll see that this is not the case. This is important to know: columns that contain NULL in the database are returned as an instance of DBNull by the data reader! Furthermore, DBNull can't be converted to any other data type (int, short etc.) with one exception: if Convert.ToString is called on a DBNull object, an empty string is returned (because DBNull.ToString() returns an empty string).
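
This behaviour of DBNull can be checked in isolation, without a data reader. A small sketch follows; the helper class DbNullDemo and its method names are invented for this illustration:

```csharp
using System;

public static class DbNullDemo {
  // A DBNull instance is a real object, so this is false for DBNull.Value.
  public static bool IsNullReference(object value) {
    return value == null;
  }

  // Convert.ToString ends up in DBNull.ToString, which yields "".
  public static string AsString(object value) {
    return Convert.ToString(value);
  }

  // Convert.ToInt32 throws InvalidCastException for DBNull.
  public static bool CanConvertToInt32(object value) {
    try {
      Convert.ToInt32(value);
      return true;
    } catch (InvalidCastException) {
      return false;
    }
  }
}
```

Passing DBNull.Value to these helpers gives false, an empty string and false respectively.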

To check whether a column in the database contains NULL it's therefore not valid to check whether a value returned by the data reader equals null (it won't ever). Instead, check whether the returned value is a DBNull. There are basically two ways for this. Either use IsDBNull:

    if (rdr.IsDBNull(1 /* column index */)) {
      // value is NULL
    }

or check the type of the value directly:

    if (rdr["TheColumn"] is DBNull) {
      // value is NULL
    }

The example from above should therefore look something like:

    int id = Convert.ToInt32(rdr["ID"]);
    int? post     = rdr["Post"]     is DBNull ? (int?)null : Convert.ToInt32(rdr["Post"]);
    int? position = rdr["Position"] is DBNull ? (int?)null : Convert.ToInt32(rdr["Position"]);
    string text   = rdr["Text"]     is DBNull ?       null : Convert.ToString(rdr["Text"]);

(As always: using the string based indexer is just one option to retrieve values, the same is applicable in case you use indices)

While the code above is not the most efficient piece of C# ever written it covers many of the problems I mentioned in this article:

  • the code does not depend on the order of the columns in the database.
  • the code can handle a fair amount of possible data type changes made to the table
  • the code can handle NULL values

For convenience, you might want to write some helper methods that simplify the task of retrieving values from the data reader.

    public static int? GetInt32(IDataRecord dr, string columnName) {
      var value = dr[columnName];
      return value is DBNull ? (int?) null : Convert.ToInt32(value);
    }

    public static int? GetInt32(IDataRecord dr, int columnIndex) {
      return dr.IsDBNull(columnIndex) ? (int?) null : Convert.ToInt32(dr[columnIndex]);
    }

    public static string GetString(IDataRecord dr, string columnName) {
      var value = dr[columnName];
      return value is DBNull ? null : Convert.ToString(value);
    }

    public static string GetString(IDataRecord dr, int columnIndex) {
      return dr.IsDBNull(columnIndex) ? null : Convert.ToString(dr[columnIndex]);
    }

You can then use those methods like this:

    int id = Convert.ToInt32(rdr["ID"]);
    int? post = GetInt32(rdr, "Post");
    int? position = GetInt32(rdr, "Position");
    string text = GetString(rdr, "Text");

Of course, if you're using C# 3 you could make the methods above extensions to IDataRecord. In fact, I've written those extension methods for you (download ISC licensed source here). It basically does two things:

  • Provide equivalents to all those GetXYZ methods in IDataRecord that accept a column name instead of an index. Those methods are shortcuts for casting the values of reader["ColumnName"] to the proper type, but they suffer from the same data type and NULL value problem as their index based counterparts. Use them only for non null columns of which you know the data type exactly.
  • Provide "safe" versions of those methods that don't suffer from the data type problem and the NULL problem. I called those methods GetSafeXYZ (where XYZ is the data type of course). You won't find "safe" methods that accept a column index, but with the knowledge you - hopefully - gained from this article you should be able to write them yourself.

Here's the example from above using the "safe" methods:

    var cmd = new SqlCommand("SELECT * FROM Comment", con);

    using (var rdr = cmd.ExecuteReader()) {
      while (rdr.Read()) {
        int id = rdr.GetInt32("ID");
        int? post = rdr.GetSafeInt32("Post");
        int? position = rdr.GetSafeInt32("Position");
        string text = rdr.GetSafeString("Text");

        Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
      }
    }
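
To give an idea of what such "safe" extension methods could look like, here is a minimal sketch. It is not the code from the download; the class name SafeDataRecordExtensions is made up, and the method bodies simply combine the DBNull check and the Convert call shown earlier:

```csharp
using System;
using System.Data;

// Illustrative sketch only; the actual DataRecordExtensions.cs may differ.
public static class SafeDataRecordExtensions {
  // Returns null for database NULL, otherwise converts "softly" to int.
  public static int? GetSafeInt32(this IDataRecord record, string columnName) {
    object value = record[columnName];
    return value is DBNull ? (int?)null : Convert.ToInt32(value);
  }

  // Returns null for database NULL, otherwise the string representation.
  public static string GetSafeString(this IDataRecord record, string columnName) {
    object value = record[columnName];
    return value is DBNull ? null : Convert.ToString(value);
  }
}
```

Because the methods extend IDataRecord, they can be tried out against any reader, for example one obtained from DataTable.CreateDataReader(), without touching a real database.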

Attachments:

DataRecordExtensions.cs (20.37 kb)


How deep is your clone?

July 30, 2008 12:35 by Andre Loker

Someone asked in a forum:

Why is there no ICloneable<T> only ICloneable?

Good question. It wouldn't have taken too much effort to introduce a generic version when .NET 2.0 came out. But I think MS had a good reason not to introduce ICloneable<T>. And that's because they realized that ICloneable sucks in the first place! Why is that? Because it is effectively undefined what cloning does.

Let's have a look at the MSDN library. It has this to say about ICloneable:

Supports cloning, which creates a new instance of a class with the same value as an existing instance.

And this about ICloneable.Clone:

Creates a new object that is a copy of the current instance.

...

Clone can be implemented either as a deep copy or a shallow copy. In a deep copy, all objects are duplicated; whereas, in a shallow copy, only the top-level objects are duplicated and the lower levels contain references.

All right, doesn't sound too bad, does it? There are two interesting points, though:

Point 1: Clone returns a new object. Really? Not necessarily. System.String implements ICloneable.Clone like this:

    public object Clone(){
      return this;
    }

Not necessarily problematic, as strings are immutable, but still explicitly against the documentation of ICloneable.Clone.
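
This is easy to verify with a reference comparison. The helper below is only a small sketch; StringCloneDemo is a made-up name:

```csharp
using System;

public static class StringCloneDemo {
  // True if Clone() hands back the very same instance instead of a copy.
  public static bool CloneIsSameInstance(string s) {
    return object.ReferenceEquals(s, s.Clone());
  }
}
```

For any string, CloneIsSameInstance returns true: the "clone" is the original object.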

Point 2: shallow vs. deep copy. This is hell, trust me. Any implementor of ICloneable is free to choose "how deeply" it copies itself. This can give a multitude of different meanings to Clone().

Let us have a look at an example of shallow copy. To implement Clone as a shallow copy method create a new instance of the class and set all instance variables to the value of the original class. This is in fact what Object.MemberwiseClone() does, so let's just use that:

    class Place {
      public string Name { get; set; }
      public string Postcode { get; set; }
    }

    class Address {
      public string Street { get; set; }
      public string HouseNumber { get; set; } // string to support '23b'
      public Place Place { get; set; }
    }

    class Order : ICloneable {
      public Address ShippingAddress { get; set; }

      public object Clone() {
        // shallow copy
        return MemberwiseClone();
      }
    }

It's easy to do. MemberwiseClone() creates a new instance of Order. The returned instance uses the same Address object as the original Order. That's fine until someone does this:

    Order order = LoadOrder(); // LoadOrder is a placeholder for fetching the order from the database
    Order similarOrder = (Order) order.Clone();
    similarOrder.ShippingAddress.Street = "Somewhere";
    similarOrder.ShippingAddress.HouseNumber = "12c";

By changing the Address instance of similarOrder (which is the same as order.ShippingAddress) we changed the shipping address of the original order. Whoops. Might be better to do a deep copy! Here's the modified code that does a deep copy:

    class Place : ICloneable {
      public string Name { get; set; }
      public string Postcode { get; set; }

      public object Clone() {
        // shallow copy is enough here
        return MemberwiseClone();
      }
    }

    class Address : ICloneable {
      public string Street { get; set; }
      public string HouseNumber { get; set; } // string to support '23b'
      public Place Place { get; set; }

      public object Clone() {
        return new Address() {
          Street = Street,
          HouseNumber = HouseNumber,
          Place = (Place) Place.Clone()
        };
      }
    }

    class Order : ICloneable {
      public Address ShippingAddress { get; set; }

      public object Clone() {
        // deep copy
        return new Order {
          ShippingAddress = (Address)ShippingAddress.Clone()
        };
      }
    }

Now we are on the safe side. We can mess with the address of a cloned order any way we like without affecting the original order. But wait... all of a sudden we realize that the postcode of the address was wrong, so we fix that:

    Order order = LoadOrder(); // LoadOrder is a placeholder for fetching the order from the database
    Order similarOrder = (Order)order.Clone();
    similarOrder.ShippingAddress.Place.Postcode = "1234";

But now we have a new problem: by doing a deep copy we duplicated the Place instance as well. If we change the postcode in one of the instances, it won't affect the other one - but it should! So what we actually need in this case is a semi-deep copy. Some parts of the object graph have to be copied deeply (the Address), some parts need a shallow copy (the Place). There's clearly no general pattern in this.

While this example is a bit made up, you might find such situations in practice. Sometimes you won't have any chance to avoid it. But you see that it can get complicated. More complicated than a single one-size-fits-all interface like ICloneable could handle. In general, "cloning" an object is by far not as transparent as ICloneable.Clone might suggest. If you need some sort of copying function, go ahead. Give it a clear name and implement it in a reasonable way. But don't implement ICloneable, as it can give rise to false assumptions.
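
As an illustration of "a clear name", here is one possible sketch for the order example. The method name CopyWithOwnAddress is invented for this example, and the classes are repeated in reduced form to keep the sketch self-contained:

```csharp
public class Place {
  public string Name { get; set; }
  public string Postcode { get; set; }
}

public class Address {
  public string Street { get; set; }
  public string HouseNumber { get; set; }
  public Place Place { get; set; }
}

public class Order {
  public Address ShippingAddress { get; set; }

  // The name states the copy semantics: the address is duplicated,
  // the Place is deliberately shared with the original order.
  public Order CopyWithOwnAddress() {
    return new Order {
      ShippingAddress = new Address {
        Street = ShippingAddress.Street,
        HouseNumber = ShippingAddress.HouseNumber,
        Place = ShippingAddress.Place
      }
    };
  }
}
```

A caller can now change the street of the copy without touching the original, while a corrected postcode shows up in both orders. Whether that is the right split depends on the domain; the point is only that the method name documents it.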

MS probably realized this problem. They did not want to advertise "general purpose cloning" by introducing another ICloneable interface which would make cloning even more convenient for the user.


Post-Redirect-Get

June 27, 2008 17:01 by Andre Loker

HTTP methods

As you probably know, HTTP supports several methods that define the nature of the current request. The two most important ones are GET and POST. GET is the primary method to get content (so-called entities) from the server, such as HTML pages, images, CSS style sheets etc. The POST method, on the other hand, is meant to transport entities to the server, for example login credentials or a blog comment. On the server side a POST request often results in an update of certain data (databases, session state).

Both GET and POST can return an entity as a response. For GET this is obvious - it's what the method exists for in the first place. For POST it might sound reasonable at first glance as well, but it brings a pile of problems.

A simple scenario

Imagine you fill in a sign-up form of some web based e-mail service and POST it to the server using a submit button. The server processes the new account and updates its database. Maybe it even logs you in directly. In response to the POST request the server directly shows you a view of your inbox. Here's a diagram of what happens between browser and server:

 image

  1. The browser POSTs form data to a URL called signup.aspx
  2. The server processes the request
  3. The server responds with a status code of 200 (OK) and sends back a view of the new user's inbox rendered as HTML

You leave the computer to have a coffee and when you come back 5 minutes later you refresh the page (using CTRL+R or F5 or whatever shortcut your browser uses) to see whether you already have new messages. You are a bit puzzled why this (or a similar) message box appears:

image

You click on OK and are even more confused as the page that appears says "This user name is already taken" instead of showing your inbox.

What has happened? Remember that the page you saw was the response to a POST request (submitting the sign-up form). When you refreshed the page and confirmed to "resend the data" you actually repeated the POST request with the same form data. The server processed the "new" account and found that the user name is already in use (by yourself!), therefore it showed an error. "But wait", you say, "I just wanted the server to refresh the view of my inbox, what have I done wrong?" The answer is: nothing! The problem is that the application abused the POST response to transport an entity back to the client that should have been accessed with a GET request in the first place.

POST related issues

Here are some of the problems that occur if you abuse POST requests to return entities:

1. Refreshing the page results in a re-transmission of the POST data

This is what I described above. Hitting "refresh page" for a response based on a POST request will re-issue the POST request. Instead of refreshing what you see this will repeat what you did to reach the current page. This is not "refresh page" anymore, it becomes "repeat last action" - which is most likely not what the user wants. If you see a summary page after you have submitted an order in an online store, you don't want F5 to place another order, do you?

2. POST responses are hard to bookmark

Bookmarks (or favourites etc.) normally only remember the URL of the bookmarked page (along with some user supplied meta data). Because a POST request transports data in the request body instead of as query parameters in the URL like GET does, bookmarking the result of a POST will not work in most cases.

3. POST responses pollute the browser history

If the browser keeps the result of a POST request in its history, going back to that history entry will normally result in the POST data being retransmitted. This again causes the same issues as mentioned in point 1.

POST-Redirect-GET

"But I need POSTs to send forms to the server - how can I avoid the problems mentioned above?" you might say. Here's where the POST-Redirect-GET (PRG hereafter) pattern enters the stage.

Instead of sending entity content with the POST response after we processed the request, we return the URL of a new location which the browser should visit afterwards. Normally this new location shows the result of the POST or an updated view of some domain model.

This can be achieved by not returning a status code of 200 (success) but instead returning a status code that indicates a new location for the result, for example 303 ("See other") or 302 ("Found"/"Moved temporarily"), the latter of which is used most often nowadays. Together with the 30x status code a Location header is sent which contains the URL of the page to which the request is redirected. Only the headers are sent, no body is included.

If the browser sees the 30x status code, it will look for the Location header and issue a GET request to the URL mentioned there. Finally the user will see the body of that GET request in the browser.

The browser-server communication would look like this:

 image

  1. The browser POSTs to signup.aspx
  2. The server updates some state etc.
  3. The response is 302 redirect with a Location header value of inbox.aspx
  4. The browser realizes that the response is redirected and issues a GET to inbox.aspx
  5. The server returns 200 together with the content of the resource.
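
The five steps can be sketched in a few lines of C#. This is a deliberately simplified in-memory model, not real ASP.NET or MonoRail code: the Response and SignupApp types and the Handle method are invented for this illustration, and the browser's part (following the Location header) is played by the caller.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified response type; not part of any framework.
public class Response {
  public int Status;
  public string Location;
  public string Body;
}

// Hypothetical in-memory application used only to illustrate the flow.
public class SignupApp {
  private readonly HashSet<string> users = new HashSet<string>();

  public int UserCount { get { return users.Count; } }

  public Response Handle(string method, string path, string userName) {
    if (method == "POST" && path == "/signup.aspx") {
      // steps 1-2: process the POST and update state
      users.Add(userName);
      // step 3: answer with a redirect instead of an entity body
      return new Response { Status = 302, Location = "/inbox.aspx" };
    }
    if (method == "GET" && path == "/inbox.aspx") {
      // steps 4-5: the GET only reads state, so repeating it is harmless
      return new Response { Status = 200, Body = "Inbox of " + userName };
    }
    return new Response { Status = 404 };
  }
}
```

A "refresh" in this model is simply a second call to Handle("GET", "/inbox.aspx", ...); it returns the same page and leaves the user list untouched.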

What do we gain?

  • The page can be safely refreshed. Refreshing will cause another GET to inbox.aspx which won't cause any updates on the server
  • The result page can be easily bookmarked. Because the current resource is defined by the URL a bookmark to this URL is likely to be valid.
  • The browser history stays clean. Responses that have a redirect status code (such as 302) will not be put into the browser history by most browsers. Only the location to which the response redirects is. Therefore signup.aspx won't be added to the history and we can safely go back and forth through the history without having to resubmit any POST data

The drawbacks of POST-Redirect-GET

While it should be clear by now that the POST-Redirect-GET pattern is the way to go in most situations, I'd like to point at the few drawbacks that come along with this pattern.

First of all, redirection from one request to another causes an extra roundtrip to the server (one for the POST request, one for the GET request it redirects to). In this context the roundtrip should be understood as all processing and transmission time that is required and fixed per request, i.e. transmission delay, creation and invocation of the HTTP handler, opening and closing database connections/transactions, filling ORM caches etc.

If both requests can be handled very quickly by the server this will essentially double the response time. If your roundtrip time is 200ms, using PRG will cause a minimum delay of 400ms between submitting the form and the result page being visible. This issue has to be put in perspective, however. The server will need some time to process both requests, so the share of time spent on the roundtrips decreases as the server processing time increases. The response to the POST itself can be extremely small (a few hundred bytes), because only the headers need to be transmitted.

In practice I haven't noticed a real performance problem with PRG. A slow app will stay slow, a fast one won't truly suffer from the extra roundtrip. And besides, if you replace POSTs by GETs where appropriate the effect of PRG will be even less noticeable.

The problem with ASP.NET WebForms

Now that you know about POST-Redirect-GET you are of course eager to use it (at least I hope I could convince you). But as an ASP.NET WebForms developer you will soon run into problems: ASP.NET WebForms is fundamentally based on POSTs to the server. In essence, all ASP.NET web pages are wrapped in one huge <form> element with "method" set to "POST". Whenever you click a button, you essentially POST all form fields to the server. Of course you can redirect from a Button.Click handler. If you do so, you're applying PRG. At the same time you're working quite against the WebForms philosophy, especially the ViewState (which will get lost as soon as you redirect), which will force you to rethink a lot of your application logic. And if you don't rely on all this postback behaviour inherent to ASP.NET WebForms you might as well ask why you're using WebForms in the first place.

This makes clear why a lot of developers (including me) think that WebForms are inherently "broken" (viewstate, ubiquitous postbacks and the hard-to-mock HttpContext are just a few reasons). If you share these concerns but like .NET just as I do, you might want to look at alternate .NET based web frameworks such as Castle MonoRail or ASP.NET MVC.

PRG and AJAX

In situations where you use AJAX the whole PRG issue becomes a new story. AJAX responses don't appear in the history, you wouldn't want to bookmark them and refreshing a web page does not re-issue any AJAX requests (except those fired on page load). Therefore I have no problem with returning entities (HTML fragments, JSON, XML) from AJAX POSTs - PRG is not of much use here.

Conclusion

To conclude this article here's a list of some basic rules that have been useful to me:

  1. Use POST-Redirect-GET whenever you can, that is: whenever you process a POST request on the server, send a redirect to a GETtable resource as response. It's applicable in almost all cases and will make your site much more usable for the visitor
  2. Don't POST what you can GET. If you only want to retrieve a parameterised resource it might be completely suitable to use a GET request with query string parameters. Google is a good example. The start page contains a simple form with a single text field to enter the search terms. Submitting the form causes a GET to /search with the search terms passed as the query string parameter q. This can be easily done by providing method="GET" on the <form> element (or just leave out the method attribute, as GET is the default).
  3. POST requests from AJAX are allowed to return entities directly as they don't suffer from the same problems as "full" POSTs.

Getting rid of strings (2): use lambda expressions

June 12, 2008 12:08 by Andre Loker

Intro

In the first article of this series I talked about the problems with strings in code. This article will show you how you can use lambda expressions and expression trees as another tool to avoid strings.

About Lambda Expressions

C# 3.0 brought a cool new feature called lambda expressions. On the one hand they are a nice abbreviation for anonymous delegates:

    Button b = /*..*/
    b.Click += (sender, e) => MessageBox.Show(String.Format("{0} clicked", sender));

But there's an additional feature that might not be obvious to everyone. .NET 3.5 introduced the System.Linq.Expressions namespace which allows us to inspect a code expression tree. One special expression type is Expression<TDelegate> which derives from LambdaExpression. This expression type handles the expression tree represented by a lambda expression. Let's look at an example:

    using System;
    using System.Linq.Expressions;

    public class Program {

      public static void ExpressionTest(Expression<Func<DateTime>> expression) {
        Console.WriteLine("Expression body is '{0}'", expression.Body);
        Console.WriteLine("Node type is {0}", expression.Body.NodeType);
        Console.WriteLine("Expression body type is {0}", expression.Body.GetType());
      }

      public static void Main(string[] args) {
        ExpressionTest(() => DateTime.Now);
      }
    }

ExpressionTest expects an expression representing a Func<DateTime>, that is a function that returns a DateTime. In the Main method ExpressionTest is invoked - not with an Expression<Func<DateTime>> but simply with a lambda expression with the Func<DateTime> signature. The C# 3.0 compiler will convert the lambda expression that is passed as argument to an expression tree with a top node of type Expression<Func<DateTime>>. The Body property of that expression contains the right hand side of the lambda expression.

Running this code prints:

    Expression body is 'DateTime.Now'
    Node type is MemberAccess
    Expression body type is System.Linq.Expressions.MemberExpression

The runtime type of the expression body is MemberExpression, which makes sense, because DateTime.Now represents access to a member (Now) of DateTime. Run the code in the debugger to see how the expression is represented in the expression tree.

I won't go into too much detail on expressions here. Browse through the documentation of the Expressions namespace to see what kind of expressions you can expect (and inspect for that matter).

How can lambda expressions help to avoid strings?

Expressions can be useful in situations where you need to provide the name of a member or a MethodInfo/PropertyInfo/FieldInfo for that member. Ever needed to pass a MethodInfo somewhere? You most likely ended up with something like

    var info = typeof (DateTime).GetMethod("ToShortDateSting");
    Console.WriteLine(info.Name);

This compiles fine, of course, but when you run the code you'll get a NullReferenceException at info.Name. Why? Because there's a typo in "ToShortDateSting", so GetMethod returns null. The method I was looking for is ToShortDateString (note the 'r' in String). Did you see the typo at a glance? The situation gets worse if the name of the method changes, because now the code would break at runtime without being changed (see the first article to learn about problems with strings and refactoring).

Captain Lambda to the rescue

Here's an approach that is much more solid:

    var info = Reflect.GetMethod<DateTime>(dt => dt.ToShortDateString());
    Console.WriteLine(info.Name);

No strings attached so to speak. If you had a typo in ToShortDateString the code would not even compile. Additionally, the code is much more open to refactoring. But wait, how does it work? Here's the simple answer:

    public static class Reflect {
      /// <summary>
      /// Gets the MethodInfo for the method that is called in the provided <paramref name="expression"/>
      /// </summary>
      /// <typeparam name="TClass">The type of the class.</typeparam>
      /// <param name="expression">The expression.</param>
      /// <returns>Method info</returns>
      /// <exception cref="ArgumentException">The provided expression is not a method call</exception>
      public static MethodInfo GetMethod<TClass>(Expression<Action<TClass>> expression) {
        var methodCall = expression.Body as MethodCallExpression;
        if(methodCall == null) {
          throw new ArgumentException("Expected method call");
        }
        return methodCall.Method;
      }
    }

GetMethod expects an expression wrapping a delegate of type Action<TClass>, that is, a void method with a TClass as its only argument. GetMethod checks that the expression passed in is a method call by casting the body to a MethodCallExpression. If the cast succeeds, the Method property already contains the MethodInfo that we were looking for. Thank you C# 3.0 compiler for doing the work for us :-)
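A side benefit that a plain string cannot offer: the lambda pins down one specific overload. The sketch below repeats the Reflect class so that it is self-contained; OverloadExample and FormatOverload are names I made up for illustration. By contrast, typeof(DateTime).GetMethod("ToString") would throw an AmbiguousMatchException, because DateTime has several ToString overloads.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class Reflect {
  public static MethodInfo GetMethod<TClass>(Expression<Action<TClass>> expression) {
    var methodCall = expression.Body as MethodCallExpression;
    if (methodCall == null) {
      throw new ArgumentException("Expected method call");
    }
    return methodCall.Method;
  }
}

// Hypothetical demo class, not part of the original Reflect helper
public static class OverloadExample {
  public static MethodInfo FormatOverload() {
    // The argument "d" is never used for formatting; it only tells the
    // compiler which overload we mean: DateTime.ToString(string)
    return Reflect.GetMethod<DateTime>(dt => dt.ToString("d"));
  }
}
```

The returned MethodInfo describes the ToString overload with a single string parameter.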

A more practical example

In MonoRail the Controller class has a method called RedirectToAction which - well - redirects the response to a new action. It expects the name of an action as its argument. So you might see code like this:

    public class HomeController : Controller {

      public void Index() {
        if(!UserIsLoggedIn){
          RedirectToAction("Login");
        }
      }

      public void Login() {
      }
    }

This works fine but of course it suffers from all the string related problems I have been talking about so far. Let's see if we can use our new friend (Expression<TDelegate>) to improve the situation:

    public static class ControllerExtensions {
      public static void RedirectToAction<TController>(
          this TController controller,
          Expression<Action<TController>> expression) where TController : Controller {
        var methodCall = expression.Body as MethodCallExpression;
        if (methodCall == null) {
          throw new ArgumentException("Expected method call");
        }
        controller.RedirectToAction(methodCall.Method.Name);
      }
    }

Now we have an extension method that we can use instead of the original RedirectToAction:

    public class HomeController : SmartDispatcherController {

      public void Index() {
        // RedirectToAction("Login");
        this.RedirectToAction(c => c.Login());
      }

      public void Login(){
      }
    }

Is this cool or what? Once again we got rid of a string. You can redirect to a method with parameters as well:

    public void Index() {
      this.RedirectToAction(c => c.ShowItem(0));
    }

    public void ShowItem(int id){
    }

You can pass any value to the "call" to ShowItem that you like. Remember: the expression is only examined but not executed. If you want to pass an actual value to the redirected action, create extension methods for the RedirectToAction overloads that accept parameters. I won't show this here because it is not too hard to implement (and I'm only showing some examples here anyway).
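For the curious, here is a rough sketch of how the argument values could be pulled out of the expression so that such an extension method has something to forward. Everything in it (ActionCall, ActionCallReader, ItemController) is made up for illustration and does not depend on MonoRail; how the values are then handed to the parameterized RedirectToAction overloads is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical result type: the action name plus the argument values
public class ActionCall {
  public string Name;
  public IDictionary<string, object> Arguments = new Dictionary<string, object>();
}

public static class ActionCallReader {
  public static ActionCall Read<TController>(Expression<Action<TController>> expression) {
    var methodCall = expression.Body as MethodCallExpression;
    if (methodCall == null) {
      throw new ArgumentException("Expected method call");
    }
    var result = new ActionCall { Name = methodCall.Method.Name };
    var parameters = methodCall.Method.GetParameters();
    for (var i = 0; i < parameters.Length; i++) {
      // Compile and run only the argument expression, never the action itself.
      // This works for constants and captured variables, but not for arguments
      // that refer to the lambda parameter (e.g. c => c.ShowItem(c.SomeId)).
      var argument = Expression.Lambda(methodCall.Arguments[i]).Compile().DynamicInvoke();
      result.Arguments[parameters[i].Name] = argument;
    }
    return result;
  }
}

// Stand-in for a controller so the sketch compiles without MonoRail
public class ItemController {
  public void ShowItem(int id) {
    throw new InvalidOperationException("The action must not run");
  }
}
```

Calling ActionCallReader.Read<ItemController>(c => c.ShowItem(7)) yields the name "ShowItem" and an Arguments dictionary that maps "id" to 7. ShowItem itself is never invoked, so the exception in its body is not thrown.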

Properties

You can also get a PropertyInfo (and similarly a FieldInfo) using lambdas. Here's an example:

    public static class Reflect {
      public static PropertyInfo GetProperty<TClass, TValue>(Expression<Func<TClass, TValue>> expression) {
        var memberExpression = expression.Body as MemberExpression;
        if (memberExpression == null || !(memberExpression.Member is PropertyInfo)) {
          throw new ArgumentException("Expected property expression");
        }
        return (PropertyInfo) memberExpression.Member;
      }
    }

    // use:
    var dayProperty = Reflect.GetProperty<DateTime, int>(dt => dt.Day);
    Console.WriteLine(dayProperty.Name);

NB: the code shown above only works for properties that are not write-only (otherwise you will not be able to "return" the property value in the expression). I don't consider this a big limitation. How many write-only properties have you written in the past two months?

In the example above we have to provide both the type of the class as well as the type of the property. It makes the code slightly less elegant. We can improve it like this:

    public static PropertyInfo GetProperty<TClass>(Expression<Func<TClass, object>> expression) {
      MemberExpression memberExpression;
      // if the return value had to be converted (boxed) to object, the body will be a UnaryExpression
      var unary = expression.Body as UnaryExpression;
      if (unary != null) {
        // the operand is the "real" property access
        memberExpression = unary.Operand as MemberExpression;
      } else {
        // in case the property is of type object (or another reference type) the body itself is the correct expression
        memberExpression = expression.Body as MemberExpression;
      }
      // as before:
      if (memberExpression == null || !(memberExpression.Member is PropertyInfo)) {
        throw new ArgumentException("Expected property expression");
      }
      return (PropertyInfo) memberExpression.Member;
    }

    var dayProperty = Reflect.GetProperty<DateTime>(dt => dt.Day);
    Console.WriteLine(dayProperty.Name);

It's a bit more complicated, but still understandable.
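The reason for the UnaryExpression check is boxing: the int returned by Day has to be converted to object, and the compiler records that conversion as a node wrapped around the property access. If you want to inspect this yourself, a small sketch like the following should do (BoxingExample is a made-up name, and Exception.Message merely serves as an arbitrary reference-typed property):

```csharp
using System;
using System.Linq.Expressions;

public static class BoxingExample {
  // Node type of the lambda body for a value-typed property (int Day)
  public static ExpressionType ValueTypedBody() {
    Expression<Func<DateTime, object>> expression = dt => dt.Day;
    return expression.Body.NodeType;
  }

  // Node type of the lambda body for a reference-typed property (string Message)
  public static ExpressionType ReferenceTypedBody() {
    Expression<Func<Exception, object>> expression = ex => ex.Message;
    return expression.Body.NodeType;
  }
}
```

With the C# compiler I would expect the first method to return ExpressionType.Convert and the second ExpressionType.MemberAccess, which is why GetProperty has to cope with both shapes.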

Update 07/22/2008: RednaxelaFX came up with a third alternative. GetProperty stays the same as in the first (simpler) version, but we call the method differently:

    // use:
    var dayProperty = Reflect.GetProperty( (DateTime dt) => dt.Day);
    Console.WriteLine(dayProperty.Name);

By providing the type of the expression's argument explicitly, the compiler is now able to infer both type arguments: TClass from the declared parameter and TValue from the type of the expression's result. Thanks RednaxelaFX for your comment!

Summing it up

Lambda expression trees are a great tool that allows us to point to members without using strings. There are some limitations, though:

  • It only works for "compile time reflection". The expression tree is created during compilation, so you cannot get the name of "some member" of "some type" at runtime.
  • It won't work with static members
  • It will only work for members that are visible in the context where the expression is built. Non-public fields, properties and methods can therefore be tricky using this technique.

Still, there are plenty of situations where you need to provide the MemberInfo of a public instance method/property (or its name) known at compile time.


Castle DictionaryAdapter

June 10, 2008 15:02 by Andre Loker

DictionaryAdapter is a component of the Castle stack which can create an adapter between an interface and a dictionary at runtime. The DictionaryAdapter combines the flexibility of a dictionary with the type safety and convenience provided by a strong API. Here's an example:

    // the interface that will be used to access the content of the dictionary
    public interface ISimple {
      string Name { get; set; }
      int Age { get; set; }
    }

    //...

    // the backing store
    IDictionary dict = new Hashtable();

    // dynamically create an adapter around dict
    ISimple wrapper = new DictionaryAdapterFactory().GetAdapter<ISimple>(dict);
    wrapper.Name = "Andre"; // will be written to dict["Name"];
    wrapper.Age = 26;       // will be written to dict["Age"];

    Assert.AreEqual("Andre", dict["Name"]);
    Assert.AreEqual("Andre", wrapper.Name);

    Assert.AreEqual(26, dict["Age"]);
    Assert.AreEqual(26, wrapper.Age);

How to get the DictionaryAdapter

The DictionaryAdapter can be downloaded as part of the Castle project.

How does it work?

Calling DictionaryAdapterFactory.GetAdapter will create a new type on the fly. This type implements the interface that was provided as the type argument (here: ISimple). For each property on the interface a getter and/or setter will be implemented that reads a value from the provided dictionary or writes a value to the dictionary. The key that is used is based on the property name (the default setting is to use the property name as the key, later we will see how we can influence the key being used).

Creating an adapter is rather costly. It requires the creation of a new in-memory assembly containing the dynamic adapter type. However, this performance cost is only incurred the first time GetAdapter is called for a specific type. Subsequent requests for the same adapter type will reuse the generated assembly.
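To make the idea more tangible, here is roughly what such an adapter would look like if you wrote it by hand. This is only an illustration of the concept, not the code that Castle actually generates; the class name HandWrittenSimpleAdapter and the handling of missing keys are my own assumptions.

```csharp
using System.Collections;

public interface ISimple {
  string Name { get; set; }
  int Age { get; set; }
}

// Hand-written approximation of a generated adapter (illustration only)
public class HandWrittenSimpleAdapter : ISimple {
  private readonly IDictionary dictionary;

  public HandWrittenSimpleAdapter(IDictionary dictionary) {
    this.dictionary = dictionary;
  }

  public string Name {
    // the property name doubles as the dictionary key
    get { return (string) dictionary["Name"]; }
    set { dictionary["Name"] = value; }
  }

  public int Age {
    // Assumption: a missing key yields the default value; the real adapter may behave differently
    get { return dictionary.Contains("Age") ? (int) dictionary["Age"] : 0; }
    set { dictionary["Age"] = value; }
  }
}
```

The point of DictionaryAdapter is that you never have to write this boilerplate yourself: the factory produces an equivalent type at runtime from nothing but the interface.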

More examples and options

DictionaryAdapter has plenty of options to customize how the mapping between the adapter interface and the underlying dictionary happens. In this section I'll give some examples of what DictionaryAdapter is capable of without covering every detail (for a complete reference I'd recommend the Castle source code).

I will provide code examples in the form of an MbUnit test with the following skeleton:

    using System.Collections;
    using MbUnit.Framework;
    using Castle.Components.DictionaryAdapter;

    [TestFixture]
    public class DictionaryAdapterTests {
      private DictionaryAdapterFactory factory;
      private IDictionary dict;

      [SetUp]
      public void SetUp() {
        factory = new DictionaryAdapterFactory();
        dict = new Hashtable();
      }
    }

The IDictionary instance is used as the backing store for which an adapter will be created. The factory handles the dynamic creation of the adapters.

Download the source code of the examples (DictionaryAdapterTests.cs)

Basic example

Once again, here's the most basic example:

    public interface ISimple {
      string Name { get; set; }
      int Age { get; set; }
    }

    [Test]
    public void SimpleTest() {
      var wrapper = factory.GetAdapter<ISimple>(dict);
      wrapper.Name = "Andre";
      wrapper.Age = 26;

      Assert.AreEqual("Andre", dict["Name"]);
      Assert.AreEqual(26, dict["Age"]);
    }

Type prefix

If you want to use multiple adapters on the same dictionary, naming collisions can occur when a property with the same name is present in more than one adapter interface. To avoid this problem you can specify key prefixes using attributes. One possible prefix is the (full) name of the interface type:

    [DictionaryTypeKeyPrefix]
    public interface IWithTypePrefix {
      string Name { get; set; }
      int Age { get; set; }
    }

    [Test]
    public void WithTypePrefix() {
      var wrapper = factory.GetAdapter<IWithTypePrefix>(dict);
      wrapper.Name = "Andre";
      wrapper.Age = 26;

      Assert.AreEqual("Andre", dict[typeof (IWithTypePrefix).FullName + "#Name"]);
      Assert.AreEqual(26, dict[typeof (IWithTypePrefix).FullName + "#Age"]);
    }

Custom prefix

The type prefix can be quite lengthy, so you might prefer setting your own prefix:

    [DictionaryKeyPrefix("My")]
    public interface IWithCustomPrefix {
      string Name { get; set; }
      int Age { get; set; }
    }

    [Test]
    public void WithCustomPrefix() {
      var wrapper = factory.GetAdapter<IWithCustomPrefix>(dict);
      wrapper.Name = "Andre";
      wrapper.Age = 26;

      Assert.AreEqual("Andre", dict["MyName"]);
      Assert.AreEqual(26, dict["MyAge"]);
    }

Custom keys

You can opt for not using a prefix altogether and define the keys manually:

    public interface ICustomKeys {
      [DictionaryKey("Foo")]
      string Name { get; set; }

      [DictionaryKey("Bar")]
      int Age { get; set; }
    }

    [Test]
    public void WithCustomKey() {
      var wrapper = factory.GetAdapter<ICustomKeys>(dict);
      wrapper.Name = "Andre";
      wrapper.Age = 26;

      Assert.AreEqual("Andre", dict["Foo"]);
      Assert.AreEqual(26, dict["Bar"]);
    }

Convert property to string

In some scenarios you might need all values in the dictionary to be strings. This can be enforced by using the DictionaryStringValuesAttribute:

    public interface IPropertyAsString {
      string Name { get; set; }

      [DictionaryStringValues]
      int Age { get; set; }
    }

    [Test]
    public void ConvertPropertyToString() {
      var wrapper = factory.GetAdapter<IPropertyAsString>(dict);
      wrapper.Name = "Andre";
      wrapper.Age = 26;

      Assert.AreEqual("Andre", dict["Name"]);
      Assert.AreEqual("26", dict["Age"]);
    }

This attribute can also be applied to the interface itself, which forces all properties to be converted to strings.

Collection as string list

If your property implements IEnumerable you can have it converted to a string containing a list of all values:

    public interface IAsStringList {
      [DictionaryStringList]
      int[] Values { get; set; }
    }

    [Test]
    public void ConvertPropertyToStringList() {
      var wrapper = factory.GetAdapter<IAsStringList>(dict);
      wrapper.Values = new[] {1, 2, 3, 4, 5};

      Assert.AreEqual("1,2,3,4,5", dict["Values"]);
    }

Several aspects are configurable, like the separator character.

Adapting complex properties

You are not limited to using "flat" interfaces with only simple types. You can also create an adapter that automatically adapts complex properties:

    public interface IWithComponent {
      [DictionaryComponent]
      ISimple Simple { get; }
    }

    [Test]
    public void CanUseComponent() {
      var wrapper = factory.GetAdapter<IWithComponent>(dict);
      wrapper.Simple.Name = "Andre";
      wrapper.Simple.Age = 26;
      Assert.AreEqual("Andre", dict["Simple_Name"]);
      Assert.AreEqual(26, dict["Simple_Age"]);
    }

As you can see, IWithComponent.Simple is itself automatically populated with an adapter.

Conclusion

DictionaryAdapter provides a means to access the content of a dictionary in an elegant, type safe way. It is a great tool to get rid of strings and to make your source code robust and refactorable.

Pros

  • Easy to use
  • Very flexible
  • Type safe access even to IDictionary
  • Great tool to improve robustness of code

Cons

  • One time overhead (creating adapters)
  • Memory overhead (in-memory assemblies created by the factory)

Download source code: DictionaryAdapterTests.cs (2.97 kb)