Retrieving values from an IDataReader

September 4, 2008 at 3:57 PMAndre Loker

Despite the undoubted advantages of ORM you probably need to fetch data directly from an IDataReader from time to time. Getting values out of the data reader is easy at first glance. However, there are some issues that you should be aware of.

Basic usage

Assume we have a simple table that holds comments for a web log or something similar:

image

(The UI is German, but I guess you know what the columns mean. If not: it's "Column name", "Data type" and "Allow Nulls")

Let's query all data in the table:

   1: using (var con = new SqlConnection(connectionString)) {
   2:   con.Open();
   3:   var cmd = new SqlCommand("SELECT * FROM Comment", con);
   4:   using(var rdr = cmd.ExecuteReader()) {
   5:     while(rdr.Read()) {
   6:       var id = rdr.GetInt32(0);
   7:       var post = rdr.GetInt32(1);
   8:       var position = rdr.GetInt32(2);
   9:       var text = rdr.GetString(3);
  10:       Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  11:     }
  12:   }
  13: }

Nothing fancy here, we simply query all all rows and columns and print their values.

The index problem

Look at the way we retrieve data from he reader:

   1: var id = rdr.GetInt32(0);
   2: var post = rdr.GetInt32(1);
   3: var position = rdr.GetInt32(2);
   4: var text = rdr.GetString(3);

Using indices this way is probably not the best solution, especially because we use SELECT * to retrieve all columns. If the order of the columns change in the database, your application either crashes or it will using the wrong columns (the latter can arguably be the worse situation). Errors regarding this problem what be visible until runtime, which makes the code quite hard to maintain.

If we changed the query to something like

   1: SELECT ID, Post, Position, Text FROM Comment

we could at least prevent the index related issues because we would fetch the columns in an order that is defined by our application, not by the database.

As an alternative we could use IDataReader.GetOrdinal to determine the index of the columns at runtime:

   1: var cmd = new SqlCommand("SELECT * FROM Comment", con);
   2:  
   3: using (var rdr = cmd.ExecuteReader()) {
   4:  
   5:   // determine the indices of the columns and cache them for efficiency
   6:   var idIndex = rdr.GetOrdinal("ID");
   7:   var postIndex = rdr.GetOrdinal("Post");
   8:   var positionIndex = rdr.GetOrdinal("Position");
   9:   var textIndex = rdr.GetOrdinal("Text");
  10:  
  11:   while (rdr.Read()) {
  12:     var id = rdr.GetInt32(idIndex);
  13:     var post = rdr.GetInt32(postIndex);
  14:     var position = rdr.GetInt32(positionIndex);
  15:     var text = rdr.GetString(textIndex);
  16:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  17:   }
  18: }

Here I again used SELECT *, but it would work exactly the same way with explicit column selection.

There are more ways to achieve the same effect. You can use the indexer of the data reader to fetch the columns by name:

   1: using (var rdr = cmd.ExecuteReader()) {
   2:  
   3:   while (rdr.Read()) {
   4:     var id = rdr["ID"];
   5:     var post = rdr["Post"];
   6:     var position = rdr["Position"];
   7:     var text = rdr["Text"];
   8:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
   9:   }
  10: }

Be aware though that the indexer unlike the explicit GetXYZ() methods only returns objects. If you need the values to have the correct type you have to cast:

   1: while (rdr.Read()) {
   2:   int id = (int)rdr["ID"];
   3:   int post = (int)rdr["Post"];
   4:   int position = (int)rdr["Position"];
   5:   string text = (string)rdr["Text"];
   6:   Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
   7: }

(I didn't use variable type inference (var) in this example to stress the fact that we are using ints and strings instead of objects)

"OK", you say, "with this knowledge I can now master data readers easily". Maybe not yet. There are two more issues.

The data type problem

Assume that you decide that your app will never have more than a few hundred comments per post. To save space in the database you change the data type of the Position column (which describes the order of comments for a specific post) from int to smallint. If you run any of the code examples above you'll probably be surprised that they'll all (except the one that uses the string indexer without casting) fail with an InvalidCastException:

Unhandled Exception: System.InvalidCastException: Specified cast is not valid.

Why is that? In the case of GetInt32 let's look at the remarks in the documentation:

No conversions are performed; therefore, the data retrieved must already be a 32-bit signed integer.

If the data coming from the database is a 16-bit integer (aka smallint) this call will therefore fail. To make the code work again you'd need to use GetInt16 instead.

And what about the cast in  (int)rdr["Position"]? After all, a 16-bit signed integer (short) should be castable to an int. While this is true, keep in mind the indexer of the data reader returns a boxed version of the short value as an object. Unboxing a value must always be done using the type of the boxed value (or one of its interfaces). The conversion from short to int can only take place after the value has been unboxed. That is, use either this (for implicit conversion to int):

   1: int position = (short)rdr["Position"];

or this (for explicit conversion to int):

   1: var position = (int)(short)rdr["Position"];

The data type problem can be really annoying because data type mismatches just like the index problem will be visible at runtime only. It's tedious to hunt down these bugs and it makes modification to the database excessively expensive.

You can avoid this problem to great extent if you don't force a "hard" conversion of the column value with GetXYZ and casts. Instead use the "soft" conversion methods provided by the Convert class. Here's the example from above again, but this time it's more robust against data type changes.

   1: var cmd = new SqlCommand("SELECT * FROM Comment", con);
   2:  
   3: using (var rdr = cmd.ExecuteReader()) {
   4:   while (rdr.Read()) {
   5:     var id = Convert.ToInt32(rdr["ID"]);
   6:     var post = Convert.ToInt32(rdr["Post"]);
   7:     var position = Convert.ToInt32(rdr["Position"]);
   8:     var text = Convert.ToString(rdr["Text"]);
   9:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  10:   }
  11: }

(Again, feel free to improve the code by replacing SELECT * with an explicit column list and/or use the index based indexer of the data reader instead)

The DBNull problem

There's one final issue I want to write about. In the examples above all columns are explicitly NON NULL. What if the columns contain NULL?

Let's assume for now that the columns Post, Position and Index could be NULL (ignoring the fact that it wouldn't make much sense in that context). How would our code look like? Maybe like this:

   1: int id = Convert.ToInt32(rdr["ID"]);
   2: int? post =     rdr["Post"]     == null ? (int?)null : Convert.ToInt32(rdr["Post"]);
   3: int? position = rdr["Position"] == null ? (int?)null : Convert.ToInt32(rdr["Position"]);
   4: string text =   rdr["Text"]     == null ? null       : Convert.ToString(rdr["Text"]);

(I aligned the code a bit fore readability in this example)

You'd maybe expect that rdr["Position"] returns null if the column contains NULL. Run the example and you'll see that it's not the case. This is important to now: columns that contain NULL in the database will be returned as an instance of DBNull by the data reader! Furthermore DBNull can't be converted to any other datatype (int, short etc.) with one exception: if Convert.ToString is called on a DBNull object, an empty string is returned (at least in the Sql Server implementation of IDataReader).

To check whether a column in the database contains NULL it's therefore not valid to check whether a value returned by the data reader equals null (it won't ever). Instead, check whether the returned value is a DBNull. There are basically two ways for this. Either use IsDBNull:

   1: if(rdr.IsDBNull(1 /* column index */) {
   2:   // value is NULL
   3: }

or check the type of the value directly:

   1: if(rdr["TheColumn"] is DBNull){
   2:   // value is NULL
   3: }

The example from above should therefore look something like:

   1: int id = Convert.ToInt32(rdr["ID"]);
   2: int? post     = rdr["Post"]     is DBNull ? (int?)null : Convert.ToInt32(rdr["Post"]);
   3: int? position = rdr["Position"] is DBNull ? (int?)null : Convert.ToInt32(rdr["Position"]);
   4: string text   = rdr["Text"]     is DBNull ?       null : Convert.ToString(rdr["Text"]);

(As always: using the string based indexer is just one option to retrieve values, the same is applicable in case you use indices)

While the code above is not the most efficient piece of C# ever written it covers many of the problems I mentioned in this article:

  • the code does not depend on the order of the columns in the database.
  • the code can handle a fair amount of possible data type changes made to the table
  • the code can handle NULL values

For convenience, you might want to write some helper methods that simplify the task of retrieving values from the data reader.

   1: public static int? GetInt32(IDataRecord dr, string columnName) {
   2:   var value = dr[columnName];
   3:   return value is DBNull ? (int?) null : Convert.ToInt32(value);
   4: }
   5:  
   6: public static int? GetInt32(IDataRecord dr, int columnIndex) {
   7:   return dr.IsDBNull(columnIndex) ? (int?) null : Convert.ToInt32(dr[columnIndex]);
   8: }
   9:  
  10: public static string GetString(IDataRecord dr, string columnName) {
  11:   var value = dr[columnName];
  12:   return value is DBNull ? null : Convert.ToString(value);
  13: }
  14:  
  15: public static string GetString(IDataRecord dr, int columnIndex) {
  16:   return dr.IsDBNull(columnIndex) ? null : Convert.ToString(dr[columnIndex]);
  17: }

You can then use those methods like this:

   1: int id = Convert.ToInt32(rdr["ID"]);
   2: int? post = GetInt32(rdr, "Post");
   3: int? position = GetInt32(rdr, "Position");
   4: string text = GetString(rdr, "Text");

Of course, if you're using C# 3 you could make the methods above extensions to IDataRecord. In fact, I've written those extension methods for you (download ISC licensed source here). It basically does two things:

  • Provide equivalents to all those GetXYZ methods in IDataRecord that accept a column name instead of an index. Those methods are shortcuts for casting the values of reader["ColumnName"] to the proper type, but they suffer from the same data type and NULL value problem as their index based counterparts. Use them only for non null columns of which you know the data type exactly.
  • Provide "safe" versions of those methods that don't suffer from the data type problem and the NULL problem. I called those methods GetSafeXYZ (where XYZ is the data type of course). You won't find "safe" methods that accept a column index, but with the knowledge you - hopefully - gained from this article you should be able to write them yourself.

Here's the example from above using the "safe" methods:

   1: var cmd = new SqlCommand("SELECT * FROM Comment", con);
   2:  
   3: using (var rdr = cmd.ExecuteReader()) {
   4:   while (rdr.Read()) {
   5:     int id = rdr.GetInt32("ID");
   6:     int? post = rdr.GetSafeInt32("Post");
   7:     int? position = rdr.GetSafeInt32("Position");
   8:     string text = rdr.GetSafeString("Text");
   9:  
  10:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  11:   }
  12: }

Attachments:

DataRecordExtensions.cs (20.37 kb)

Posted in: C# | Databases | Patterns

Tags: , ,