Retrieving values from an IDataReader

September 4, 2008 at 3:57 PMAndre Loker

Despite the undoubted advantages of ORM you probably need to fetch data directly from an IDataReader from time to time. Getting values out of the data reader is easy at first glance. However, there are some issues that you should be aware of.

Basic usage

Assume we have a simple table that holds comments for a web log or something similar:


(The UI is German, but I guess you know what the columns mean. If not: it's "Column name", "Data type" and "Allow Nulls")

Let's query all data in the table:

   1: using (var con = new SqlConnection(connectionString)) {
   2:   con.Open();
   3:   var cmd = new SqlCommand("SELECT * FROM Comment", con);
   4:   using(var rdr = cmd.ExecuteReader()) {
   5:     while(rdr.Read()) {
   6:       var id = rdr.GetInt32(0);
   7:       var post = rdr.GetInt32(1);
   8:       var position = rdr.GetInt32(2);
   9:       var text = rdr.GetString(3);
  10:       Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  11:     }
  12:   }
  13: }

Nothing fancy here, we simply query all all rows and columns and print their values.

The index problem

Look at the way we retrieve data from he reader:

   1: var id = rdr.GetInt32(0);
   2: var post = rdr.GetInt32(1);
   3: var position = rdr.GetInt32(2);
   4: var text = rdr.GetString(3);

Using indices this way is probably not the best solution, especially because we use SELECT * to retrieve all columns. If the order of the columns change in the database, your application either crashes or it will using the wrong columns (the latter can arguably be the worse situation). Errors regarding this problem what be visible until runtime, which makes the code quite hard to maintain.

If we changed the query to something like

   1: SELECT ID, Post, Position, Text FROM Comment

we could at least prevent the index related issues because we would fetch the columns in an order that is defined by our application, not by the database.

As an alternative we could use IDataReader.GetOrdinal to determine the index of the columns at runtime:

   1: var cmd = new SqlCommand("SELECT * FROM Comment", con);
   3: using (var rdr = cmd.ExecuteReader()) {
   5:   // determine the indices of the columns and cache them for efficiency
   6:   var idIndex = rdr.GetOrdinal("ID");
   7:   var postIndex = rdr.GetOrdinal("Post");
   8:   var positionIndex = rdr.GetOrdinal("Position");
   9:   var textIndex = rdr.GetOrdinal("Text");
  11:   while (rdr.Read()) {
  12:     var id = rdr.GetInt32(idIndex);
  13:     var post = rdr.GetInt32(postIndex);
  14:     var position = rdr.GetInt32(positionIndex);
  15:     var text = rdr.GetString(textIndex);
  16:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  17:   }
  18: }

Here I again used SELECT *, but it would work exactly the same way with explicit column selection.

There are more ways to achieve the same effect. You can use the indexer of the data reader to fetch the columns by name:

   1: using (var rdr = cmd.ExecuteReader()) {
   3:   while (rdr.Read()) {
   4:     var id = rdr["ID"];
   5:     var post = rdr["Post"];
   6:     var position = rdr["Position"];
   7:     var text = rdr["Text"];
   8:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
   9:   }
  10: }

Be aware though that the indexer unlike the explicit GetXYZ() methods only returns objects. If you need the values to have the correct type you have to cast:

   1: while (rdr.Read()) {
   2:   int id = (int)rdr["ID"];
   3:   int post = (int)rdr["Post"];
   4:   int position = (int)rdr["Position"];
   5:   string text = (string)rdr["Text"];
   6:   Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
   7: }

(I didn't use variable type inference (var) in this example to stress the fact that we are using ints and strings instead of objects)

"OK", you say, "with this knowledge I can now master data readers easily". Maybe not yet. There are two more issues.

The data type problem

Assume that you decide that your app will never have more than a few hundred comments per post. To save space in the database you change the data type of the Position column (which describes the order of comments for a specific post) from int to smallint. If you run any of the code examples above you'll probably be surprised that they'll all (except the one that uses the string indexer without casting) fail with an InvalidCastException:

Unhandled Exception: System.InvalidCastException: Specified cast is not valid.

Why is that? In the case of GetInt32 let's look at the remarks in the documentation:

No conversions are performed; therefore, the data retrieved must already be a 32-bit signed integer.

If the data coming from the database is a 16-bit integer (aka smallint) this call will therefore fail. To make the code work again you'd need to use GetInt16 instead.

And what about the cast in  (int)rdr["Position"]? After all, a 16-bit signed integer (short) should be castable to an int. While this is true, keep in mind the indexer of the data reader returns a boxed version of the short value as an object. Unboxing a value must always be done using the type of the boxed value (or one of its interfaces). The conversion from short to int can only take place after the value has been unboxed. That is, use either this (for implicit conversion to int):

   1: int position = (short)rdr["Position"];

or this (for explicit conversion to int):

   1: var position = (int)(short)rdr["Position"];

The data type problem can be really annoying because data type mismatches just like the index problem will be visible at runtime only. It's tedious to hunt down these bugs and it makes modification to the database excessively expensive.

You can avoid this problem to great extent if you don't force a "hard" conversion of the column value with GetXYZ and casts. Instead use the "soft" conversion methods provided by the Convert class. Here's the example from above again, but this time it's more robust against data type changes.

   1: var cmd = new SqlCommand("SELECT * FROM Comment", con);
   3: using (var rdr = cmd.ExecuteReader()) {
   4:   while (rdr.Read()) {
   5:     var id = Convert.ToInt32(rdr["ID"]);
   6:     var post = Convert.ToInt32(rdr["Post"]);
   7:     var position = Convert.ToInt32(rdr["Position"]);
   8:     var text = Convert.ToString(rdr["Text"]);
   9:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  10:   }
  11: }

(Again, feel free to improve the code by replacing SELECT * with an explicit column list and/or use the index based indexer of the data reader instead)

The DBNull problem

There's one final issue I want to write about. In the examples above all columns are explicitly NON NULL. What if the columns contain NULL?

Let's assume for now that the columns Post, Position and Index could be NULL (ignoring the fact that it wouldn't make much sense in that context). How would our code look like? Maybe like this:

   1: int id = Convert.ToInt32(rdr["ID"]);
   2: int? post =     rdr["Post"]     == null ? (int?)null : Convert.ToInt32(rdr["Post"]);
   3: int? position = rdr["Position"] == null ? (int?)null : Convert.ToInt32(rdr["Position"]);
   4: string text =   rdr["Text"]     == null ? null       : Convert.ToString(rdr["Text"]);

(I aligned the code a bit fore readability in this example)

You'd maybe expect that rdr["Position"] returns null if the column contains NULL. Run the example and you'll see that it's not the case. This is important to now: columns that contain NULL in the database will be returned as an instance of DBNull by the data reader! Furthermore DBNull can't be converted to any other datatype (int, short etc.) with one exception: if Convert.ToString is called on a DBNull object, an empty string is returned (at least in the Sql Server implementation of IDataReader).

To check whether a column in the database contains NULL it's therefore not valid to check whether a value returned by the data reader equals null (it won't ever). Instead, check whether the returned value is a DBNull. There are basically two ways for this. Either use IsDBNull:

   1: if(rdr.IsDBNull(1 /* column index */) {
   2:   // value is NULL
   3: }

or check the type of the value directly:

   1: if(rdr["TheColumn"] is DBNull){
   2:   // value is NULL
   3: }

The example from above should therefore look something like:

   1: int id = Convert.ToInt32(rdr["ID"]);
   2: int? post     = rdr["Post"]     is DBNull ? (int?)null : Convert.ToInt32(rdr["Post"]);
   3: int? position = rdr["Position"] is DBNull ? (int?)null : Convert.ToInt32(rdr["Position"]);
   4: string text   = rdr["Text"]     is DBNull ?       null : Convert.ToString(rdr["Text"]);

(As always: using the string based indexer is just one option to retrieve values, the same is applicable in case you use indices)

While the code above is not the most efficient piece of C# ever written it covers many of the problems I mentioned in this article:

  • the code does not depend on the order of the columns in the database.
  • the code can handle a fair amount of possible data type changes made to the table
  • the code can handle NULL values

For convenience, you might want to write some helper methods that simplify the task of retrieving values from the data reader.

   1: public static int? GetInt32(IDataRecord dr, string columnName) {
   2:   var value = dr[columnName];
   3:   return value is DBNull ? (int?) null : Convert.ToInt32(value);
   4: }
   6: public static int? GetInt32(IDataRecord dr, int columnIndex) {
   7:   return dr.IsDBNull(columnIndex) ? (int?) null : Convert.ToInt32(dr[columnIndex]);
   8: }
  10: public static string GetString(IDataRecord dr, string columnName) {
  11:   var value = dr[columnName];
  12:   return value is DBNull ? null : Convert.ToString(value);
  13: }
  15: public static string GetString(IDataRecord dr, int columnIndex) {
  16:   return dr.IsDBNull(columnIndex) ? null : Convert.ToString(dr[columnIndex]);
  17: }

You can then use those methods like this:

   1: int id = Convert.ToInt32(rdr["ID"]);
   2: int? post = GetInt32(rdr, "Post");
   3: int? position = GetInt32(rdr, "Position");
   4: string text = GetString(rdr, "Text");

Of course, if you're using C# 3 you could make the methods above extensions to IDataRecord. In fact, I've written those extension methods for you (download ISC licensed source here). It basically does two things:

  • Provide equivalents to all those GetXYZ methods in IDataRecord that accept a column name instead of an index. Those methods are shortcuts for casting the values of reader["ColumnName"] to the proper type, but they suffer from the same data type and NULL value problem as their index based counterparts. Use them only for non null columns of which you know the data type exactly.
  • Provide "safe" versions of those methods that don't suffer from the data type problem and the NULL problem. I called those methods GetSafeXYZ (where XYZ is the data type of course). You won't find "safe" methods that accept a column index, but with the knowledge you - hopefully - gained from this article you should be able to write them yourself.

Here's the example from above using the "safe" methods:

   1: var cmd = new SqlCommand("SELECT * FROM Comment", con);
   3: using (var rdr = cmd.ExecuteReader()) {
   4:   while (rdr.Read()) {
   5:     int id = rdr.GetInt32("ID");
   6:     int? post = rdr.GetSafeInt32("Post");
   7:     int? position = rdr.GetSafeInt32("Position");
   8:     string text = rdr.GetSafeString("Text");
  10:     Console.WriteLine("{0,-3} {1,-3} #{2,-3} {3}", id, post, position, text);
  11:   }
  12: }


DataRecordExtensions.cs (20.37 kb)

Posted in: C# | Databases | Patterns

Tags: , ,

Get the names of databases on a SQL Server

June 20, 2008 at 6:34 PMAndre Loker

For administrative tasks you might need the names of all databases on a given SQL Server. Luckily SQL Server comes with some neat stored procedures that help a lot, for example:

   1: EXEC sp_databases; -- get name, size and remarks
   2: EXEC sp_helpdb;    -- get name, size, owner, dbid, creation date, 
   3:                    -- status and compatibility level

Those two SPs are certainly nice to have, but they return more than you might need. Given that it is not so easy to perform a SELECT on the results here's a simple query to return the names of all (online) databases:

   1: SELECT db_name(database_id) 
   2: FROM sys.master_files
   3: WHERE state = 0 -- only fetch databases that are online
   4: GROUP BY database_id;

Admittedly I did not come up with this all by myself. I simply looked at what sp_databases does and extracted the stuff I needed :-)

Posted in: Databases | Snippets

Tags: ,

Using table valued results of stored procedures

June 20, 2008 at 6:31 PMAndre Loker

If a stored procedure returns a table of data, it's not as easy as one might think to work with the result. Assume the following scenario:

- we have a table Account


- we have a (admittedly silly) stored procedure that performs some query and returns "Account" rows:

   1: CREATE PROCEDURE sp_GetAccounts
   2: AS
   3:     SELECT * FROM Account;

- we want to perform additional filtering & projection on the data returned by the SP, like:

   1: SELECT UserName, Email 
   2: FROM (EXEC sp_GetAccounts)
   3: WHERE Email LIKE ''
   4: ORDER BY UserName ASC;

This, however, does not work:

Incorrect syntax near the keyword 'EXEC'.

It seems that we cannot use the result of the SP as a normal table. But here's a little trick that will do what we want:

   1: -- create a table variable (ie. a temporary in-memory table)
   2: DECLARE @results TABLE (
   3:  ID int,
   4:  UserName nvarchar(50),
   5:  Email nvarchar(256),
   6:  HashedPasswod nvarchar(32),
   7:  Salt nvarchar(16)
   8: );
  10: -- fill it with the results from the SP
  11: INSERT INTO @results EXEC sp_GetAccounts;
  13: -- perform normal queries on the table
  14: SELECT UserName, Email 
  15: FROM @results
  16: WHERE Email LIKE ''
  17: ORDER BY UserName ASC;

We simply create a table variable compatible to the Account table to temporarily hold the results of the stored procedure. This table is filled using the SP (INSERTs work with SPs). We can then execute queries against that table variable as we like. By using a table variable (instead of a temporary table) everything is held in memory and we don't need to drop the table explicitly after we're done using it.

I'm not a SP guru, so if there are simpler ways to achieve the same result, drop me a comment.

Posted in: Databases | Snippets

Tags: , ,

NHibernate: counting database queries per web request

May 9, 2008 at 4:45 PMAndre Loker

While the upcoming NHibernate 2.0 supports statistics features out of the box, people using NHibernate 1.2 might as well be interested to determine the number of database queries that are executed on the database server.

In my specific case I was just interested in the number of DB queries per web requests, without any additional in-depth analysis. This was mostly meant as an easy detection of SELECT N+1 situations or other obvious bottlenecks. Luckily this can be quite easily done with a neat trick which is based on the fact that NHibernate uses log4net for log output. Ayende Rahien realized that NHibernate writes all executed statements to the NHibernate.SQL logger. So, to get informed over all executed database statements one would only have to implement a custom IAppender implementation and let it listen to the DEBUG messages NHibernate sends to NHibernate.SQL.

Within a few minutes I came up with this appender which counts the number of log messages - i.e. the number of SQL queries - per web request:

   1: using System.Web;
   2: using log4net.Appender;
   3: using log4net.Core;
   5: public class CountQueriesAppender : AppenderSkeleton {
   6:     private static readonly object Key = new object();
   8:     /// <summary>
   9:     /// Gets the number of database queries that have been executed
  10:     /// during this web request so far.
  11:     /// </summary>
  12:     /// <value>The current query count.</value>
  13:     public static int CurrentQueryCount {
  14:         get {
  15:             var val = HttpContext.Current.Items[Key];
  16:             return val == null ? 0 : (int) val;
  17:         }
  18:     }
  20:     protected override void Append(LoggingEvent loggingEvent) {
  21:         var items = HttpContext.Current.Items;
  22:         if (items.Contains(Key)) {
  23:             // increase query count
  24:             items[Key] = 1 + (int) items[Key];
  25:         } else {
  26:             // first query, initialize with 1
  27:             items[Key] = 1;
  28:         }
  29:     }
  30: }

I only need to configure log4net to send the appropriate messages:

   1: <!-- set additivity false to prevent output to other appenders -->
   2: <logger name="NHibernate.SQL" additivity="false">
   3:   <level value="DEBUG" />
   4:   <appender-ref ref="QueryCounter" />
   5: </logger>

That's about it. Now you can access the current query count with CountQueriesAppender.CurrentQueryCount everywhere you like. For example, I render the number of executed queries in the footer of each page in debug mode. Or you could write the count back into a log on end request.

Pay attention though to the moment when you read CurrentQueryCount. Obviously it returns only queries made until that very moment. If you query the value to soon, you might miss some queries.

You can make your appender as sophisticated as you like, of course, e.g. by dissecting the provided log message (i.e. the SQL query) to distinguish between SELECT, UPDATE, INSERT and DELETE queries. Be aware that NHibernate can come up with quite some tricky queries, though, that might not be easy to parse. You might therefore want to leave it at the simple query counter as a rough estimation for now and wait for the release of NHibernate 2.0.

Last but not least: here is an example of how one could use NHibernate 2.0 statistics with MonoRail

Posted in: Databases | NHibernate | Snippets

Tags: , ,

Tabellen mit bigint-Prim&auml;rschl&uuml;ssel in Access verkn&uuml;pfen

February 26, 2008 at 3:08 PMAndre Loker

Für ein Projekt benötigten wir folgendes Szenario: eine SQL Server Datenbank sollte mittels ODBC für Access verfügbar gemacht werden. Eine der Tabellen sollte dann in Access verwendet werden. Dazu bietet Access zwei Optionen:

  • Daten aus der Tabelle importieren. Hierbei wird der Inhalt der Tabelle in eine lokale Access-Datenbank kopiert
  • Tabelle mit der Access Datenbank verknüpfen. Hier wird lediglich eine Verknüpfung mit der Tabelle im SQL Server hergestellt, sodass die Daten immer aktuell sind.

Die erste Möglichkeit funktionierte auch tadellos, beim Verknüpfen der Tabelle trat jedoch ein Problem auf: sämtliche Spalten aller Reihen enthielten statt der erwarteten Werte nur ein "#gelöscht" (bzw. #deleted oder #verwijderd, je nach Sprachversion). Nach einigem Suchen sind wir darauf gestoßen, dass das Problem darin bestand, dass die Tabelle als Primärschlüssel einen bigint Datentypen verwendete. Damit kam Access bzw. der JET Treiber nicht zurecht und interpretiert den Schlüssel als Binärdaten bzw. als Text. KB321901 erwähnte dieses Problem bereits im Juni 2004. Warum Microsoft es bisher nicht geschafft hat, dieses Problem zu lösen, ist mir schleierhaft.

Auf der Suche nach einem Workaround wurde ich in einem Forenbeitrag fündig: man erstellt einfach eine Sicht (View) der gewünschten Daten und konvertiert in dieser Sicht den Primärschlüssel in einen varchar.

   1: CREATE VIEW vwTheTable AS 
   2: SELECT CAST(ID AS varchar(20)) AS AccessID, *
   3: FROM TheTable

Die Sicht kann man dann in Access verknüpfen. Wird man dabei nach dem Primärschlüssel gefragt, so wählt man die Spalte AccessID aus. Lesezugriff und Updates sollten damit funktionieren. Inserts haben wir bisher nicht getestet. Sollte Access dennoch die alte ID Spalte als Primärschlüssel auswählen, kann man die Spalte beim Erstellen der Sicht weglassen.

Posted in: Databases

Tags: ,