XmlSerializer and automatic collection properties

August 7, 2009 at 2:09 PMAndre Loker

I noticed an interesting detail in the way XmlSerializer handles collection properties. FxCop rule CA2227 suggest to make collection properties read-only, because XmlSerializer is treating collections differently as described here (scroll down to the “Note” box under “Overriding Default Serialization”). When I tried to serialize the following class, however, I got an error during serialization:

   1: public class SomeClass
   2: {
   3:   public SomeClass()
   4:   {
   5:     CollectionProperty = new List<int>();
   6:   }
   7:  
   8:   public List<int> CollectionProperty { get; private set; }
   9: }

The error was something like:

System.InvalidOperationException: Unable to generate a temporary class (result=1).

error CS0200: Property or indexer 'SomeClass.CollectionProperty' cannot be assigned to — it is read only.

Erm, right, I made it read-only because FxCop suggested it, so what’s wrong here? The answer is that “making the collection property read-only” should be read as “don’t provide a setter of any visibility”, because technically a property with a private setter is still writable, albeit from within the class only, leading to the runtime error as described above. FxCop on the other hand will stop complaining as soon as the setter is made private. 

As soon as I changed the automatic property to a property with backing field without a setter the application ran fine:

   1: public class SomeClass
   2: {
   3:   private readonly List<int> collectionProperty;
   4:  
   5:   public SomeClass()
   6:   {
   7:     collectionProperty = new List<int>();
   8:   }
   9:  
  10:   public List<int> CollectionProperty
  11:   {
  12:     get { return collectionProperty; }
  13:   }
  14: }

Posted in: C#

Tags: , ,

Getting rid of strings (3): take your app settings to the next level

September 5, 2008 at 1:23 PMAndre Loker

In the previous parts of this series (part 1, part 2) I talked about the problems with literal strings in source code and presented different strategies to avoid those problems. In this episode I'll explain how we can abstract from app settings and leverage the power of the Castle DictionaryAdapter to improve the way our applications access their app settings.

Basics

Probably the easiest way to make certain aspects of a .NET application configurable is by using app settings. You can define app settings by adding (or augmenting) an <appSettings> section to your web.config or app.config, depending on the project type. This could look something like this:

   1: <?xml version="1.0" encoding="utf-8" ?>
   2: <configuration>
   3:   <appSettings>
   4:     <add key="MaxUsers" value="20"/>
   5:     <add key="FeedbackMail" value="foo@localhost"/>
   6:   </appSettings>
   7: </configuration>

[As a small side note: you don't have to physically keep the appSettings section in the web.config/app.config file - read Keep your .config clean with external config files to learn how to move settings outside of the config file.]

To access the settings easily use the configuration API exposed by ConfigurationManager, specifically the AppSettings property. Here's a tiny app that accesses the values from the appSettings section:

   1: using System;
   2: using System.Configuration;
   3:  
   4: public class Program {
   5:   private static void Main() {
   6:     string feedbackMail = ConfigurationManager.AppSettings["FeedbackMail"];
   7:     int maxUsers = int.Parse(ConfigurationManager.AppSettings["MaxUsers"]);
   8:     Console.WriteLine("Feedback e-mail address: {0}", feedbackMail);
   9:     Console.WriteLine("Max users: {0}", maxUsers);
  10:   }
  11: }

As you see, that's fairly simple. Some points of interest:

  • ConfigurationManager lives in System.Configuration. You need to add the assembly System.Configuration.dll to your project to use this class
  • AppSettings has two indexers:
    • One indexer accepts a string: the string provided should be one of the keys as defined in the appSettings section. The value returned is the content of the value attribute of the respective appSettings entry or null if the key wasn't found. Keys are case insensitive by the way, so AppSettings["feedbackMAIL"] returns the same value as AppSettings["FeedbackMail"]. Note, however, that white space before or after the key does count, so AppSettings[" FeedbackMail"] will return null. Also be sure to check your config file for accidentally added white space within the key attribute if your app does not find certain keys.
    • The other indexer accepts an integer, which is the index of the app setting to return. Granted, I don't see much use for that indexer.
  • The values returned by the indexers are always strings. Therefore, if you want to have an app setting as an integer you'll need to parse it.

Let's analyse this basic approach:

  • We use strings to index the AppSettings property which - as you should know by now - is very problematic.
    • If you misspell the key, the value returned is null, but you won't notice it before runtime.
    • If you change the key in the app settings, you'll have to update all references to that key in your code. Remember that strings are hard to refactor.
  • Values are always returned as strings so you need to parse them if you need data of a type other than string

Improving the situation

If you read the first article of this series you know that there are two basic steps that can improve the situation:

  1. Avoid spreading literals all over the code, define and use constants instead
  2. Hide any string dependent code behind an appropriate API

Applying those rules might lead you to the idea to write a class that encapsulates the app settings. A good idea indeed! Here's a class that encapsulates the app settings of our tiny example:

   1: class MySettings {
   2:   // constants that define the keys in the app settings
   3:   private const string MaxUsersKey = "MaxUsers";
   4:   private const string FeedbackMailKey = "FeedbackMail";
   5:  
   6:   public static string FeedbackMail {
   7:     get { return ConfigurationManager.AppSettings[FeedbackMailKey]; }
   8:   }
   9:  
  10:   public static int MaxUsers {
  11:     get { return int.Parse(ConfigurationManager.AppSettings[MaxUsersKey]); }
  12:   }
  13: }

I made the properties static because the class is stateless. Now you can access the settings like this:

   1: public class Program {
   2:   private static void Main() {
   3:     string feedbackMail = MySettings.FeedbackMail;
   4:     int maxUsers = MySettings.MaxUsers;
   5:     Console.WriteLine("Feedback e-mail address: {0}", feedbackMail);
   6:     Console.WriteLine("Max users: {0}", maxUsers);
   7:   }
   8: }

This is much better! The strings are neatly hidden, so is the parsing of the integer. You can refactor the names of the properties easily using the refactoring tool of your choice.

Is this solution the final answer? Certainly not. This solution still suffers from some issues :

  • It's tedious to add and implement properties and possible parsing manually for each setting in the app settings section.
  • The settings are now strictly tied to ConfigurationManager. It's difficult to mock some app settings for unit testing using this approach.

Let me elaborate on the second issue: assume you have a class that is responsible for sending an email to the system administrator in case of an error. Here's a possible excerpt of such a system:

   1: // an interface we use to abstract sending of emails.
   2: public interface IMailSender {
   3:   void Send(string to, string subject, string text);
   4: }
   5:  
   6: // Sends error reports by mail
   7: public class ErrorReporter {
   8:   private readonly IMailSender mailSender;
   9:  
  10:   public ErrorReporter(IMailSender mailSender) {
  11:     this.mailSender = mailSender;
  12:   }
  13:  
  14:   public void SendErrorReport(string text) {
  15:     var email = MySettings.FeedbackMail;
  16:     mailSender.Send(email, "Application error", text);
  17:   }
  18: }
  19:  

It's certainly nice that we did not hardcode the email address that the report is sent to. But if we were to unit test the method we'd have to provide a value for the FeedbackMail app setting. (I'm using Rhino Mocks by the way)

   1: [Test]
   2: public void SendErrorReport_UsesMailSender() {
   3:   var mailSender = MockRepository.GenerateMock<IMailSender>();
   4:  
   5:   var text = "test message";
   6:   var email = "foo@localhost";
   7:  
   8:   // need to inject the email into the app settings
   9:   ConfigurationManager.AppSettings["FeedbackMail"] = email;
  10:  
  11:   var reporter = new ErrorReporter(mailSender);
  12:   reporter.SendErrorReport(text);
  13:  
  14:   mailSender.AssertWasCalled(x=>x.Send(email, "Application error", text));
  15: }

To make the test work we need to inject the expected email address into the app settings. I personally think that this situation is awkward and does certainly not isolate the unit test around one class under test. ErrorReporter uses MySettings which again uses ConfigurationManager. For my taste the number of classes involved in this unit test is unnecessarily high.

A solution

If you've never heard of Castle DictionaryAdapter, go and read my article on it, because this little tool will improve our solution a lot.

Here's what we do:

  • Delete MySettings, we won't use it anymore
  • Create instead an interface that represents your app settings:
   1: public interface ISettings {
   2:   int MaxUsers { get; }
   3:   string FeedbackMail { get; }
   4: }
  • Create a dictionary adapter that handles all the plumbing between ISettings and AppSettings:
   1: var factory = new DictionaryAdapterFactory();
   2: var adapter = factory.GetAdapter<ISettings>(ConfigurationManager.AppSettings);
  • Use this object instead of MySettings:
   1: public class Program {
   2:   private static void Main() {
   3:     var factory = new DictionaryAdapterFactory();
   4:     var settings = factory.GetAdapter<ISettings>(ConfigurationManager.AppSettings);
   5:     var feedbackMail = settings.FeedbackMail;
   6:     var maxUsers = settings.MaxUsers;
   7:     Console.WriteLine("Feedback e-mail address: {0}", feedbackMail);
   8:     Console.WriteLine("Max users: {0}", maxUsers);
   9:   }
  10: }

Again, we have a type safe interface for our app settings. But what did we gain?

First we have to use much less code to define the interface for our app settings: ISettings is a no-brainer, just define properties with the desired type and the name matching the app settings key [see this article on DictionaryAdapter to learn how this can be configured in detail]. The DictionaryAdapter will handle all necessary lookup and conversion for you. Less code on our side is A Good Thing™.

Secondly by using an interface we have created a better abstraction of the app settings. Combine this with the power of dependency injection and you've created a basis for well testable, highly configurable code. If you don't see what I mean, read on and let me elaborate.

Why this solution rocks

First of all change ErrorReporter to accept an ISettings instance and use it instead of MySettings:

   1: public class ErrorReporter {
   2:   private readonly IMailSender mailSender;
   3:   private readonly ISettings settings;
   4:  
   5:   // Inject ISettings as well
   6:   public ErrorReporter(IMailSender mailSender, ISettings settings) {
   7:     this.mailSender = mailSender;
   8:     this.settings = settings;
   9:   }
  10:  
  11:   public void SendErrorReport(string text) {
  12:     var email = settings.FeedbackMail; // use ISettings instead of MySettings
  13:     mailSender.Send(email, "Application error", text);
  14:   }
  15: }

Two changes have taken place: I inject the ISettings dependency through the constructor and I use this object to access the email address.

ISettings in action - unit test

Here's the updated version of the unit test from above:

   1: [Test]
   2: public void SendErrorReport_UsesMailSender() {
   3:   var mailSender = MockRepository.GenerateMock<IMailSender>();
   4:  
   5:   var text = "test message";
   6:   var email = "foo@localhost";
   7:  
   8:   // create a stub for ISettings
   9:   var settings = MockRepository.GenerateStub<ISettings>();
  10:   // let this stub's FeedbackMail property return a specific value
  11:   settings.Stub(x => x.FeedbackMail).Return(email);
  12:  
  13:   // inject the settings
  14:   var reporter = new ErrorReporter(mailSender, settings);
  15:   reporter.SendErrorReport(text);
  16:  
  17:   mailSender.AssertWasCalled(x=>x.Send(email, "Application error", text));
  18: }

This approach is far superior to the old approach because we only have to stub a direct dependency of ErrorReporter (ie. ISettings). There's no need to configure an indirect dependency like ConfigurationManager.AppSettings. Furthermore we could now make an ISettings mock to check whether ErrorReporter really uses the settings object. Short, we have the full control of the behaviour of the ISettings instance.

ISettings in action - the real app

To use ISettings and ErrorReporter in a real app, we'd preferably use an IoC container like Castle Windsor to configure the ErrorReporter instance.

Here's how you could configure the container:

   1: var container = new WindsorContainer();
   2: // we'll need an implementation of IMailSender
   3: container.Register(Component.For<IMailSender>().ImplementedBy<SomeClassImplementingIMailSender>());
   4:  
   5: // create the adapter
   6: var adapter = new DictionaryAdapterFactory().GetAdapter<ISettings>(ConfigurationManager.AppSettings);
   7: // register this adapter instance as the component for ISettings
   8: container.Register(Component.For<ISettings>().Instance(adapter));
   9:  
  10: // and finally register the error reporter
  11: container.Register(Component.For<ErrorReporter>());

Whenever we need an instance of ErrorReporter now we can ask the container for it (object locator approach):

   1: var reporter = container.Resolve<ErrorReporter>();
   2: reporter.SendErrorReport("Something went horribly wrong");

Alternatively you could let the IoC container inject an ErrorReporter instance where you need it (dependency injection approach).

Both ways we'll get an ErrorReporter instance that is readily configured to access the app settings in a transparent way.

Conclusion

By using the proposed combination of

  1. an interface defining the app settings
  2. a dictionary adapter that maps between the AppSettings and the interface and
  3. an IoC container that can inject the adapter wherever we need it

we were able to completely get rid of any app settings related strings and created an architecture that is easily configurable and well testable.

By abstracting from the appSettings in the app.config file it's also easy to use a different configuration source instead. DictionaryAdapter can wrap all kinds of dictionaries and NameValueCollections. And if you come to the conclusion that you'd rather want to store the app settings in a database, it's easy to do, as well. Instead of using the DictionaryAdapter you could implement ISettings in a way that loads app settings from a database.

Posted in: C# | Patterns

Tags: , , , ,

Getting rid of strings (2): use lambda expressions

June 12, 2008 at 12:08 PMAndre Loker

Intro

In the first article of this series I talked about the problems with strings in code. This article will show you how you can use lambda expressions and expression trees as another tool to avoid strings.

About Lambda Expressions

C# 3.0 brought a cool new feature call lambda expressions. On the one hand they are a nice abbreviation for anonymous delegates:

   1: Button b = /*..*/
   2: b.Click += (sender, e) => MessageBox(String.Format("{0} clicked", sender);

But there's an additional feature that might not be obvious to everyone. .NET 3.5 introduced the System.Linq.Expressions namespace which allows us to inspect a code expression tree. One special expression type is Expression<TDelegate> which derives from LambdaExpression. This expression type handles the expression tree represented by a lambda expression. Let's look at an example:

   1: public class Program {
   2:  
   3:   public static void ExpressionTest(Expression<Func<DateTime>> expression) {
   4:     Console.WriteLine("Expression body is '{0}'", expression.Body);
   5:     Console.WriteLine("Node type is {0}", expression.Body.NodeType);
   6:     Console.WriteLine("Expression body type is {0}", expression.Body.GetType());
   7:   }
   8:  
   9:   public static void Main(string[] args) {
  10:     ExpressionTest(() => DateTime.Now);
  11:   }
  12: }

ExpressionTest expects an expression representing a Func<DateTime>, that is a function that returns a DateTime. In the Main method ExpressionTest is invoked - not with an Expression<Func<DateTime>> but simply with a lambda expression with the Func<DateTime> signature. The C# 3.0 compiler will convert the lambda expression that is passed as argument to a expression tree with a top node of type Expression<Func<DateTime>>. The Body property of that expression contains the right hand side of the lambda expression.

Running this code prints:

   1: Expression body is 'DateTime.Now'
   2: Node type is MemberAccess
   3: Expression body type is System.Linq.Expressions.MemberExpression

The runtime type of the expression body is MemberExpression, which makes sense, because DateTime.Now represents access to a member (Now) of DateTime. Run the code in the debugger to see how the expression is represented in the expression tree.

I won't go into too much detail on expressions here. Browse through the documentation of the Expressions namespace to see what kind of expressions you can expect (and inspect for that matter).

How can lambda expressions help to avoid strings?

Expressions can be useful in situations where you need to provide the name of a member or a MethodInfo/PropertyInfo/FIeldInfo for that member. Ever needed to pass a MethodInfo somewhere? You'll most likely ended up with something like

   1: var info = typeof (DateTime).GetMethod("ToShortDateSting");
   2: Console.WriteLine(info.Name);

This compiles fine, of course, but when you run the code you'll get a null pointer exception at info.Name. Why? Because there's a typo in "ToShortDateSting". The method I was looking for is ToShortDateString (realize the 'r' in String). Did you see the typo at a glance? The situation gets worse if the name of the method changes, because now the code would break at runtime without being changed (see the first article to learn about problems with strings and refactoring).

Captain Lambda to the rescue

Here's an approach that is much more solid:

   1: var info = Reflect.GetMethod<DateTime>(dt => dt.ToShortDateString());
   2: Console.WriteLine(info.Name);

No strings attached so to speak. If you had a typo in ToShortDateString the code would not even compile. Additionally, the code is much more open to refactoring. But wait, how does it work? Here's the simple answer:

   1: public static class Reflect {
   2:   /// <summary>
   3:   /// Gets the MethodInfo for the method that is called in the provided <paramref name="expression"/>
   4:   /// </summary>
   5:   /// <typeparam name="TClass">The type of the class.</typeparam>
   6:   /// <param name="expression">The expression.</param>
   7:   /// <returns>Method info</returns>
   8:   /// <exception cref="ArgumentException">The provided expression is not a method call</exception>
   9:   public static MethodInfo GetMethod<TClass>(Expression<Action<TClass>> expression) {
  10:     var methodCall = expression.Body as MethodCallExpression;
  11:     if(methodCall == null) {
  12:       throw new ArgumentException("Expected method call");
  13:     }
  14:     return methodCall.Method;
  15:   }
  16: }

GetMethod expects an expression with a delegate of type Action<TClass>, that is a void method having a TClass as it's only argument. GetMethod checks that the expression passed in is a method call by casting the body to a MethodCallExpression. If the cast succeeds, the Method property already contains the MethodInfo that we were looking for. Thank you C# 3.0 compiler for doing the work for us :-)

A more practical example

In MonoRail the Controller class has a method called RedirectAction which - well - redirects the response to a new action. It expects the name of an action as its argument. So you might see code like this:

   1: public class HomeController : Controller {
   2:   
   3:   public void Index() {
   4:     if(!UserIsLoggedIn){
   5:         RedirectToAction("Login");
   6:     }
   7:   }
   8:  
   9:   public void Login() {
  10:   }
  11: }

This works fine but of course it suffers from all the string related problems I have been talking about so far. Let's see if we can use our new friend (Expression<TDelegate>) to improve the situation:

   1: public static class ControllerExtensions {
   2:   public static void RedirectToAction<TController>(this TController controller, Expression<Action<TController>> expression) where TController : Controller {
   3:     var methodCall = expression.Body as MethodCallExpression;
   4:     if (methodCall == null) {
   5:       throw new ArgumentException("Expected method call");
   6:     }
   7:     controller.RedirectToAction(methodCall.Method.Name);
   8:   }
   9: }

Now we have an extension method that we can use instead of the original RedirectToAction:

   1: public class HomeController : SmartDispatcherController {
   2:  
   3:   public void Index() {
   4:     // RedirectToAction("Login");
   5:     this.RedirectToAction(c => c.Login());
   6:   }
   7:  
   8:   public void Login(){
   9:   }
  10: }

Is this cool or what? Once again we got rid of a string. You can redirect to a method with parameters as well:

   1: public void Index() {
   2:   this.RedirectToAction(c => c.ShowItem(0));
   3: }
   4:  
   5: public void ShowItem(int id){
   6: }

You can pass any value to the "call" to ShowItem that you like. Remember: the expression is only examined but not executed. If you want to pass an actual value to the redirected action, create extension methods for the RedirectToAction overloads that accept parameters. I won't show this here because it is not too hard to implement (and I'm only showing some examples here anyway).

Properties

You can also get a PropertyInfo (and similarly a FieldInfo) using Lambdas, here's an example:

   1: public static class Reflect {
   2:   public static PropertyInfo GetProperty<TClass, TValue>(Expression<Func<TClass, TValue>> expression) {
   3:     var memberExpression = expression.Body as MemberExpression;
   4:     if (memberExpression == null || !(memberExpression.Member is PropertyInfo)) {
   5:       throw new ArgumentException("Expected property expression");
   6:     }
   7:     return (PropertyInfo) memberExpression.Member;
   8:   }    
   9: }
  10:  
  11: // use:
  12: var dayProperty = Reflect.GetProperty<DateTime, int>(dt => dt.Day);
  13: Console.WriteLine(dayProperty.Name);

NB: the code shown above only works for properties that are not write-only (otherwise you will not be able to "return" the property value in the expression). I don't consider this a big limitations. How many write-only properties have you written in the past two months?

In the example above we have to provide both the type of the class as well as the type of the property. It makes the code slightly less elegant. We can improve it like this:

   1: public static PropertyInfo GetProperty<TClass>(Expression<Func<TClass, object>> expression) {
   2:     MemberExpression memberExpression;
   3:     // if the return value had to be cast to object, the body will be an UnaryExpression
   4:     var unary = expression.Body as UnaryExpression;
   5:     if (unary != null) {
   6:       // the operand is the "real" property access
   7:       memberExpression = unary.Operand as MemberExpression;
   8:     } else {
   9:       // in case the property is of type object the body itself is the correct expression
  10:       memberExpression = expression.Body as MemberExpression;
  11:     }
  12:     // as before:
  13:     if (memberExpression == null || !(memberExpression.Member is PropertyInfo)) {
  14:       throw new ArgumentException("Expected property expression");
  15:     }
  16:     return (PropertyInfo) memberExpression.Member;
  17: }
  18:  
  19: var dayProperty = Reflect.GetProperty<DateTime>(dt => dt.Day);
  20: Console.WriteLine(dayProperty.Name);
  21:  

It's a bit more complicated, but still understandable.

Update 07/22/2008: RednaxelaFX came up with a third alternative. GetProperty stays the same as in the first (simpler) version, but we call the method differently:

   1: // use:
   2: var dayProperty = Reflect.GetProperty( (DateTime dt) => dt.Day);
   3: Console.WriteLine(dayProperty.Name);

By providing the type of the expression's argument explicitly the compiler is now able to infer the return value of the expression. Thanks RednaxelaFX for your comment!

Summing it up

Lambda expression trees are a great to tool that allows us to point to members without using strings. There are some limitations, though:

  • It only works for "compile time reflection". The expression tree is created during compilation, so you cannot get the name of "some member" of "some type" at runtime.
  • It won't work with static members
  • It will only work for members that are visible in the context where the expression is built. Non-public fields, properties and methods can therefore be tricky using this technique.

Still there are plenty of situations where you need to provide the MemberInfo of a public instance method/property (or it's name) known at compile time.

Posted in: C# | Patterns

Tags: , , ,

Getting rid of strings (1): meet the villain

June 10, 2008 at 1:30 PMAndre Loker

Intro

This series covers the potential problems that arise from the use of string literals in source code. Avoiding string literals can make your source code more robust, more manageable and less fragile regarding refactoring. The articles in this series will show several solutions on how to replace string literals with smarter constructs.

The problem: meaningful strings

Strings in source code can be a tricky thing. How much depends on the content and the context. A "Hello, world!" that is spit out somewhere is not that much of a problem. The meaningful strings are the problematic ones. I consider a string meaningful if it contains a named entity in the source code (types, members etc.) or a named resource (configuration settings, embedded resources, file names).  Here are two examples of meaningful strings that can be tricky:

   1: string setting = ConfigurationManager.AppSettings["UserName"];
   2: MethodInfo info = typeof (Foo).GetMethod("Execute");

What's the deal with them? Generally, strings like these cause trouble if

  1. the string is used multiple times in the code
  2. the string literal has to be changed

In the first example the string "UserName" refers the key of an application setting, found in the <appSettings> element in the application/web configuration file. If you were to change the key for whatever reason you would have to hunt down all occurrences of "UserName" and replace them. While practically all modern IDE's and text editors provide a search and replace function over multiple files, blindly replacing all occurrences of "UserName" might also override instances of "UserName" that are in no way related to the configuration setting. For example "UserName" could be used as a filter criterion in an NHibernate query:

   1: public Account GetAccountByUserName(string userName){
   2:   ISession session = //.. get an NHibernate session
   3:   return session.CreateCriteria(typeof(Account))
   4:     .Add(Expression.Eq("UserName", userName))
   5:     .UniqueResult<Account>();
   6: }

Changing "UserName" to anything else will most likely cause an error at runtime.

In the second example "Execute" references the name of a method of some fictional type Foo. Refactoring tools are widely available these days, so renaming Execute to something different is done quickly. Changing the method name will brake the code, again at runtime. While decent refactoring tools will try to find the symbol being renamed in strings, relying on this feature suffers from the same problem as the one above.

How to handle strings: basic rules

The basic rules of using meaningful string literals in source code:

  1. Avoid them. If possible, simply don't use string literals which are meaningful and suffer from the problems described above.
  2. If the previous is not possible, at least remove all duplicate occurrences of the same string literal and replace them with a named symbol. Just like 'magic numbers', replace all string literals with the same meaning by a string constant with a descriptive name. For the first example above, one could introduce a static class containing the settings keys (see below)
  3. Hide the use of string literals and/or string constants behind an API. (see below)

For the first example (the app settings key), rule 2 could be applied like this:

   1: public class AppSettingsKeys {
   2:   public const string UserName = "UserName";
   3: }
   4: //...
   5: var userName = ConfigurationManager.AppSettings[AppSettingsKeys.UserName];

But maybe applying rule 2 and 3 is even better:

   1: public static class AppSettings {
   2:   const string UserNameKey = "UserName";
   3:  
   4:   public static string UserName {
   5:     get{ return Get(UserNameKey);}
   6:   }
   7:  
   8:   static string Get(string key) {
   9:     return ConfigurationManager.AppSettings[key];
  10:   }
  11:  
  12: }
  13: //...
  14: var userName = AppSettings.UserName;

The fact that we are dealing with strings to access the app settings is nicely hidden behind the API provided by AppSettings. (Note: a subsequent article of this series will show a much better way to access app settings).

In both examples if the key of the "UserName" app setting changed, only one string had to be updated (AppSettingsKeys.UserName or AppSettings.UserNameKey). The name of the string constant could be left alone or it could be easily renamed using refactoring tools.

Conclusion

This article gives a fairly basic overview of the problems that can occur when dealing with meaningful string literals in source code. The next parts of this series will cover more sophisticated techniques to get rid of strings.

Posted in: C# | Patterns

Tags: , ,

Arrange items in a list according to a list of keys

June 5, 2008 at 12:47 PMAndre Loker

Recently I needed an algorithm to change the order of elements in an item list of type IList<T>. The items had a unique key. The new order of the items was defined by a second collection which only contained the keys in the new order. Therefore, the algorithm had to move the elements in the IList<T> in such a way that their order matches the order of the key in the key collection.

To better illustrate the scenario, here's an example. Assume a domain object Person:

   1: public class Person {
   2:   public int ID { get; set; }
   3:   public string Name { get; set; }
   4: }

Say we have an array with Person instances:

   1: var people = new[]
   2: {
   3:   new Person{ID=1, Name="Jan"},
   4:   new Person{ID=2, Name="Piet"},
   5:   new Person{ID=3, Name="Klaas"},
   6: };

I now want to rearrange the order of the Person instances, given a second array which contains the keys of the items in the correct new order:

   1: var newOrderKeys = new[] {3, 1, 2};

That is, after rearranging the items, the following should hold:

   1: Assert.AreEqual(3, people[0].ID );
   2: Assert.AreEqual("Klaas", people[0].Name);
   3:  
   4: Assert.AreEqual(1, people[1].ID );
   5: Assert.AreEqual("Jan", people[1].Name);
   6:  
   7: Assert.AreEqual(2, people[2].ID );
   8: Assert.AreEqual("Piet", people[2].Name);

Here are some constraints and assumptions for the algorithm

  • The algorithm must not add or remove items from the list as it must be applicable to arrays (which do not allow appending or removing of items)
  • The keys must implement IEquatable<TKey>
  • Each key appear only once in the original list
  • The algorithm should gracefully handle errors in the key list, like
    • Duplicate keys
    • Too few or too many keys
    • Keys that don't exist in the original item list
  • If the number of distinct keys in the key list does not match the number of items in the item list, arrange only as many items as there are distinct valid keys (or at most itemList.Count items). If there are n valid distinct keys, the first min(n, itemList.Count) items should be arranged, all other items may be left in any order.

This was my first version:

   1: public static void Arrange<TItem, TKey>(this IList<TItem> items, IEnumerable<TKey> newOrderKeys, Func<TItem, TKey> keyExtractor)
   2:   where TKey : IEquatable<TKey> {
   3:   // index at which the next item will be placed
   4:   var writerIndex = 0;
   5:   foreach (var key in newOrderKeys.Distinct().TakeWhile(k => writerIndex < items.Count)) {
   6:     var index = -1;
   7:     // find the index of the item with the given key
   8:     for (var i = writerIndex; i < items.Count; ++i) {
   9:       // do keys match?
  10:       if (keyExtractor(items[i]).Equals(key)) {
  11:         index = i;
  12:         break;
  13:       }
  14:     }
  15:     if (index >=0) {
  16:       // swap the item with the one at the write index
  17:       // but only if the index of that item is higher than the 
  18:       // write index
  19:       if (index > writerIndex) {
  20:         var temp = items[index];
  21:         items[index] = items[writerIndex];
  22:         items[writerIndex] = temp;
  23:       }
  24:       // increase write index if element was found
  25:       ++writerIndex;
  26:     }
  27:   }
  28: }

It works like this:

  • First, remove duplicates from the newOrderKeys list (the key list), this is done with Distinct()
  • Limit the number of items to grab from the remaining list to he length of the item list lists. This is done with TakeWhile(). I could have used Take(items.Count), but this would count keys that are not found in the item list, so I chose to track the number of valid keys in writerIndex and compare this variable to items.Count. Note that Resharper gives a warning ("Access to modified closure") when accessing writerIndex in the lambda. In this case, this is no problem as the lambda invocation and increment of writerIndex take turns. Let's just say: it works :-)
  • For each item in the filtered and limited key list, find the index of the corresponding item in the item list items. All elements with an index less than writerIndex will have been arranged correctly, so start searching at index writerIndex.
  • If such an element was found, swap it with the item at writerIndex and increase writerIndex. As an optimization, do not swap if writerIndex == index.
  • The method is an extension method for IList<T>s, so usage is simple.

And here's how to use it:

   1: people.Arrange(newOrderKeys, p => p.ID);

Easy, isn't it? Just pass an enumeration with the keys in the new order and provide a function that maps from an item to a key.

Improvements

As you see the indices of the items are found by a linear search. For small lists this isn't a problem, for larger ones it can become really inefficient. A faster alternative is the use of a dictionary which stores the indices for each key in the item list. This dictionary can be easily created like this:

   1: var index = 0;
   2: var dict = items.ToDictionary(keyExtractor, item => index++);

This creates a dictionary of all keys in the item list mapped to the index in the list. Looking up the index is now easy and fast:

   1: if(!dict.TryGetValue(key, out index){
   2:   index = -1;
   3: }

I did some benchmarks and found that for lists with up to ~40 elements it is faster to simply use a linear search for the indices. Larger lists could benefit from the dictionary method. This looked like a good place for the Strategy Pattern. My final version (see attached file) therefore switched the strategy depending on the size of the collection.

Using a strategy for index lookup and refactoring element swapping out to a separate method the core of the arrange method becomes this:

   1: foreach (var key in newKeysOrder.Distinct().TakeWhile(k => writeIndex < items.Count)) {
   2:   var index = indexOf(items, writeIndex, key);
   3:   if (index >= writeIndex){
   4:     items.Swap(index, writeIndex++);
   5:   }
   6: }

indexOf is a delegate of type Func<IList<TItem>, int, TKey, int> that will return the index of the element in items that has the given key (given a start index as a hint).

What is it good for?

If you have read until here you might ask yourself where one would possibly use an algorithm like this. The place where I needed it is a web application which allowed the user to sort items using drag and drop (using jQuery's sortable feature). When the user has changed the order of the items in the browser, jQuery sends an AJAX request to the server that contains the keys of the items in the new sort order. The server uses the Arrange method to change the order of the items in its domain model.

I have put up a small example app that does a similar thing.

To the demo app.

You can change the order of the album tracks using simple drag and drop. jQuery will send the new order to the server which responds with the order as it is known by the server (this will allow you to compare client sort order with server sort order). The app uses MonoRail and the action that handles the AJAX request looks like this:

   1: public void UpdateOrder(string items) {
   2:   CancelView();
   3:   if (items == null) {
   4:     RenderText("Bad argument");
   5:   } else {
   6:     var album = GetAlbum();
   7:  
   8:     // parse ids from parameter
   9:     var ids = items.Split('&').Select(s => new Guid(s.Substring("item[]=".Length)));
  10:     // change order of album tracks according to order of GUIDs
  11:     album.Tracks.Arrange(ids, t => t.ID);
  12:  
  13:     RenderText("New order (on server): " + TitleOrderToList(album));
  14:   }
  15: }

The items parameters contains the IDs (GUIDs) of the tracks like this: "item[]=some-guid&itemp[]=some-guid&...".

The first cool thing is the way the Guids are extracted from the string, Then the Arrange method is used to let the model reflect the new sort order.

If you have any use for the code, use it, spread it, modify it any way you like. Don't blame me if explodes, though ;-)

Download the source code:

ArrangeList.cs (3.39 kb)

Update 6/9/2008:

JSK detected a small bug in the source code which caused the writeIndex not to be updated if an item already was in place. I fixed it in the article, the downloadable source code and the demo app. Thanks for spotting this!

Posted in: C# | Patterns | Snippets

Tags: , , ,