Getting rid of strings (1): meet the villain

June 10, 2008 at 1:30 PM, Andre Loker

Intro

This series covers the potential problems that arise from the use of string literals in source code. Avoiding string literals can make your source code more robust, more manageable and less fragile regarding refactoring. The articles in this series will show several solutions on how to replace string literals with smarter constructs.

The problem: meaningful strings

Strings in source code can be tricky. How tricky depends on their content and context. A "Hello, world!" that is printed somewhere is not much of a problem. The meaningful strings are the problematic ones. I consider a string meaningful if it contains the name of an entity in the source code (types, members etc.) or of a named resource (configuration settings, embedded resources, file names). Here are two examples of meaningful strings that can be tricky:

    string setting = ConfigurationManager.AppSettings["UserName"];
    MethodInfo info = typeof(Foo).GetMethod("Execute");

What's the deal with them? Generally, strings like these cause trouble if

  1. the string is used multiple times in the code
  2. the string literal has to be changed

In the first example the string "UserName" refers to the key of an application setting, found in the <appSettings> element in the application/web configuration file. If you were to change the key for whatever reason, you would have to hunt down all occurrences of "UserName" and replace them. While practically all modern IDEs and text editors provide a search and replace function over multiple files, blindly replacing all occurrences of "UserName" might also overwrite instances of "UserName" that are in no way related to the configuration setting. For example, "UserName" could be used as a filter criterion in an NHibernate query:

    public Account GetAccountByUserName(string userName) {
      ISession session = //.. get an NHibernate session
      return session.CreateCriteria(typeof(Account))
        .Add(Expression.Eq("UserName", userName))
        .UniqueResult<Account>();
    }

Changing "UserName" to anything else will most likely cause an error at runtime.

In the second example "Execute" references the name of a method of some fictional type Foo. Refactoring tools are widely available these days, so renaming Execute to something different is done quickly. However, if the string literal is not updated along with it, changing the method name will break the code: GetMethod("Execute") no longer finds the method and returns null, so the error again only shows up at runtime. While decent refactoring tools will try to find the symbol being renamed in strings, relying on this feature suffers from the same problem as the search and replace approach above.
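The runtime failure can be sketched as follows. This is only a minimal illustration, assuming that Foo's method has been renamed from Execute to Run while the string literal was left behind. The name Run and the helper class ReflectionDemo are made up for this sketch.

```csharp
using System;
using System.Reflection;

public class Foo {
  // Hypothetical: this method used to be called Execute and was renamed.
  public void Run() { }
}

public static class ReflectionDemo {
  // Returns true if Foo has a public method with the given name.
  public static bool HasMethod(string name) {
    MethodInfo info = typeof(Foo).GetMethod(name);
    // GetMethod does not throw for an unknown name, it returns null.
    return info != null;
  }

  public static void Main() {
    Console.WriteLine(HasMethod("Execute")); // stale literal, prints False
    Console.WriteLine(HasMethod("Run"));     // prints True
  }
}
```

The compiler accepts the stale literal without complaint. The problem only surfaces when the code runs, typically as a NullReferenceException as soon as the returned MethodInfo is used.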

How to handle strings: basic rules

The basic rules of using meaningful string literals in source code:

  1. Avoid them. If possible, simply don't use string literals which are meaningful and suffer from the problems described above.
  2. If the previous is not possible, at least remove all duplicate occurrences of the same string literal and replace them with a named symbol. Just like 'magic numbers', replace all string literals with the same meaning by a string constant with a descriptive name. For the first example above, one could introduce a static class containing the settings keys (see below)
  3. Hide the use of string literals and/or string constants behind an API. (see below)

For the first example (the app settings key), rule 2 could be applied like this:

    public static class AppSettingsKeys {
      public const string UserName = "UserName";
    }
    //...
    var userName = ConfigurationManager.AppSettings[AppSettingsKeys.UserName];

But maybe applying rules 2 and 3 together is even better:

    public static class AppSettings {
      const string UserNameKey = "UserName";

      public static string UserName {
        get { return Get(UserNameKey); }
      }

      static string Get(string key) {
        return ConfigurationManager.AppSettings[key];
      }
    }
    //...
    var userName = AppSettings.UserName;

The fact that we are dealing with strings to access the app settings is nicely hidden behind the API provided by AppSettings. (Note: a subsequent article of this series will show a much better way to access app settings).

In both examples, if the key of the "UserName" app setting changed, only one string would have to be updated (the value of AppSettingsKeys.UserName or of AppSettings.UserNameKey). The name of the string constant could be left alone, or it could easily be renamed using refactoring tools.
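To illustrate, here is a rough sketch of what such a key change might look like for the AppSettings class. It assumes the key was renamed to "LoginName" (a hypothetical name). To keep the sketch runnable without a configuration file, it uses a NameValueCollection with a made-up sample value as a stand-in for ConfigurationManager.AppSettings.

```csharp
using System;
using System.Collections.Specialized;

public static class AppSettings {
  // Stand-in for ConfigurationManager.AppSettings (assumption, so that
  // the sketch runs without a configuration file). Sample value is made up.
  static readonly NameValueCollection Source = new NameValueCollection {
    { "LoginName", "andre" }
  };

  // The key was renamed from "UserName" to "LoginName" (hypothetical).
  // This is the only string literal that had to change.
  const string UserNameKey = "LoginName";

  public static string UserName {
    get { return Get(UserNameKey); }
  }

  static string Get(string key) {
    return Source[key];
  }
}

public static class Program {
  public static void Main() {
    // Callers still use the same property and are unaffected.
    Console.WriteLine(AppSettings.UserName); // prints andre
  }
}
```

Whether the property itself should also be renamed to LoginName is a separate decision. Because it is a real symbol rather than a string, a refactoring tool can rename it safely.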

Conclusion

This article gives a fairly basic overview of the problems that can occur when dealing with meaningful string literals in source code. The next parts of this series will cover more sophisticated techniques to get rid of strings.

Posted in: C# | Patterns

Comments (2) -

Yep. strings are evil.
You also lose some of the benefits of the refactoring tools such as Resharper.

Specifically for AppSettings, I'd like to point you to the DictionaryAdapter in Castle:
using.castleproject.org/.../...s.DictionaryAdapter

Of course I was thinking of DictionaryAdapter (among other techniques) when writing the article :) That's why I wrote blog.andreloker.de/.../...e-DictionaryAdapter.aspx

The next article in this series will describe means to link AppSettings and DictionaryAdapter in a smart way.
