Arrange items in a list according to a list of keys

June 5, 2008 at 12:47 PMAndre Loker

Recently I needed an algorithm to change the order of elements in an item list of type IList<T>. The items had a unique key. The new order of the items was defined by a second collection which only contained the keys in the new order. Therefore, the algorithm had to move the elements in the IList<T> in such a way that their order matches the order of the key in the key collection.

To better illustrate the scenario, here's an example. Assume a domain object Person:

   1: public class Person {
   2:   public int ID { get; set; }
   3:   public string Name { get; set; }
   4: }

Say we have an array with Person instances:

   1: var people = new[]
   2: {
   3:   new Person{ID=1, Name="Jan"},
   4:   new Person{ID=2, Name="Piet"},
   5:   new Person{ID=3, Name="Klaas"},
   6: };

I now want to rearrange the order of the Person instances, given a second array which contains the keys of the items in the correct new order:

   1: var newOrderKeys = new[] {3, 1, 2};

That is, after rearranging the items, the following should hold:

   1: Assert.AreEqual(3, people[0].ID );
   2: Assert.AreEqual("Klaas", people[0].Name);
   3:  
   4: Assert.AreEqual(1, people[1].ID );
   5: Assert.AreEqual("Jan", people[1].Name);
   6:  
   7: Assert.AreEqual(2, people[2].ID );
   8: Assert.AreEqual("Piet", people[2].Name);

Here are some constraints and assumptions for the algorithm

  • The algorithm must not add or remove items from the list as it must be applicable to arrays (which do not allow appending or removing of items)
  • The keys must implement IEquatable<TKey>
  • Each key appear only once in the original list
  • The algorithm should gracefully handle errors in the key list, like
    • Duplicate keys
    • Too few or too many keys
    • Keys that don't exist in the original item list
  • If the number of distinct keys in the key list does not match the number of items in the item list, arrange only as many items as there are distinct valid keys (or at most itemList.Count items). If there are n valid distinct keys, the first min(n, itemList.Count) items should be arranged, all other items may be left in any order.

This was my first version:

   1: public static void Arrange<TItem, TKey>(this IList<TItem> items, IEnumerable<TKey> newOrderKeys, Func<TItem, TKey> keyExtractor)
   2:   where TKey : IEquatable<TKey> {
   3:   // index at which the next item will be placed
   4:   var writerIndex = 0;
   5:   foreach (var key in newOrderKeys.Distinct().TakeWhile(k => writerIndex < items.Count)) {
   6:     var index = -1;
   7:     // find the index of the item with the given key
   8:     for (var i = writerIndex; i < items.Count; ++i) {
   9:       // do keys match?
  10:       if (keyExtractor(items[i]).Equals(key)) {
  11:         index = i;
  12:         break;
  13:       }
  14:     }
  15:     if (index >=0) {
  16:       // swap the item with the one at the write index
  17:       // but only if the index of that item is higher than the 
  18:       // write index
  19:       if (index > writerIndex) {
  20:         var temp = items[index];
  21:         items[index] = items[writerIndex];
  22:         items[writerIndex] = temp;
  23:       }
  24:       // increase write index if element was found
  25:       ++writerIndex;
  26:     }
  27:   }
  28: }

It works like this:

  • First, remove duplicates from the newOrderKeys list (the key list), this is done with Distinct()
  • Limit the number of items to grab from the remaining list to he length of the item list lists. This is done with TakeWhile(). I could have used Take(items.Count), but this would count keys that are not found in the item list, so I chose to track the number of valid keys in writerIndex and compare this variable to items.Count. Note that Resharper gives a warning ("Access to modified closure") when accessing writerIndex in the lambda. In this case, this is no problem as the lambda invocation and increment of writerIndex take turns. Let's just say: it works :-)
  • For each item in the filtered and limited key list, find the index of the corresponding item in the item list items. All elements with an index less than writerIndex will have been arranged correctly, so start searching at index writerIndex.
  • If such an element was found, swap it with the item at writerIndex and increase writerIndex. As an optimization, do not swap if writerIndex == index.
  • The method is an extension method for IList<T>s, so usage is simple.

And here's how to use it:

   1: people.Arrange(newOrderKeys, p => p.ID);

Easy, isn't it? Just pass an enumeration with the keys in the new order and provide a function that maps from an item to a key.

Improvements

As you see the indices of the items are found by a linear search. For small lists this isn't a problem, for larger ones it can become really inefficient. A faster alternative is the use of a dictionary which stores the indices for each key in the item list. This dictionary can be easily created like this:

   1: var index = 0;
   2: var dict = items.ToDictionary(keyExtractor, item => index++);

This creates a dictionary of all keys in the item list mapped to the index in the list. Looking up the index is now easy and fast:

   1: if(!dict.TryGetValue(key, out index){
   2:   index = -1;
   3: }

I did some benchmarks and found that for lists with up to ~40 elements it is faster to simply use a linear search for the indices. Larger lists could benefit from the dictionary method. This looked like a good place for the Strategy Pattern. My final version (see attached file) therefore switched the strategy depending on the size of the collection.

Using a strategy for index lookup and refactoring element swapping out to a separate method the core of the arrange method becomes this:

   1: foreach (var key in newKeysOrder.Distinct().TakeWhile(k => writeIndex < items.Count)) {
   2:   var index = indexOf(items, writeIndex, key);
   3:   if (index >= writeIndex){
   4:     items.Swap(index, writeIndex++);
   5:   }
   6: }

indexOf is a delegate of type Func<IList<TItem>, int, TKey, int> that will return the index of the element in items that has the given key (given a start index as a hint).

What is it good for?

If you have read until here you might ask yourself where one would possibly use an algorithm like this. The place where I needed it is a web application which allowed the user to sort items using drag and drop (using jQuery's sortable feature). When the user has changed the order of the items in the browser, jQuery sends an AJAX request to the server that contains the keys of the items in the new sort order. The server uses the Arrange method to change the order of the items in its domain model.

I have put up a small example app that does a similar thing.

To the demo app.

You can change the order of the album tracks using simple drag and drop. jQuery will send the new order to the server which responds with the order as it is known by the server (this will allow you to compare client sort order with server sort order). The app uses MonoRail and the action that handles the AJAX request looks like this:

   1: public void UpdateOrder(string items) {
   2:   CancelView();
   3:   if (items == null) {
   4:     RenderText("Bad argument");
   5:   } else {
   6:     var album = GetAlbum();
   7:  
   8:     // parse ids from parameter
   9:     var ids = items.Split('&').Select(s => new Guid(s.Substring("item[]=".Length)));
  10:     // change order of album tracks according to order of GUIDs
  11:     album.Tracks.Arrange(ids, t => t.ID);
  12:  
  13:     RenderText("New order (on server): " + TitleOrderToList(album));
  14:   }
  15: }

The items parameters contains the IDs (GUIDs) of the tracks like this: "item[]=some-guid&itemp[]=some-guid&...".

The first cool thing is the way the Guids are extracted from the string, Then the Arrange method is used to let the model reflect the new sort order.

If you have any use for the code, use it, spread it, modify it any way you like. Don't blame me if explodes, though ;-)

Download the source code:

ArrangeList.cs (3.39 kb)

Update 6/9/2008:

JSK detected a small bug in the source code which caused the writeIndex not to be updated if an item already was in place. I fixed it in the article, the downloadable source code and the demo app. Thanks for spotting this!

Posted in: C# | Patterns | Snippets

Tags: , , ,

.NET 3.5 setup woes

June 5, 2008 at 1:32 AMAndre Loker

Today I tried to install .NET 3.5 on one of my web servers. It's a virtual server running Windows Server 2003 Standard x64 under Virtuozzo. I downloaded the complete redistributable (197 MB). The setup, however, failed with an error message that was not too comprehensible:

[06/05/08,00:51:10] XPSEPSC x64 Installer: [2] Error code 1603 for this component means "Fatal error during installation."
[06/05/08,00:51:10] XPSEPSC x64 Installer: [2] Setup Failed on component XPSEPSC x64 Installer

So there was something wrong with the XPSEPSC component (until know I never even heard of it). After digging through heaps of installation logs I found a file called dd_XPS_x64.txt in %USER%\Local Settings\Temp\28. That file appeared to be the installation log file for the XPSEPSC component. At least it was a bit more chatty:

3.078: XpsEPSC Setup encountered an error:  Setup could not verify the integrity of the file Update.inf.  Make sure the Cryptographic service is running on this computer.

Nice clue, I thought, expect that the cryptographic service was already running in automatic mode. At least I somehow managed to find this knowledge base article that referred to the error message. Already the first method that was described in there helped a lot:

Method 1: Rename the Edb.log file

Rename the Edb.log file, and then try to install the program again. To rename the Edb.log file, follow these steps:

1. Click Start, click Run, type cmd in the Open box, and then OK.
Note On a Windows Vista-based computer, click Start, type cmd in the Start Search text box, right-click cmd.exe, and then click Run as administrator.

2.At the command prompt, type the following command, and then press ENTER:

ren %systemroot%\system32\catroot2\Edb.log *.tst

After renaming Edb.log I ran the setup again. During installation a different error message popped up complaining about the VC++ runtime:

The application has requested the Runtime to terminate it in an unusual way.

I already thought setup was to fail again, but interestingly it completed successfully and told me that .NET 3.5 was installed correctly. Really, I had never come up with this myself, so I'm really glad I found the KB article.

By the way: the reason why the VC++ runtime error occurred was likely that the printer spooler service was disabled (and it is on my web server). Right, the printer spooler. Who on earth should predict that? At least there are blog posts about it, e.g. here and here. Just out of curiosity I started the printer spooler, re-ran the .NET 3.5 installer and chose the repair function (go get some coffee, it takes aeons to complete). At least during the repair no error occurred, so I assume everything's fine now.

Posted in: Other | Tools

Tags: , ,

Feels like the 90s: disabling right mouse button with JS

June 2, 2008 at 2:27 PMAndre Loker

It's Monday again and I feel like ranting. I can't believe that there are still web sites out there that try to "protect" their precious (?) content by trying to disable the right mouse button with JavaScript. Never seen it? Those pages hook up the mouse down or mouse up event with JavaScript. When you use the right mouse button to bring up the context menu on those pages (e.g. to copy&paste text, save images, view the source code etc.) instead an alert box pops up telling you smart things like "Please don't steal our content" or "If you are interested in our web site please contact our web master". Those protection mechanisms are remains of the late 90s when people discovered that JavaScript was sooo cool.

So here's to you, all you self-designated web masters:

  • First of all, everything the user of your web site sees has been downloaded to his/her computer anyway and can most likely be found in a cache somewhere on the hard disk
  • Disabling the right mouse button does not prevent anything that could not be done otherwise in the browser
    • Most keyboards have a context menu key
    • The same functions are most likely available as normal menu commands...
    • ... and by keyboard shortcut
  • One can easily disable JavaScript in the browser
  • And finally, probably your content isn't worth ripping anyway

Therefore, this kind of protection is no protection whatsoever. It is only

  • plain stupid
  • annoying people who want to copy information like links or e-mail addresses that really would be useful to copy
  • preventing users from taking your site seriously and coming back

So please, please, dear web masters: stop doing stupid things to your web site and its visitors. Or stop making web sites at all :-)

And now: back to work

Posted in: Other

Tags: , ,

NHibernate: hiding ID setters

May 27, 2008 at 2:36 PMAndre Loker

Encapsulation is a key concern in object oriented development. Sometimes you have to break encapsulation a bit to make your code interoperable with the infrastructure. In this article I explain how to make a domain entity play well with NHibernate while keeping an encapsulated interface towards the user.

A brief introduction to access methods in NHibernate

Normally I prefer to map fields to database fields rather than properties. This allows me to put logic into the getter and/or setter of the properties if required while persisting the "raw" state of the fields to and from the database. To map fields instead of properties use the access attribute on the respective property element in the mapping file. You can also define the default access method by setting the default-access attribute on the class element. My mapping files generally look like this

   1: <?xml version="1.0" encoding="utf-8" ?>
   2: <hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
   3:                    assembly="TheAssembly"
   4:                    namespace="SomeNamespace"
   5:                    default-access="field.camelcase"
   6:                    default-lazy="true">
   7:  
   8:   <class name="TheClass" proxy="ITheClassProxyInterface">
   9:     <id name="ID" access="property">
  10:       <generator class="native"/>
  11:     </id>
  12:  
  13:     <property name="SomeProperty"/>
  14:     <!-- etc -->
  15:   </class>
  16: </hibernate-mapping>

In line 5 I define the default access to be "fields with camel case naming". This means that a property in the mapping file named "SomeProperty" (line 13) will be mapped to a field named someProperty.

You might have noticed that I explicitly set the access strategy to "property" for the ID. This is necessary when you use proxy interfaces. NHibernate needs to be able to access the ID property even if only a proxy is available. If you try to use field access for the ID property you will most likely get an error message like this:

Could not find field 'id' in class 'DessinDB.Process.IDesign'

So we only have to set access to property and everything is cool? Almost.

The dilemma

Let's start with some code:

   1: /// <summary>
   2: /// Proxy interface
   3: /// </summary>
   4: public interface ITestEntity {
   5:   int ID { get; }
   6: }
   7:  
   8: /// <summary>
   9: /// Entity class
  10: /// </summary>
  11: public class TestEntity : ITestEntity {
  12:   private int id;
  13:  
  14:   public int ID {
  15:     get { return id; }
  16:   }
  17: }

And here's the mapping file:

   1: <?xml version="1.0" encoding="utf-8" ?>
   2: <hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
   3:                    assembly="NTest"
   4:                    namespace="NHTest"
   5:                    default-access="field.camelcase"
   6:                    default-lazy="true">
   7:  
   8:   <class name="TestEntity" proxy="ITestEntity">
   9:     <id name="ID" access="property" >
  10:       <generator class="native"/>
  11:     </id>
  12:     
  13:   </class>
  14: </hibernate-mapping>

Really simple, still it won't work: NHibernate complains that it cannot find the setter for ID.

Could not find a setter for property 'ID' in class 'NHTest.TestEntity'

Adding a setter to TestEntity.ID is trivial. When we use native ID generation we normally would not want the property to be settable by user code.  As long as we only deal with ITestEnties in our app, this is not problematic. However, NHibernate is still complaining:

Could not find a setter for property 'ID' in class NHTest.ITestEntity'

Adding the setter to the interface is easy, but will also allow any client to set the ID manually - something we do not want.

The solution

I came up with a solution to the problem which is simple, easy to reuse and works without to much "hacking".

First I wrote an interface that provides read/write access to the ID:

   1: /// <summary>
   2: /// Interface for objects that have an id.
   3: /// </summary>
   4: /// <typeparam name="TId">The type of the id.</typeparam>
   5: public interface IIdentified<TId> {
   6:   /// <summary>
   7:   /// Gets or sets the ID.
   8:   /// </summary>
   9:   /// <value>The ID.</value>
  10:   TId ID { get; set; }
  11: }

I then introduced an interface that derives from IIdentified and which hides the setter of its parent:

   1: /// <summary>
   2: /// Interface for domain entities.
   3: /// </summary>
   4: /// <typeparam name="TId">The type of the id.</typeparam>
   5: public interface IEntity<TId> : IIdentified<TId>{
   6:   /// <summary>
   7:   /// Gets the ID.
   8:   /// </summary>
   9:   /// <value>The ID.</value>
  10:   /// <remarks>This property hides <see cref="IIdentified{TId}.ID"/></remarks>
  11:   new TId ID { get; }
  12: }

Now I introduced a convenient abstract base class that implements IEntity<ID> (and indirectly IIdentified<ID>)

   1: public abstract class Entity<TId> :IEntity<TId> {
   2:   private TId id;
   3:   public TId ID { get { return id; }}
   4:  
   5:   TId IIdentified<TId>.ID {
   6:     get { return id; }
   7:     set { id = value; }
   8:   }
   9:  
  10: }

By implementing IIdentified<TId>.ID explicitly I hide the setter from users of IEntity<TId> and Entity<TId>. I can now derive my proxy interface and the entity class from IEntity and Entity:

   1: /// <summary>
   2: /// Proxy interface
   3: /// </summary>
   4: public interface ITestEntity : IEntity<int> {
   5: }
   6:  
   7: /// <summary>
   8: /// Entity class
   9: /// </summary>
  10: public class TestEntity : Entity<int>, ITestEntity {
  11: }

NHibernate is smart enough to detect that IIdentified provides a setter to ID. On the other hand, users of IEntity and Entity will only see the read-only version of ID.

Using ITestEntity:

read-only-id

Using TestEntity:

read-only-id-entity

Of course the setter is available as soon as you cast the entity to an IIdentified<int>:

read-write-id

On the one hand this allows us to inject the ID for example in unit tests. On the other hand it is not very likely that the user will deal with IIdentified<TId> references. So unless the user explicitly converts the entity to an IIdentified<TId> he/she won't see the setter. For my scenarios this is certainly enough.

Posted in: NHibernate | Snippets

Tags: , ,

MonoRail, AspView and ReSharper: skip ViewAtDesignTime

May 24, 2008 at 1:53 AMAndre Loker

AspView and Intellisense

AspView is a MonoRail viewengine which provides compile time checking of views and Intellisense support in Visual Studio. The latter is achieved by using the known ASP.NET WebForms syntax to define views, see below for an example:

aspview_viewatdesigntime

It uses the @Page directive and sets the Inherits attribute to a base class called ViewAtDesignTime defined by AspView. This base class contains fake methods and properties that are similar to the ones that are available to the runtime view class - Caslte.MonoRail.Views.AspView.AspViewBase. The only purpose of the ViewAtDesignTimeClass is to make Visual Studio's Intellisense work. It does, as can be seen in the picture. Additionally the user can provide a custom pair of design time and runtime classes that derive from ViewAtDesignTime and AspViewBase.

However, there is a caveat. ViewAtDesignTime derives from System.Web.UI.Page. This is a requirement of Visual Studio to make Intellisense work. As a result Intellisense will show all members of Page, which is at least confusing. More likely, it's misleading as many of the Members won't be available in AspViewBase and will let compilation fail.

aspview_intellisense

In the example above Intellisense shows members like DataBind which will not be available during compilation.

ReSharper to the rescue: skip ViewAtDesignTime

However, as Ken Egozi pointed out on the Castle Developer mailing list, if you use ReSharper (and you should) you can skip the ViewAtDesignTime class altogether and just use the AspViewBase class (or a derived class) in the view. ReSharper will still provide Intellisense. The list of members is much more reliable as those members will definitively be available during compilation.

aspview_resharper

Advantages of using the AspViewBase based class instead of ViewAtDesignTime:

  • Get better Intellisense: Page members are ommitted
  • Less code: if you provide your own base classes for views because you want to provide some extra members to the view (a good example would be the code generated by CodeGenerator) you do not have to provide a design time counterpart as well.
  • More robust code: similarly, if you have to provide a runtime class and a design time class you have to ensure that both interfaces match exactly, otherwise the design support is misleading (this problem does actually happen). Without the design time view, only one interface has to be maintained.

Disadvantages:

  • No Intellisense on dev machines without ReSharper (which is a good excuse to ask your boss for a R# licence anyway)

Thanks to Ken Egozi for this neat hint.

Posted in: Castle | Tools

Tags: , , ,