Arrange items in a list according to a list of keys

June 5, 2008 at 12:47 PMAndre Loker

Recently I needed an algorithm to change the order of elements in an item list of type IList<T>. The items had a unique key. The new order of the items was defined by a second collection which only contained the keys in the new order. Therefore, the algorithm had to move the elements in the IList<T> in such a way that their order matches the order of the key in the key collection.

To better illustrate the scenario, here's an example. Assume a domain object Person:

   1: public class Person {
   2:   public int ID { get; set; }
   3:   public string Name { get; set; }
   4: }

Say we have an array with Person instances:

   1: var people = new[]
   2: {
   3:   new Person{ID=1, Name="Jan"},
   4:   new Person{ID=2, Name="Piet"},
   5:   new Person{ID=3, Name="Klaas"},
   6: };

I now want to rearrange the order of the Person instances, given a second array which contains the keys of the items in the correct new order:

   1: var newOrderKeys = new[] {3, 1, 2};

That is, after rearranging the items, the following should hold:

   1: Assert.AreEqual(3, people[0].ID );
   2: Assert.AreEqual("Klaas", people[0].Name);
   3:  
   4: Assert.AreEqual(1, people[1].ID );
   5: Assert.AreEqual("Jan", people[1].Name);
   6:  
   7: Assert.AreEqual(2, people[2].ID );
   8: Assert.AreEqual("Piet", people[2].Name);

Here are some constraints and assumptions for the algorithm

  • The algorithm must not add or remove items from the list as it must be applicable to arrays (which do not allow appending or removing of items)
  • The keys must implement IEquatable<TKey>
  • Each key appear only once in the original list
  • The algorithm should gracefully handle errors in the key list, like
    • Duplicate keys
    • Too few or too many keys
    • Keys that don't exist in the original item list
  • If the number of distinct keys in the key list does not match the number of items in the item list, arrange only as many items as there are distinct valid keys (or at most itemList.Count items). If there are n valid distinct keys, the first min(n, itemList.Count) items should be arranged, all other items may be left in any order.

This was my first version:

   1: public static void Arrange<TItem, TKey>(this IList<TItem> items, IEnumerable<TKey> newOrderKeys, Func<TItem, TKey> keyExtractor)
   2:   where TKey : IEquatable<TKey> {
   3:   // index at which the next item will be placed
   4:   var writerIndex = 0;
   5:   foreach (var key in newOrderKeys.Distinct().TakeWhile(k => writerIndex < items.Count)) {
   6:     var index = -1;
   7:     // find the index of the item with the given key
   8:     for (var i = writerIndex; i < items.Count; ++i) {
   9:       // do keys match?
  10:       if (keyExtractor(items[i]).Equals(key)) {
  11:         index = i;
  12:         break;
  13:       }
  14:     }
  15:     if (index >=0) {
  16:       // swap the item with the one at the write index
  17:       // but only if the index of that item is higher than the 
  18:       // write index
  19:       if (index > writerIndex) {
  20:         var temp = items[index];
  21:         items[index] = items[writerIndex];
  22:         items[writerIndex] = temp;
  23:       }
  24:       // increase write index if element was found
  25:       ++writerIndex;
  26:     }
  27:   }
  28: }

It works like this:

  • First, remove duplicates from the newOrderKeys list (the key list), this is done with Distinct()
  • Limit the number of items to grab from the remaining list to he length of the item list lists. This is done with TakeWhile(). I could have used Take(items.Count), but this would count keys that are not found in the item list, so I chose to track the number of valid keys in writerIndex and compare this variable to items.Count. Note that Resharper gives a warning ("Access to modified closure") when accessing writerIndex in the lambda. In this case, this is no problem as the lambda invocation and increment of writerIndex take turns. Let's just say: it works :-)
  • For each item in the filtered and limited key list, find the index of the corresponding item in the item list items. All elements with an index less than writerIndex will have been arranged correctly, so start searching at index writerIndex.
  • If such an element was found, swap it with the item at writerIndex and increase writerIndex. As an optimization, do not swap if writerIndex == index.
  • The method is an extension method for IList<T>s, so usage is simple.

And here's how to use it:

   1: people.Arrange(newOrderKeys, p => p.ID);

Easy, isn't it? Just pass an enumeration with the keys in the new order and provide a function that maps from an item to a key.

Improvements

As you see the indices of the items are found by a linear search. For small lists this isn't a problem, for larger ones it can become really inefficient. A faster alternative is the use of a dictionary which stores the indices for each key in the item list. This dictionary can be easily created like this:

   1: var index = 0;
   2: var dict = items.ToDictionary(keyExtractor, item => index++);

This creates a dictionary of all keys in the item list mapped to the index in the list. Looking up the index is now easy and fast:

   1: if(!dict.TryGetValue(key, out index){
   2:   index = -1;
   3: }

I did some benchmarks and found that for lists with up to ~40 elements it is faster to simply use a linear search for the indices. Larger lists could benefit from the dictionary method. This looked like a good place for the Strategy Pattern. My final version (see attached file) therefore switched the strategy depending on the size of the collection.

Using a strategy for index lookup and refactoring element swapping out to a separate method the core of the arrange method becomes this:

   1: foreach (var key in newKeysOrder.Distinct().TakeWhile(k => writeIndex < items.Count)) {
   2:   var index = indexOf(items, writeIndex, key);
   3:   if (index >= writeIndex){
   4:     items.Swap(index, writeIndex++);
   5:   }
   6: }

indexOf is a delegate of type Func<IList<TItem>, int, TKey, int> that will return the index of the element in items that has the given key (given a start index as a hint).

What is it good for?

If you have read until here you might ask yourself where one would possibly use an algorithm like this. The place where I needed it is a web application which allowed the user to sort items using drag and drop (using jQuery's sortable feature). When the user has changed the order of the items in the browser, jQuery sends an AJAX request to the server that contains the keys of the items in the new sort order. The server uses the Arrange method to change the order of the items in its domain model.

I have put up a small example app that does a similar thing.

To the demo app.

You can change the order of the album tracks using simple drag and drop. jQuery will send the new order to the server which responds with the order as it is known by the server (this will allow you to compare client sort order with server sort order). The app uses MonoRail and the action that handles the AJAX request looks like this:

   1: public void UpdateOrder(string items) {
   2:   CancelView();
   3:   if (items == null) {
   4:     RenderText("Bad argument");
   5:   } else {
   6:     var album = GetAlbum();
   7:  
   8:     // parse ids from parameter
   9:     var ids = items.Split('&').Select(s => new Guid(s.Substring("item[]=".Length)));
  10:     // change order of album tracks according to order of GUIDs
  11:     album.Tracks.Arrange(ids, t => t.ID);
  12:  
  13:     RenderText("New order (on server): " + TitleOrderToList(album));
  14:   }
  15: }

The items parameters contains the IDs (GUIDs) of the tracks like this: "item[]=some-guid&itemp[]=some-guid&...".

The first cool thing is the way the Guids are extracted from the string, Then the Arrange method is used to let the model reflect the new sort order.

If you have any use for the code, use it, spread it, modify it any way you like. Don't blame me if explodes, though ;-)

Download the source code:

ArrangeList.cs (3.39 kb)

Update 6/9/2008:

JSK detected a small bug in the source code which caused the writeIndex not to be updated if an item already was in place. I fixed it in the article, the downloadable source code and the demo app. Thanks for spotting this!

Posted in: C# | Patterns | Snippets

Tags: , , ,