Sorting generic lists using IComparer<T>, IComparable<T>, and the Comparison<T> delegate

Filed Under (.Net) by John Miller on 31-01-2009

Tagged Under : , ,

Within .Net 3.5, we were given Linq which makes working with generic lists a piece of cake. But if you’re like me, you probably still find yourself supporting a lot of code bases that have been written in the 2.0 framework for clients that aren’t ready to move to 3.5. However, .Net 2.0 did bring some cool sorting features that give us some flexibility when working with generic lists.

The base class library gives us several different ways to apply custom sorting to generic lists being held in memory. The System.Collections.Generic.List class has a sort method that handles the basic sorting logic. However, this method needs to be told how to compare the objects within the list, which can be done by in a variety of ways.  It also includes an overload that allows us to inject the sorting logic, which is a great implementation of the Strategy pattern.

IComparable<T>

The default method of applying comparison logic to the sort method is to simply ensure that the objects in your list implement the IComparable<T> interface. This is already done for you with many types in the BCL. You can sort List<string> and List<int> with no extra work. For custom classes, we’d need to implement IComparable<T> manually.

For example, let’s say we have a City class that we want to be able to sort within a list. We’d need to implement IComparable<T> within the City class:

    public class City : IComparable<City>
    {
        private string _name;
        private decimal _medianResidentAge;
        private decimal _medianHouseholdIncome;
        private int _population;
        private decimal _medianHouseValue;
        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }
        public decimal MedianResidentAge
        {
            get { return _medianResidentAge; }
            set { _medianResidentAge = value; }
        }
        public decimal MedianHouseholdIncome
        {
            get { return _medianHouseholdIncome; }
            set { _medianHouseholdIncome = value; }
        }
        public int Population
        {
            get { return _population; }
            set { _population = value; }
        }
        public decimal MedianHouseValue
        {
            get { return _medianHouseValue; }
            set { _medianHouseValue = value; }
        }
        public override string ToString()
        {
            StringBuilder text = new StringBuilder();
            text.Append(Name);
            text.AppendLine();
            text.AppendFormat("Med. Resident Age: {0}",
                                MedianResidentAge);
            text.AppendLine();
            text.AppendFormat("Med. Household Income: {0}",
                                MedianHouseholdIncome);
            text.AppendLine();
            text.AppendFormat("Population: {0}",
                                Population);
            text.AppendLine();
            text.AppendFormat("Med. House Value: {0}",
                                MedianHouseValue);
            text.AppendLine();
            return string.Format(text.ToString());
        }
        public int CompareTo(City other)
        {
            return Name.CompareTo(other.Name);
        }
    }

Note that the interface introduces a CompareTo method that returns an integer value based on the comparison of the instance object with the “other” object passed in as a parameter. This will return 1 if the instance object is greater than the other object, 0 if they are the same, or -1 if the instance object is less than the other. In this case, we’re simply using the City name for our comparison which causes the cities to be sorted alphabetically. Lets create a list a of cities and attempt to sort them.

        static void Main(string[] args)
        {
            List<City> cities = LoadCities();
            cities.Sort();
            foreach (City city in cities)
                Console.WriteLine(city + Environment.NewLine);
        }
        private static List<City> LoadCities()
        {
            List<City> cities = new List<City>();
            City chardon = new City();
            chardon.Name = "Chardon";
            chardon.MedianResidentAge = new decimal(37.40);
            chardon.Population = 5236;
            chardon.MedianHouseholdIncome = new decimal(51490.00);
            chardon.MedianHouseValue = new decimal(193213.00);
            cities.Add(chardon);
            City cleveland = new City();
            cleveland.Name = "Cleveland";
            cleveland.MedianResidentAge = new decimal(33.00);
            cleveland.Population = 438042;
            cleveland.MedianHouseholdIncome = new decimal(28512.00);
            cleveland.MedianHouseValue = new decimal(89700.00);
            cities.Add(cleveland);
            City massillon = new City();
            massillon.Name = "Massillon";
            massillon.MedianResidentAge = new decimal(37.60);
            massillon.Population = 32361;
            massillon.MedianHouseholdIncome = new decimal(36899.00);
            massillon.MedianHouseValue = new decimal(105979.00);
            cities.Add(massillon);
            City akron = new City();
            akron.Name = "Akron";
            akron.MedianResidentAge = new decimal(34.20);
            akron.Population = 207934;
            akron.MedianHouseholdIncome = new decimal(32928.00);
            akron.MedianHouseValue = new decimal(94100.00);
            cities.Add(akron);
            City twinsburg = new City();
            twinsburg.Name = "Twinsburg";
            twinsburg.MedianResidentAge = new decimal(36.40);
            twinsburg.Population = 17468;
            twinsburg.MedianHouseholdIncome = new decimal(68965.00);
            twinsburg.MedianHouseValue = new decimal(226383.00);
            cities.Add(twinsburg);
            return cities;
        }

SortResult1

IComparer<T>

But what if we want to be able to sort on more than one field? Or maybe give the option to sort in ascending or descending order? One option is to create an independent class that implements the IComparer<T> interface.

    public class PopulationComparer : IComparer<City>
    {
        public int Compare(City x, City y)
        {
            return x.Population.CompareTo(y.Population);
        }
    }

When we call the Sort() method on the list, we pass it an instance of the PopulationComparer object.

        static void Main(string[] args)
        {
            List<City> cities = LoadCities();
            cities.Sort(new PopulationComparer());
            foreach (City city in cities)
                Console.WriteLine(city + Environment.NewLine);
        }

SortResult2

We could also add logic to the comparer to sort the list in descending order instead of ascending.

    public enum SortDirection
    {
        Ascending,
        Descending
    }
    public class PopulationComparer : IComparer<City>
    {
        private SortDirection _sortDirection = SortDirection.Ascending;
        public PopulationComparer()
        {
        }
        public PopulationComparer(SortDirection sortDirection)
        {
            _sortDirection = sortDirection;
        }
        public int Compare(City x, City y)
        {
            int result = x.Population.CompareTo(y.Population);
            if (_sortDirection == SortDirection.Descending) result *= -1;
            return result;
        }
    }

And our sort call would look like this.

        static void Main(string[] args)
        {
            List<City> cities = LoadCities();
            cities.Sort(new PopulationComparer(SortDirection.Descending));
            foreach (City city in cities)
                Console.WriteLine(city + Environment.NewLine);
        }

And the cities are printed sorted by population in descending order.

SortResult3

But what if you want to sort on multiple properties? There are a few different ways this could be done but one of my favorite techniques was one I learned from JP’s Nothin’ But .Net Bootcamp. (And if you ever have the opportunity to take one of his classes, don’t hesitate. By far the most valuable experience I’ve had in my career!). To do this, we’ll create an abstract ChainedComparer class that our other custom comparers will derive from.

    public interface IChainedComparer<T> : IComparer<T>
    {
        int DoCompare(T x, T y);
        IChainedComparer<T> Then(IChainedComparer<T> nextComparer);
    }
    public abstract class ChainedComparer<T> : IChainedComparer<T>
    {
        private IChainedComparer<T> _nextComparer;
        private SortDirection _sortDirection = SortDirection.Ascending;
        public abstract int DoCompare(T x, T y);
        protected ChainedComparer()
        {
        }
        protected ChainedComparer(SortDirection sortDirection)
        {
            _sortDirection = sortDirection;
        }
        public IChainedComparer<T> Then(IChainedComparer<T> nextComparer)
        {
            _nextComparer = nextComparer;
            return this;
        }
        public int Compare(T x, T y)
        {
            int ret = DoCompare(x, y);
            if (ret == 0 && _nextComparer != null)
                ret = _nextComparer.Compare(x, y);
            if (_sortDirection == SortDirection.Descending)
                ret *= -1;
            return ret;
        }
    }

And we’ll create a couple of custom comparers that derive from ChainedComparer.

    public class PopulationComparer : ChainedComparer<City>
    {
        public override int DoCompare(City x, City y)
        {
            return x.Population.CompareTo(y.Population);
        }
        public PopulationComparer(SortDirection sortDirection)
            : base(sortDirection)
        { }
    }
    public class MedianHouseValueComparer : ChainedComparer<City>
    {
        public override int DoCompare(City x, City y)
        {
            return x.MedianHouseValue.CompareTo(y.MedianHouseValue);
        }
        public MedianHouseValueComparer(SortDirection sortDirection)
            : base(sortDirection)
        { }
    }

Last, we just need to call the Sort() method, which we can now pass a chain of IChainedComparer objects.

        static void Main(string[] args)
        {
            List<City> cities = LoadCities();
            cities.Sort(new PopulationComparer(SortDirection.Descending)
                .Then(new MedianHouseValueComparer(SortDirection.Ascending)));
            foreach (City city in cities)
                Console.WriteLine(city + Environment.NewLine);
        }

To demonstrate that we are in fact sorting by both population and median house value, I had to tweak the numbers a bit. Note that Chardon and Cleveland both now have the same population value.

SortResult4

Comparison Delegate

.Net also allows you to pass in a delegate that points to a method that can perform the comparison as well. This can be done in a couple of ways. The most straight forward is to simply pass in the name of a method that matches the delegate definition.

        static void Main(string[] args)
        {
            List<City> cities = LoadCities();
            cities.Sort(CompareMedianResidentAge);
            foreach (City city in cities)
                Console.WriteLine(city + Environment.NewLine);
        }
        private static int CompareMedianResidentAge(City x, City y)
        {
            return x.MedianResidentAge.CompareTo(y.MedianResidentAge);
        }

In C# 2.0, you could also do this via an anonymous method.

        static void Main(string[] args)
        {
            List<City> cities = LoadCities();
            cities.Sort(
                delegate(City x, City y)
                {
                  return x.MedianResidentAge.CompareTo(y.MedianResidentAge);
                });
            foreach (City city in cities)
                Console.WriteLine(city + Environment.NewLine);
        }

And I meant for this to be post strictly using what’s available in .Net 2.0, but I think it’s worth noting that this is much cleaner in .Net 3.5 using lambda expressions.

static void Main(string[] args)
{
    List<City> cities = LoadCities();
    cities.Sort((x, y) => x.MedianResidentAge.CompareTo(y.MedianResidentAge));
    foreach (City city in cities)
        Console.WriteLine(city + Environment.NewLine);
}

And all three of these implementations will yield the same result.

SortResult5

How is this easier in .Net 3.5?

With the addition of Linq, we’re given an OrderBy extension method on our list that allows us to sort our list with much less code.

static void Main(string[] args)
{
    IEnumerable<City> cities = LoadCities().OrderBy(x => x.MedianResidentAge);
    foreach (City city in cities)
        Console.WriteLine(city + Environment.NewLine);
}

System.InvalidOperationException: Failed to compare two elements in the array

Note that if you attempt to call Sort() on a collection who’s objects do not implement IComparabe<T> and you don’t pass in an IComparer<T> object or a Comparison delegate, you’ll get the exception: “System.InvalidOperationException: Failed to compare two elements in the array —> System.ArgumentException: At least one object must implement IComparable.”

DotNetKicks Image

Share/Save/Bookmark

Introduction to Closures in C#

Filed Under (.Net) by John Miller on 20-01-2009

Tagged Under : ,

Have you noticed that Resharper will occasionally give you a warning with the message “Access to modified closure” when referencing a variable within an anonymous method or lambda expression? Take the following code snippet for example:

    class Program
    {
        static void Main(string[] args)
        {
            List<Action> actions = new List<Action>();
            for (int i = 1; i <= 10; i++)
                actions.Add(() => Console.WriteLine(i));
            foreach (var action in actions)
                action();
        }
    }

Note that we could also write this example in .Net 2.0 using anonymous methods:

    class Program
    {
        private delegate void Action();
        static void Main(string[] args)
        {
            List<Action> actions = new List<Action>();
            for (int i = 1; i <= 10; i++)
                actions.Add(delegate() { Console.WriteLine(i); });
            foreach (var action in actions)
                action();
        }
    }

At first glance, it looks like the output of this example should count from 1 to 10. Instead, we see this:

Capture

The use of the variable i in the delegate makes use of a language feature called Closure. In this case, the newly created delegate is a closure that is dependant upon the variable i. Through closures, the delegates are tied to the variable i (outside of the delegate) instead of the value that’s in i when the delegate is created. The value of i is incremented via the for loop, so when we actually invoke the collection of delegates, they all print the same value since they are all bound to the same instance of i (which sounds really weird since i is actually a value type, but I believe .Net is actually creating a reference object to represent the variable i within the delegate).

So how do we get around this? The solution is actually pretty easy. Create a local variable in the for loop that is instantiated in each iteration with the current value of i.

    class Program
    {
        static void Main(string[] args)
        {
            List<Action> actions = new List<Action>();
            for (int i = 1; i <= 10; i++)
            {
                int j = i;
                actions.Add(() => Console.WriteLine(j));
            }
            foreach (var action in actions)
                action();
        }
    }

Note that we’re now using the new local variable j within the delegate. Since it’s created within the loop, each delegate will be referencing to their own j variable.Capture2

While the concept of closures was new to me, there are a lot of really good blog articles out there on the subject. Check out the following for more reading:

DotNetKicks Image

Share/Save/Bookmark