
Fluid IEnumerable<T>.Except()

30 Nov 2008 | CPOL | 2 min read
This article presents a solution to the problem of performing flexible Set exclusion (A-B) on an IEnumerable while maintaining a fluid programming style.

Introduction

It is often useful to be able to find the items in one set that are not in another set.

A - B

This article presents a method for doing so in C# while retaining a fluid programming style.

The problem

Set subtraction is a common coding problem. I've done this many times in SQL Server with queries similar to the following:

select u.*
from Users u
left join Administrators a on u.UserId = a.UserId
where a.AdministratorId is null

LINQ provides the Except() extension method on IEnumerable<T>, which offers the same functionality:

C#
IEnumerable<string> result = users.Except(administrators);

Unfortunately, to compare items on anything other than their default equality, you have to provide an IEqualityComparer<T>, which I will not be covering in this article. Fortunately, we can also implement the Except concept using the join syntax as follows:

C#
IEnumerable<string> result = 
    from item in users
    join otherItem in administrators on item equals otherItem into tempItems
    from temp in tempItems.DefaultIfEmpty()
    where temp == null
    select item;

This looks a lot like the SQL implementation. However, the notation gets messy fast when building complex queries, and it can result in code that is difficult to maintain. In this article, I'll build upon the join implementation to get more flexibility.
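To make the join form concrete, here is a minimal runnable sketch of the same query wrapped in a small console program. The class name, method name, and sample data are made up for illustration. It keeps an item only when its match group is empty, which for reference types shows up as a null after DefaultIfEmpty().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinExclusionDemo
{
    // Returns the users that have no match in administrators (A - B).
    // The group join pairs each user with its matching administrators;
    // DefaultIfEmpty() turns an empty match group into a single null.
    public static List<string> UsersWhoAreNotAdministrators(
        IEnumerable<string> users, IEnumerable<string> administrators)
    {
        IEnumerable<string> result =
            from item in users
            join otherItem in administrators on item equals otherItem into tempItems
            from temp in tempItems.DefaultIfEmpty()
            where temp == null // null means "no matching administrator"
            select item;
        return result.ToList();
    }

    public static void Main()
    {
        // Sample data is hypothetical.
        var users = new List<string> { "maria", "ziyi", "altair" };
        var administrators = new List<string> { "ziyi", "root" };

        foreach (string name in UsersWhoAreNotAdministrators(users, administrators))
        {
            Console.WriteLine(name); // prints "maria", then "altair"
        }
    }
}
```

Unlike the built-in Except(), which returns a distinct set, this query yields one result per unmatched source item, so duplicates in the source survive.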

One solution

First, build an Extension method to hide the complexity of the join syntax:

C#
[NotNull]
public static IEnumerable<T> Except<T>([NotNull] this IEnumerable<T> items,
                                       [CanBeNull] IEnumerable<T> other)
{
    // ... argument checks

    return from item in items
           join otherItem in other on item equals otherItem into tempItems
           where !tempItems.Any() // keep only items with no match in other
           select item;
}

Note that the where clause no longer relies on a plain null comparison, so the extension method works whether T is a struct or a class. Also note that the method returns an IEnumerable<T>, so you can chain the result into another LINQ method fluidly; for example:

C#
users.Except(administrators).ToList().ForEach(Console.WriteLine);
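For reference, a self-contained sketch of such an extension method with the argument checks filled in might look like the following. Several things here are assumptions rather than the article's code: the method name ExceptUsingJoin is hypothetical (chosen so it cannot clash with the built-in Enumerable.Except), a null other is treated as an empty set, and the where clause tests the match group for emptiness, a variation that also behaves correctly when a value-type item equals default(T), such as 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExclusionSketch
{
    // Hypothetical name; avoids ambiguity with Enumerable.Except.
    public static IEnumerable<T> ExceptUsingJoin<T>(
        this IEnumerable<T> items, IEnumerable<T> other)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        // Assumption: a null 'other' means "nothing to exclude".
        if (other == null) return items;

        return from item in items
               join otherItem in other on item equals otherItem into matches
               where !matches.Any() // no match found in 'other'
               select item;
    }
}

public static class Program
{
    public static void Main()
    {
        // 0 is default(int); it appears in both sets and must be excluded.
        var users = new List<int> { 0, 1, 2, 3 };
        var administrators = new List<int> { 0, 2 };

        // Chains fluidly into further calls; prints 1, then 3.
        users.ExceptUsingJoin(administrators).ToList().ForEach(Console.WriteLine);
    }
}
```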

Here are the more interesting NUnit tests:

C#
[TestFixture]
public class When_asked_to_get_items_from_a_set_that_are_not_in_another_set
{
    [Test]
    public void Should_return_only_those_items_that_are_not_in_the_other_set_where_T_is_a_class()
    {
        List<string> input = new List<string> {"cat", "ran", "fast"};
        List<string> other = new List<string> {"dog", "ran", "too", "slow"};
        IEnumerable<string> result = input.Except(other);
        Assert.IsNotNull(result, "result should never be null");
        Assert.AreEqual(2, result.Count(), "count does not match");
        Assert.AreEqual("cat", result.First(), "first item in result is incorrect");
        Assert.AreEqual("fast", result.Last(), "last item in result is incorrect");
    }

    [Test]
    public void Should_return_only_those_items_that_are_not_in_the_other_set_where_T_is_a_struct()
    {
        List<int> input = new List<int> {1, 2, 3};
        List<int> other = new List<int> {0, 2, 4, 6};
        IEnumerable<int> result = input.Except(other);
        Assert.IsNotNull(result, "result should never be null");
        Assert.AreEqual(2, result.Count(), "count does not match");
        Assert.IsTrue(result.All(item => item % 2 != 0), "all remaining items should be odd");
        Assert.AreEqual(1, result.First(), "first item in result is incorrect");
        Assert.AreEqual(3, result.Last(), "last item in result is incorrect");
    }
}

Providing a comparison method

Next, we'll create an overload that takes a lambda expression that extracts the key used to compare the items in the two sets. This allows you to compare on something other than the items' natural equality.

C#
[NotNull]
public static IEnumerable<T> Except<T, TKey>([NotNull] this IEnumerable<T> items,
                                             [CanBeNull] IEnumerable<T> other,
                                             [NotNull] Func<T, TKey> getKey)
{
    // ... argument checks

    return from item in items
           join otherItem in other on getKey(item) equals getKey(otherItem) into tempItems
           where !tempItems.Any() // keep only items with no match in other
           select item;
}

The overloaded method can be tested with:

C#
public class TestItem
{
    public string Name { get; set; }
}

[Test]
public void Should_return_only_those_items_that_are_not_in_the_other_set()
{
    List<TestItem> input = new List<TestItem>
                               {
                                   new TestItem {Name = "cat"},
                                   new TestItem {Name = "ran"},
                                   new TestItem {Name = "fast"}
                               };
    List<TestItem> other = new List<TestItem>
                               {
                                   new TestItem {Name = "dog"},
                                   new TestItem {Name = "ran"},
                                   new TestItem {Name = "too"},
                                   new TestItem {Name = "slow"}
                               };
    IEnumerable<TestItem> result = input.Except(other, item => item.Name);
    Assert.IsNotNull(result, "result should never be null");
    Assert.AreEqual(2, result.Count(), "count does not match");
    Assert.AreEqual("cat", result.First().Name, "first item in result is incorrect");
    Assert.AreEqual("fast", result.Last().Name, "last item in result is incorrect");
}
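Outside of a test fixture, the key-selector overload can be exercised with a small console program. The sketch below is self-contained and makes the same assumptions as before: a null other is treated as an empty set, and the match group is tested for emptiness. The sample words and the case-folding key are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyedExclusionSketch
{
    // Same shape as the overload above; the lambda parameter means this
    // call cannot be confused with Enumerable.Except's comparer overload.
    public static IEnumerable<T> Except<T, TKey>(
        this IEnumerable<T> items, IEnumerable<T> other, Func<T, TKey> getKey)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        if (getKey == null) throw new ArgumentNullException(nameof(getKey));
        // Assumption: a null 'other' means "nothing to exclude".
        if (other == null) return items;

        return from item in items
               join otherItem in other on getKey(item) equals getKey(otherItem) into matches
               where !matches.Any() // no item in 'other' shares this key
               select item;
    }
}

public static class Program
{
    public static void Main()
    {
        var input = new List<string> { "Cat", "RAN", "fast" };
        var other = new List<string> { "dog", "ran", "too", "slow" };

        // Compare on a lower-cased key, so "RAN" matches "ran".
        IEnumerable<string> result = input.Except(other, word => word.ToLowerInvariant());
        Console.WriteLine(string.Join(", ", result)); // prints "Cat, fast"
    }
}
```

The returned items are the original ones, not their keys, so the casing of "Cat" is preserved.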

Exclusion with different types

The last and most flexible overload we'll add allows the sets to contain different types. For example, you might have users in the main set, but only the IDs of the ones that are administrators in the comparison set, and you might want to be able to get the users that are not administrators. This overload provides that capability:

C#
[NotNull]
public static IEnumerable<T> Except<T, TOther, TKey>(
                                       [NotNull] this IEnumerable<T> items,
                                       [CanBeNull] IEnumerable<TOther> other,
                                       [NotNull] Func<T, TKey> getItemKey,
                                       [NotNull] Func<TOther, TKey> getOtherKey)
{
    // ... argument checks

    return from item in items
       join otherItem in other on getItemKey(item) equals getOtherKey(otherItem) into tempItems
       where !tempItems.Any() // keep only items with no match in other
       select item;
}

Test usage is as follows:

C#
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
}

[Test]
public void Should_return_only_those_items_that_are_not_in_the_other_set()
{
    List<User> users = new List<User>
                               {
                                   new User {Id = 1, Name = "Maria"},
                                   new User {Id = 2, Name = "ZiYi"},
                                   new User {Id = 3, Name = "Altair"}
                               };
    List<int> administratorIds = new List<int> {2, 4, 6};
    IEnumerable<User> result = users.Except(administratorIds, 
                            item => item.Id, administratorId => administratorId);
    Assert.IsNotNull(result, "result should never be null");
    Assert.AreEqual(2, result.Count(), "count does not match");
    Assert.AreEqual("Maria", result.First().Name, "first item in result is incorrect");
    Assert.AreEqual("Altair", result.Last().Name, "last item in result is incorrect");
}
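As a standalone illustration, here is a sketch of the mixed-type overload in a console program. It is written under assumptions: the argument checks and the null handling of other are guesses at what the elided checks might do, the emptiness test on the match group is a variation on the where clause, and the sample users are hypothetical. The data deliberately includes an Id of 0 to show that a key equal to default(int) is still excluded when it appears in the comparison set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public int Id { get; set; }
    public string Name { get; set; } = string.Empty;
}

public static class MixedTypeExclusionSketch
{
    public static IEnumerable<T> Except<T, TOther, TKey>(
        this IEnumerable<T> items,
        IEnumerable<TOther> other,
        Func<T, TKey> getItemKey,
        Func<TOther, TKey> getOtherKey)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        if (getItemKey == null) throw new ArgumentNullException(nameof(getItemKey));
        if (getOtherKey == null) throw new ArgumentNullException(nameof(getOtherKey));
        // Assumption: a null 'other' means "nothing to exclude".
        if (other == null) return items;

        return from item in items
               join otherItem in other on getItemKey(item) equals getOtherKey(otherItem) into matches
               where !matches.Any() // no element of 'other' has this key
               select item;
    }
}

public static class Program
{
    public static void Main()
    {
        var users = new List<User>
        {
            new User { Id = 0, Name = "Root" },
            new User { Id = 1, Name = "Maria" },
            new User { Id = 2, Name = "ZiYi" }
        };
        var administratorIds = new List<int> { 0, 2 };

        IEnumerable<User> result = users.Except(administratorIds, user => user.Id, id => id);
        foreach (User user in result)
        {
            Console.WriteLine(user.Name); // prints "Maria"
        }
    }
}
```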

History

  • 2008-11-30 - Initial CodeProject publication.
  • 2008-11-23 - Initial blog entry.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States
I am a polyglot programmer with more than 15 years of professional programming experience and author of Genetic Algorithms with Python. When learning a new programming language, I start with a familiar problem and try to learn enough of the new language to solve it. For me, writing a genetic engine is that familiar problem. Why a genetic engine? For one thing, it is a project where I can explore interesting puzzles, and where even a child's game like Tic-Tac-Toe can be viewed on a whole new level. Also, I can select increasingly complex puzzles to drive evolution in the capabilities of the engine. This allows me to discover the expressiveness of the language, the power of its tool chain, and the size of its development community as I work through the idiosyncrasies of the language.
