15,112,601 members
Articles / Programming Languages / Visual Basic
Article
Posted 23 Jun 2006

159.2K views
50 bookmarked

ArrayList Sort Tutorial

Rate me:
Theory and summary of different sort methods.

Introduction

When working on a project to display and compare source codes, I had to sort objects inside an `ArrayList`. Because I never did this before, my knowledge about this issue has been zero. Therefore, I Google'd and got hundreds of results.

Two good articles I found in The Code Project:

But none of the articles I read contained the complete information I needed. So I spent several hours with many of the query results, and did a couple of tests, until I thought I knew enough.

Then I decided to write a summary here to help other newbies in the sort issue. You should download the demo program so that you can see the results of the operations described here.

Some theory

To sort items, you can use every criteria you want. Besides the 'usual' ones like size, height, weight etc., it may be also brightness, sweetness, age, etc. The internal rules of the sort algorithm is a secret of Bill G. You only have to compare two items. Your decision is given by the integer result of the `Compare` method:

• If you decide for item #2, the result is positive
• If the items are equivalent, the result is 0
• If you decide for item #1, the result is negative

A negative result is not necessarily -1. This brings an easy way to compare two numbers just by returning their difference.

IComparable vs. IComparer

Walking through the Google results, you will find the interface `IComparable`. In the following articles, you will see the `IComparer`.

A printing error?

No! Both interfaces can be used to sort simple types as well as to do very complex multi-step sorts. The differences:

 `IComparer` `IComparable` Implemented in Subclass in `ArrayList` or separate class Item to be sorted Implemented by `IComparer` class Method "`Compare(Object x, Object y)"` Method "`CompareTo (object x)`" Executed by `ArrayList.Sort()` `ArrayList.Sort()` Occurrence Multiple possible Single only International strings Possible Possible

Practically, every result can be reached by use of either of these interfaces.

Practical experiments

Let's now sort some `ArrayList`s. The leftmost `ListBox` is generated by clicking one of the left or middle `Button`s. The `Click` event then copies all `lstOrig` `ListBox` items into a standard `ArrayList`, `sArl`, which does a `sArl.Sort()` and writes the content of `sArl` to the `lstStandard` `Listbox`.

VB
```Private Sub standardSort()
Dim sArl As New ArrayList
lblType.Text = lstOrig.Items.Item(0).GetType().ToString()
' puts all lstOrig items into the ArrayList sArl
origToArray(sArl)
Try
sArl.Sort()
' puts the sorted items into the middle Listbox
arrayToListBox(sArl, lstStandard)
Catch ex As Exception
MsgBox(ex.Message)
End Try
End Sub```

Standard sorting of simple types

As you can see, the first three simple types `Int32`, `String`, and `DateTime` are sorted correctly without coding any sort algorithm anywhere. But when trying the `Color` type, you get an exception. So for this type we have to do some coding:

Standard sorting by using IComparable

We generate a class `MyColor`. Because the structure `System.Color` is `sealed`, we cannot derive directly from `System.Color` and have to define a field `m_color`. The class `MyColor` implements the interface `IComparable`.

To code the implementation, we have to define a compare method, `int IComparable.CompareTo(object x)`.

In this method, the class instance has to compare the 'delivered' object `x` with itself.

C#
```  public class MyColor : IComparable
{
private Color m_color;

#region Constructor
public MyColor(Color myColor)
{
m_color = myColor;
}
#endregion

#region Overrides
///
/// base.ToString() returns "MyColor".
/// So we return m_color.ToString()
///
public override string ToString()
{
return m_color.ToString();
}
#endregion

#region Standard Comparer
///
/// MyColor Standard Compare method: compare names
///
int IComparable.CompareTo(object x)
{
MyColor myX = (MyColor)x;
return string.Compare(this.m_color.Name, myX.m_color.Name);
}
#endregion
}```

This simple user defined class can be sorted by clicking the `MyColor` button. The one and only one method `int IComparable.CompareTo(object x)` has to be defined for the simple user defined class.

Complex sorting by using IComparer

Here, the complexity is not the reason to use the `IComparer` interface. The same could be done by using `IComparable`. It is chosen here do demonstrate how to use this technique.

Generally, there are two ways to use `ArrayList`s with an `IComparer` comparer class.

• Define the comparer as an external class, and call the overloaded `Sort` method with `<standardArrayList>.Sort(<myComparer>)`
• Define the comparer as an internal class of a derived `ArrayList`, and overwrite the `Sort()` method to use the internal comparer.

I prefer the second one because it is more transparent. The user does not necessarily need to know which comparer is used.

Lines with mixed string and numerical contents

If you click the Ifline button, you get the result shown in the sample image above. If you think about the lines as strings, they are sorted correctly. But if you think about them as statements, the logical order is wrong. So we have to define a special sort algorithm for If-Lines.

First, let's create a class `IfLine`, and a class `IfLineList` derived from `ArrayList` and containing `IfLine`s.

To sort, we have to divide the `IfLine` string into three parts:

• The leading `string` part, `StartPart`
• The `int` part, `NumVal`
• The trailing `string` part, `EndPart`

With these parts, we define a three-step compare algorithm:

C#
```private class IfLineSort : IComparer
{
/// <summary>
/// IfLineSort Compare method
/// </summary>
int IComparer.Compare(object x, object y)
{
IfLine src = new IfLine(x.ToString());
IfLine trg = new IfLine(y.ToString());
int result;
// first alpha sort by StartPart
result = string.Compare(src.StartPart, trg.StartPart);
if (result != 0)
return result; // different, compare done
// if result is 0, StartParts are identical.
// then numerical sort by NumVal
result = src.NumVal - trg.NumVal;
if (result != 0)
return result;// different, compare done
// if result is 0, NumVals are identical, too.
// then do final alpha sort by EndParts
result = string.Compare(src.EndPart, trg.EndPart);
return result;
}
}```

Finally, in the `IfLineList` class, we have to override the `Sort()` method to use our `IComparer`, `IfLineSort`:

C#
```/// <summary>
/// IfLineList class standard sort
/// </summary>
public override void Sort()
{
base.Sort(new IfLineSort());
}```

Lines to be sorted dynamically by different aspects

Imagine you have some variable definitions. In your application, you dynamically want to have the choice to sort by name, type, and access level, or a combination of them. (For the sample, we restrict to these three items, no more attributes.)

A definition line looks like: "`private int myValue;`". We create a class, `VarLine`, representing such a definition line, and a class, `VarLineList`, derived from `ArrayList` and containing `VarLine`s. The variable type and access level are described by `enum`s. For the sample, the `enum`s are restricted, too:

C#
```  public enum EType
{
// order is my personal choice
typInt = 1,
typString,
typBool
}
public enum EAccess
// order is 'by publicity'
{
accessPublic = 1,
accessInternal,
accessPrivate
}```

To tell the application about the sort methods and orders, we define:

• `T` for type sorting ascending
• `t` for type sorting descending
• `A` for access sorting descending
• `a` for access sorting descending
• `N` for name sorting descending
• `n` for name sorting descending

Any combination like 'TAN' or 'aTn' is possible. Because the names are unique, it makes no sense to add sort steps behind a 'N' or 'n'.

The 'combination string' is given to the `SortAlgoritm` property of `VarLineList`, which passes it to the `IComparer`, `VarLineSort`, inside the constructor.

C#
```/// <summary>
/// CustVarLineSort Compare method
/// compare step by step, algorithm is stored as symbolic string in m_Algo
/// </summary>
int IComparer.Compare( object x, object y )
{
VarLine src = new VarLine(x.ToString());
VarLine trg = new VarLine(y.ToString());
int result = 0;
int i;
string step;
string stepU;
for (i=0;i<m_Algo.Length;i++)
{
step = m_Algo.Substring(i,1);
stepU = step.ToUpper();
switch (stepU)     {
case "A":
result = src.VarAccess - trg.VarAccess;
break;
case "T":
result = src.VarType - trg.VarType;
break;
case "N":
result = string.Compare(src.VarName, trg.VarName);
break;
}
if (result != 0)
{
if (!step.Equals(stepU))
// descending sort
return -result;
else
return result;
}
}
// if we come here the objects are equivalent
return 0;
}```

Sorting of international strings

Preface: The following text is valid for character sets derived from the Latin alphabet. I do not kow whether it is valid for Arabic, East Asian, Greek, Cyrillic etc. char sets.

It is rather easy to sort strings containing foreign characters: When in the `IComparer` or `IComparable`, you use the `string.Compare` method, and use an overloaded version with additional arguments '`bool ignoreCase`' and '`System.Globalization.CultureInfo culture`'.

C#
```/// <summary>
/// StringList Compare method
/// </summary>
int IComparer.Compare(object x, object y)
{
string src = x.ToString();
string trg = y.ToString();
int result = string.Compare(src, trg, m_ignoreCase, m_Culture);
return result;
}```

Here, you can define the language (by defining a '`Culture`') to sort strings. You have two choices:

• A string with the culture name like 'en-US'
• A number with the culture identifier like '0x0409'

A complete list of all identifiers can be found at MSDN.

Standard sorting

If you try the 'Intl. Text 1', you will (depending on your language) usually not see any difference between 'invariant', 'local PC', and 'en-US'. Only the 'da-DK' will bring a difference. The reason is that in most languages, special characters are sorted in the neighborhood of the characters they have correlation to. However, in Danish language, those characters are sorted as symbols, that means behind the standard characters.

As a conclusion:

Using the `CultureInfo` for sorting strings, is, in most cases, not necessary. Of course, to use it will not do any harm.

Secondary algorithm sorting

Some languages have two different official sorting algorithms.

Example: The German language has:

• A Dictionary sort
• A Phone Book sort

The difference is in treating the umlauts (which are a, u, o with two dots on top, they look like ä Ä ö Ö ü Ü).

In Dictionary Sort (which is standard), the umlauts are sorted just behind the corresponding vowel:

`T, U, Ü, V ...`

In Phone Book sort, umlauts are replaced by the corresponding vowel with an additional 'e':

`Ua, Ub, Uc, Ud, Ue, Ü, Uf ...`

To perform the secondary sort algorithm, you have to define a special culture identifier (unfortunately, a special culture name does not exist). For the German language, it is 0x10407 instead of the standard 0x0407.

Try 'Intl. Text 2' now and see the results.

Here is a complete list of all secondary algorithms: MSDN.

Remarks

In the .NET framework, there exists a class '`CaseInsensitiveComparer`' which I intentionally did not use or mention. In my opinion, the `string.Compare` method offers every possibility which is needed.

History

• 17-Jun-2006 - First version published.

Share

 Web Developer Germany
Peter Schlang, working with computers since 1974
Developing mainly for newspapers since 1981, first as employee of ATEX
Freelancer since 1987
Preferred language is VB: Starting with VB 1.0 and VBDOS, up to VB.NET