Introduction
This is the first in (hopefully) a several part series on data structures. In each part we'll
introduce a data structure, discuss the basic idea, the pros, the cons, and the pitfalls.
Background
A Singly Linked List is one of the first complex structures taught. There are gajillion
implementations of it in pretty much every language (with the concept of pointers) known to man.
The concept is very simply. The list is made up of nodes. Each node tracks one piece of data.
It also keeps a reference to the next node (if there is one). You can traverse down the list
one way, but not the other because we don't have a pointer to a previous node (that would be a
Doubly Linked List - Coming Soon).
The Code
Before we start coding, we should establish a couple implementation details. First, I like
having a "root" list node. You can't ever really get back to the root from any other node
(because singly linked lists only go one way), but it keeps some aspects of the code cleaner
(like dealing with what happens when the user wants to remove the first item of the list). Also,
I don't really want anybody to be able to create, or even see any of the intermediate nodes.
For all they care, this is a black box data structure. This also avoids some nasty problems
like making circular lists. Finally, there's a little boundary case we'll get to in a few
paragraphs. When you append 2 linked lists together, you end up with a linked list which points
half way into another linked list. I think this is OK
TM, so
it'll stay that way. You may not. If this is the case, you can figure out how to deal with it.
I'll give out some pointers when we get there.
So, let's start off with a little class stub. This just gives us our simple class with a couple
private member variables which hold the data and the pointer to the next node.
using System;
namespace System.Collections.Implementations
{
public class SinglyLinkedList
{
#region Private Variables
private SinglyLinkedList _Next = null;
private object _Item = null;
#endregion
}
}
So, in case you couldn't guess, _Next
is the next node and _Item
is the value of this node. Wow. So let's add a couple constructors.
#region Constructors
public SinglyLinkedList()
{
}
protected SinglyLinkedList(object Item)
{
_Item = Item;
}
#endregion
You'll find that I like to use a lot of #region
tags. I find it keeps code
cleaner. So we've got our basic SinglyLinkedList
constructor which doesn't really
do anything. This is used when creating a new SinglyLinkedList
object.
The other constructor is used by the class itself and can not be used by users (that's why
it's private). It creates a new node within a list. Remember our ground rules for the
implementation up above? One of our tenets was that this be a black box to users. They are not
allowed to see these internal nodes. Thus, the constructor is private.
Now we need to actually be able to add data to the list.
protected SinglyLinkedList Last
{
get
{
SinglyLinkedList Target;
for (Target = this; Target._Next != null; Target = Target._Next) { }
return Target;
}
}
public void Add(object Item)
{
Last._Next = new SinglyLinkedList(Item);
}
public void Append(SinglyLinkedList OtherList)
{
Last._Next = OtherList._Next;
}
I threw two routines for adding items in here along with a property which supports both -
Last
. The property is pretty simple. It just loops through the nodes until it
finds the last one.
With that in place, adding a new item is child's play. We just find the last node using
Last
and set its' next item to be a new node in our list in Add
.
The Append
routine looks simple, but it has a little trick in it. We've want to
stick two linked lists together. It'd be easy enough to list all the items in the second list
and add them one by one (or it will be once we add routines to get items), but that would take a
long time for a long list. That is what's called an O(n²)
("Order of n squared")
operation. If it takes t
time to add one item, it'll take t * m * n
time to copy a second list over where m
and n
are the number of items
in the linked lists.
Instead what we do is just take our last item and set it's next pointer to be the second list's
first item with data. Done. This operation is O(n)
("Order of n") meaning that
it takes the same amount of time no matter how many items are in the second list. The time only
depends on the number of items in the first list (which is much nicer). There are ways to get this
down to O(1)
("Order of 1") meaning it takes the same time no matter how many items
there are, but it means tracking some extra stuff and except in rare cases, it's not really worth
it.
private SinglyLinkedList FindNode(int Index)
{
if (Index < 0)
throw new ArgumentOutOfRangeException("Index", Index,
"Index must be greater than or equal to 0");
SinglyLinkedList Target;
for (Target = this; Index > 0; Index--)
if (Target._Next != null)
Target = Target._Next;
else
throw new ArgumentOutOfRangeException("Index", Index,
"Index greater than the number of items");
return Target;
}
public void Insert(object Item, int Index)
{
SinglyLinkedList PreviousNode = FindNode(Index);
SinglyLinkedList NextItem = new SinglyLinkedList(Item);
NextItem._Next = PreviousNode._Next;
PreviousNode._Next = NextItem;
}
public void Remove(int Index)
{
SinglyLinkedList PreviousNode = FindNode(Index);
if (PreviousNode._Next == null)
throw new ArgumentOutOfRangeException("Index", Index,
"Index greater than the number of items");
else
PreviousNode._Next = PreviousNode._Next._Next;
}
The Insert
and Remove
routines are decidedly more complex. Both
rely on FindNode
to find the node they are working on. That routine is very simple.
Just loop through a counter until you get to the right node in the list.
The Insert
routine creates a new node for the list. Then it takes the node from
FindNode
and sets the new node's next node to be that of the node found. Then it
points the found node's next node to the new node. What we essentially did was wedge a new node
in that list.
The Remove
routine does the exact opposite. It finds the node then tells it to
skip over the next node by pointing directly to the next node's next node.
At this point we're done with routines that modify the data and all the ugly routines for
reading the data. All we have left are a few accessors...
public object Get(int Index)
{
return FindNode(Index)._Item;
}
public object this[int Index]
{
get
{
return FindNode(Index)._Item;
}
set
{
Insert(value, Index);
}
}
public int Length
{
get
{
int Counter = 0;
for (SinglyLinkedList Target = this; Target._Next != null;
Target = Target._Next)
{
Counter++;
}
return Counter;
}
}
We have to include a Get
routine so people can look at the data, but it's super
simple. The this
accessor just lets people get or change items based on the index.
This is just a handy convention for people to use, since they have routines for it otherwise. The
Length
accessor is a nice to have if people are going to loop through things. We could
add a Replace
routine which is simple, but whatever.
One Step Further
These days everyone wants everything handed to them on a platter. So, we're going to add
the IEnumerable
and IEnumerator
interfaces. Besides, this makes a nice
example of how to use them. First we need to add the interfaces to the class.
public class SinglyLinkedList : IEnumerable, IEnumerator
{
...
}
When you add each one, Visual Studio should be kind enough to create stubs for each routine for
you. So lets fill those stubs in:
#region Enumeration Members
public IEnumerator GetEnumerator()
{
Reset();
return this;
}
private SinglyLinkedList _EnumerationTarget = null;
public void Reset()
{
_EnumerationTarget = this;
}
public object Current
{
get
{
return _EnumerationTarget._Item;
}
}
public bool MoveNext()
{
_EnumerationTarget = _EnumerationTarget._Next;
return _EnumerationTarget != null;
}
#endregion
There's a whole bunch of stuff here, so lets take a step back. When you actually use something
like this...
foreach (object Item in List)
{
...
}
...this actually gets compiled to look something like...
IEnumerator foo = List.GetEnumerator();
foo.Reset();
foo.MoveNext();
for (object Item = foo.Current(); foo.MoveNext() != null; Item = foo.Current())
{
...
}
So, we've got 4 routines there. GetEnumerator
is required by the
IEnumerable
class. It allows you to return an object which can enumerate through
your class. However, I believe in minimalism, so I'm reusing my SinglyLinkedList
class as the enumerator. So this routine just resets the state of the enumerator and returns
its own class.
Reset
is a simple routine, just used to move to the beginning of the list.
Nothing big. So we use an internal variable to point to the root of our linked list.
Current
gets the value of the object at the current location in the list.
MoveNext
moves the pointer to the next item and returns a boolean detailing whether
we're at the end of the list.
That's all there is to it.
Pros and Cons
The biggest pro to a Singly Linked List is the simplicity. I wrote this whole article including
testing the code in a couple of hours. If you just need to keep track of a smallish list of things,
this is the way to go. You can also extend this code very simply to create structures like stacks
and queues...
public class Stack : SinglyLinkedList
{
public void Push(object Item)
{
Add(Item);
}
public object Pop()
{
if (_Next == null)
throw new Exception("Stack is Empty");
object Item = Last;
Remove(Length - 1);
return Item;
}
public object Peek()
{
if (_Next == null)
throw new Exception("Stack is Empty");
return Last._Item;
}
}
public class Queue : SinglyLinkedList
{
public void Enqueue(object Item)
{
Add(Item);
}
public object Dequeue()
{
if (_Next == null)
throw new Exception("Queue is Empty");
object Item = this[0];
Remove(0);
return Item;
}
public object Peek()
{
if (_Next == null)
throw new Exception("Queue is Empty");
return this[0];
}
}
The cons are pretty straightforward. You can't do anything really tricky with this data.
You can sort but it's a huge pain to do it with simple sorts. Ridiculously complex with fast
sorting algorithms. It's also a pretty slow structure. Most operations require iterating through
all nodes which is fine when you have a dozen or so items, but pretty sucky when you get a long
list.
Stay tuned for the next installment - Doubly Linked Lists :-
- Same Data Structure Time.
- Same Data Structure Channel
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.