Using the System.Numerics.BigInteger class for computer vision and binary morphology operations in C# 4.0

gvessere

4.91/5 (18 votes)

Aug 25, 2010

CPOL

16 min read

77918

1847

A simple framework to perform morphology operations on binary images.

Introduction

Morphology operators are commonly used in computer vision in order to carry out basic object classification and recognition tasks. Here, I describe an efficient way to implement various binary morphology operators in C#. In the general sense, morphology operations are operations that transform a gray scale or binary image into a new image after applying a set of rules dependent on the direct neighborhood of every pixel. Two very common morphology operators are the erosion and dilation operators in which every pixel being processed in an image is replaced respectively by the minimum or by the maximum of pixels in its neighborhood. The neighborhood is defined by a set of pixels surrounding the pixel being operated on using a bit mask. Erosion dilation and, more generally, morphology masks can take any possible shape or size. The simplest mask which we are going to consider in this article consists of a 5 pixel cross pattern as shown in the figure below:

The simplest morphology mask.

Applying morphology operators to an image allows you to perform a myriad of elementary tasks that are crucial in computer vision, for instance: finding a background image for an image with objects of the same size as the mask, delineating object boundaries (even when they are in contact in the original image), turning pixels on or off depending on their number of neighbors, getting a rough measure of perimeter of objects in a binary image (this requires the objects to be labeled using another algorithm), and the long list goes on... In the following article, I will show examples of applying erosion and dilation operations using the previously described mask on the following image:

Original.

Eroded original. One layer of pixels is removed from the objects' periphery.

Dilated original. One layer of pixels is added and the objects appear thicker.

Erosion followed by dilation (also called image opening) applied to the original image. Thin features on the original are erased by erosion, and the global shape of objects is restored with dilation.

The morphology framework I present:

Makes use of C# 4.0's BigInteger class for a concise and efficient implementation
Allows specifying any kind of mask shape
Implements erosion and dilation operators for the simple mask presented above
Provides other methods to perform perimeter estimation, majority and minority operations

Background

Naïve implementations of erosion and dilation

Morphology operations are most often used in computing intensive tasks such as computer vision. It is therefore important for the implementation to be efficient.

When I started implementing erosion and dilation in C#, I thought I could implement an approximately linear time algorithm through a sliding min or max window and that that would take care of the speed requirement. Linear time with a sliding window?! I hear you say. Yes. Without delving too much into the details, in short, the idea is to make a horizontal pass and a vertical pass on the original image. The window considered slides along the rows or columns of the image, and as we do so, we keep track of the location of the min value through an index or pointer in this window. As we slide, we have a new value entering the scope of the window and a value exiting the window. If the new value is inferior or equal to the current min, we update the min pointer to this new value. If the value pointed to by the min pointer exits the window, we rescan the window entirely for a new min value and pointer.

It turned out that this approach was far slower than I would have liked, so I opted to write a much more optimized version of the same idea using pointers and the C# unsafe statement. Again, the outcome was too slow.

Another drawback of this approach was that it was severely limiting the kinds of operators I could use: since all the processing was performed using the min of a sliding window, I was compelled to use only morphology operators that were separable into horizontal and vertical components composed only of contiguous pixels.

Back on track

I soon ran into a paper written in 1992 which put me back on track: the authors of [1] described how to efficiently implement morphology operators such as erosion and dilation using bitmapped binary images, the Hit or Miss transform, and the logarithmic decomposition of operator masks (which we will not cover in this article). In this article, I show how to implement the methodology described in this paper using a clever trick which requires less coding on our side.

The Hit or Miss transform

Mathematical morphology is the field of mathematics which formally describes morphological operators and provides a framework for their manipulation. The Hit or Miss transform is one of these operators, and it can be used to describe erosion and dilation.

In short, it is written as follows: X (S,T), where X is our input image, S is a hit mask, T is a miss mask; the hit and miss masks are "applied" to every pixel in the image X to return our processed image. If r is the resulting value of a pixel after applying a Hit or Miss transform, p₁,..,p_n are the values of the pixels under the hit mask where the hit mask is non zero, and q₁,...,q_n are the values of the pixels under the miss mask where the miss mask is non zero, then we have the following relation:

R = X (S,T) ;=> r=p1 & ... & pn & ~q1 & ... & ~qn

where '~' is the logical not operator and is the Hit-or-Miss operator.

This can be translated into spoken language form as follows: In order for the Hit-or-Miss operator to return a positive outcome, all hit pixels must be turned on and all miss pixels must be turned off, while ignoring the values at the zero valued pixels in the masks.

Mathematical morphology describes erosion (the operator) and dilation (the operator) in terms of the Hit or Miss transform, as follows:

X Sh = X (S,)

X Sh = (X (,S))^c

is a mask consisting of only zeros, and ^c represents the complement operator.

After doing the math using this framework, we find that:

Erosion: r= p₁&...&p_n

Dilation: r= ~( ~q₁ & ... & ~q_n) = q₁|...|q_n

where r is the pixel in the output image, and p_is and q_is are the pixel values in the original image where the corresponding mask pixels are set.

In fact, this result can also be found much more simply by considering the fact that in binary images, the range of possible values for a pixel is 0 or 1, and the min and the max operators can be replaced with the & and | bit operators. This mathematical framework is useful, however, to extend the simple operations I just described to more complicated mask shapes (using, for instance, logarithmic decompositions of masks (also called structuring elements) as described in [1]) or other morphology operators.

An obvious optimization

Now that we realized that the Hit-or-Miss transform can express erosion and dilation operations and that these are expressed only in terms of the & and | operators, an obvious optimization is to pack our images into bit arrays. In this way, we get the immediate benefit of speeding up our erosion and dilation operations by a factor of 32. As we will see later in this article, C# 4.0 provides a neat data structure which will allow us to handle those images in packed format in a fairly straightforward way.

Shifting

Applying the transforms requires access to the value of the pixels in the neighborhood of the pixel currently being processed; this can be done by generating a new image that has been shifted by one or more pixels up or down and/or one or more pixels left or right - depending on the mask we want to apply - and then applying pixel wise operations between the newly generated images and/or the original image.

Shifting left and right can be handled by using the left shift or right shift operators on our packed format image to shift the bits in the byte array accordingly. Shifting up or down by one pixel requires shifting the image data by an amount of bits equal to the size of a line in the image.

Some care must be taken to make sure the image is padded with zeros on the left or the right of every line, because pixels at the end of a line that are right shifted will end up at the beginning of the next line. Therefore, we will make sure to pad each line of the image left or right by a number of pixels that is at least as big as the expected shift as required by the bit mask. In the specific implementation presented in the article, we will see that this only applies to the left or right side of a scan line in the image, but not to the top or the bottom of a column.

Shifting is a quick operation, and is handled using a single instruction on modern CPUs. When using the C# >> or << operators, the C# compiler makes sure to use the hardware shift instruction to make things as speedy as possible. However, the scope of the shift is limited to the word length: 32 or 64 bit. When applying a shift to a word containing our image data, an empty (zero valued) bit comes in from one side, and a bit from our image (zero or one valued) exits from the other side. This means we have to do extra bookkeeping work in order to make sure that a bit exiting a 32 or 64 bit word is fed into the next 32 or 64 bit word while shifting the data contained in a line. If this is not done, every 32^nd or 64^th pixel in our image will have been lost.

Using the code

The BigInteger class: The perfect candidate for the job

Fortunately, we will not have to do any of the book keeping work described in the previous section. .NET 4.0 provides an efficient implementation for an integer data structure with potentially infinite precision (provided there is enough address space). The BigInteger class is the perfect candidate for the job: It supports all the typical bitwise operations that we would expect from an integer type: &, |, ^, ~, <<, >>, and can store data as big as an image into one BigInteger. Additionally, the implementation of BigInteger is efficient even when shifting by a large amount of bits (e.g., 2048).

The implementation

In order to support different kinds of mask shapes, we first preprocess our morphology operator mask so we can access it generically and efficiently.

The HitOrMissMask class will take an array of sbyte containing -1s, 0s, and 1s. In this mask, miss pixels are represented by -1s, and hit pixels are represented by 1s. As a result, we see that a pixel is either hit or miss, and as one would expect, cannot be both: surely we cannot expect a pixel to be both on and off at the same time. The HitOrMissMask class will accept any mask, and take care of compressing this information into a format that ignores the zero valued pixels and that can be read by the BinaryMorphology class.

An erosion mask for the mask represented in the beginning of this article would look like this:

sbyte[] mask = new sbyte[]
            { 0, 1, 0,
              1, 1, 1,
              0, 1, 0 };

If used to construct an instance of the HitOrMissMask class, it is processed in the constructor as follows:

public HitOrMissMask(sbyte[] hitmiss, int width, int xCenter, int yCenter)
{
    Width = width;
    Height = hitmiss.Length / width;
 
    // count the number of non zero pixels in the structuring element
    int length = hitmiss.Where(p => p != 0).Count();
    HitOrMissX = new sbyte[length];
    HitOrMissY = new sbyte[length];
    HitOrMiss = new sbyte[length];
 
 
    // convert to something we can use more efficiently
    int c=0;
    for (int i = 0; i < hitmiss.Length; i++)
    {
        if (hitmiss[i] != 0)
        {
            sbyte x = (sbyte) (i / width - xCenter);
            sbyte y = (sbyte) (i % width - yCenter);
 
            HitOrMissX[c] = x;
            HitOrMissY[c] = y;
            HitOrMiss[c] = hitmiss[i];
            c++;
        }
    }
}

The outcome is stored inside three member arrays in the BinaryMorphology class: HitOrMissX, HitOrMissY, which contain the locations, relative to the center pixel you specify, of the pixels to test for, and the HitOrMiss array which contains what values to test for: on or off.

In this example, the following code produces the three arrays, with the values shown in the figure below:

sbyte[] mask = new sbyte[]
{ 0, 1, 0,
  1, 1, 1,
  0, 1, 0 };
HitOrMissMask hmMask = new HitOrMissMask(mask, 3, 1, 1);

Horizontal padding

This class instance is then used to perform our morphology operations using the BinaryMorphology class. First, we have to create an instance of the BinaryMorphology class by telling it which image we want to process, and what padding we will need in order to process it properly. The required padding depends solely on the size of the mask, and should be equal to the max in either horizontal or vertical direction of about half the size of the mask. In the case of a 3x3 mask, this should correspond to (3-1)/2=1 pixel of padding around the image.

Vertical padding

No padding is necessary for the top and bottom of the image as the << and >> operators on the BigInteger object will take care of that for us: as you recall from the previous section on shifting, << and >> will allow shifting of the bits in an integer, and let in zero valued bits on the least significant or the most significant bits of the integer, depending on the direction of the shift. Shifting an image of width 1024 up by one line will equate to letting in 1024 zero valued bits on the least significant side of the BigInteger.

The constructor of the BinaryMorphology class will take a bool[] image or a byte[] image with a threshold value, its width, and the amount of padding required to apply the desired morphology operations and store it into the PackedImage member of the BinaryMorphology class as a BigInteger object.

/// <summary>
/// constructor for BinaryMorphology class
/// </summary>
/// <param name="image">the image to process</param>
/// <param name="width">the width of the provided image</param>
/// <param name="padding">the padding to apply when
/// storing this image, usually half the length of the structuring element</param>
public BinaryMorphology(bool[] image, int width, int padding)
{
    Width = width + padding;
    int height = (image.Length / width);
    Height = height;
    Padding = padding;
    // last padding is for right padding the last line
    BinaryInput = new bool[Width * Height + Padding];
 
    // copy the data in the new array with necessary amount of left padding at every line
    for (int i = 0; i < height; i++)
        Buffer.BlockCopy(image, i * width, BinaryInput, 
             i * Width + Padding, width); // one bool is encoded as 1 byte in c#
 
    // convert to packed format
    BitArray packed = new BitArray(BinaryInput);
 
    // use intermediate byte[]
    byte[] scratch = new byte[(BinaryInput.Length >\> 3) + 1];
    packed.CopyTo(scratch, 0);
 
    // create BigInteger Image
    PackedImage = new BigInteger(scratch);
}

/// <summary>
/// constructor for BinaryMorphology class
/// </summary>
/// <param name="image">the image to process</param>
/// <param name="threshold">the threshold level</param>
/// <param name="width">the width of the provided image</param>
/// <param name="padding">the padding to apply when storing
///     this image, usually half the length of the structuring element</param>
public BinaryMorphology(byte[] image, byte threshold, int width, int padding)
    : this(image.Select(b => b > threshold).ToArray(), width, padding)
{
}

Create the structures

When padding the image, we only take care of padding the left side of each line. That is sufficient as in this case, the pixel to the left of every line in the image is also the same pixel as the pixel to the right of the line previous to that. Also, for the same reason, we need to explicitly pad on the right the last line in the image.

The data is copied line by line, from a bool[] image to a new padded bool[] image, then, a BitArray is used to pack the data into a format where one bool is represented as one bit. The data is finally copied to a byte array which serves as the constructor for the BigInteger object.

Given a 1024 by 1024 bool[] image, the following lines of code creates for us a BinaryMorphology class instance which we can use to perform morphology operations using the HitOrMissMask we previously created:

sbyte[] mask = new sbyte[]
{ 0, 1, 0,
  1, 1, 1,
  0, 1, 0 };
HitOrMissMask hmMask = new HitOrMissMask(mask, 3, 1, 1);
BinaryMorphology morph = new BinaryMorphology(image, 1024, 1);

Let's do some work

Now that the image is converted to the desired packed format, we can apply our morphology operators: the core of the work is performed using the following code segment that essentially applies the Hit-or-Miss transform formula given above and the HitOrMissMask class instance that I just described.

public void ApplyHitOrMissTransform(HitOrMissMask hmMask, 
            ref BigInteger input, out BigInteger output)
{
    BigInteger scratchImageHit = -1;
    BigInteger scratchImageMiss = 0;
            
    for (int i = 0; i < hmMask.HitOrMiss.Length; i++)
        if (hmMask.HitOrMiss[i] > 0)
            scratchImageHit &= (input >> (Width * hmMask.HitOrMissY[i] + 
                                    hmMask.HitOrMissX[i]));
            
    for (int i = 0; i < hmMask.HitOrMiss.Length; i++)
        if (hmMask.HitOrMiss[i] < 0)
            scratchImageMiss |= (input >> (Width * hmMask.HitOrMissY[i] + 
                                 hmMask.HitOrMissX[i]));
 
    output = scratchImageHit & (~scratchImageMiss);
}

Given the hmMask and morph instances previously created, the following code will perform erosion on our image:

sbyte[] mask = new sbyte[]
{ 0, 1, 0,
  1, 1, 1,
  0, 1, 0 };
HitOrMissMask hmMask = new HitOrMissMask(mask, 3, 1, 1);
BinaryMorphology morph = new BinaryMorphology(image, 1024, 1);
morph.ApplyHitOrMissTransform(hmMask);

Similarly, the project submitted with this article comes with a few more helper methods that will let you perform standard morphology operations with the standard mask shape presented at the beginning of this article.

An architectural note

You may have noticed the BigIntegers are passed by reference in all the functions through the use of ref and out parameter modifiers. That is because the BigInteger object is a value type, and as such, is always passed by value by default when being fed as a function argument. Since we are filling the BigInteger objects with Image data, we don't want to copy around this much data back and forth, hence the use of the ref and out parameter modifiers.

Getting our data back

Lastly, once we have performed all the work we need to, we can recover our transformed image using the GetBoolArray method:

sbyte[] mask = new sbyte[]
{ 0, 1, 0,
  1, 1, 1,
  0, 1, 0 };
HitOrMissMask hmMask = new HitOrMissMask(mask, 3, 1, 1);
BinaryMorphology morph = new BinaryMorphology(image, 1024, 1);
morph.ApplyHitOrMissTransform(hmMask);
morph.GetBoolArray();

That will essentially undo the padding and unpack the image such that it can be more easily manipulated.

/// <summary>
/// unpack and return processed data in bool[]
/// </summary>
/// <returns></returns>
private bool[] GetBoolArray(BigInteger image)
{
    bool[] boolProcessedCropped = new bool[(Width - Padding) * Height];
    byte[] t1 = image.ToByteArray();
    BitArray t2 = new BitArray(t1);
    bool[] boolProcessed = new bool[t1.Length * 8];
    t2.CopyTo(boolProcessed, 0);
 
    for (int i = 0; i < Height; i++)
    {
        int off = i * (Width - Padding);
        int bytesToCopy = Math.Max(0, Math.Min(boolProcessedCropped.Length, 
                                   off + Width - Padding) - i * (Width - Padding));
        bytesToCopy = 
          Math.Min(bytesToCopy, boolProcessed.Length - (i * Width + Padding));
        if (bytesToCopy > 0)
            Buffer.BlockCopy(boolProcessed, i * Width + Padding, 
                             boolProcessedCropped, off, bytesToCopy);
    }
 
    return boolProcessedCropped;
}

Standard morphology operations and mask

Aside from implementing the Hit-Or-Miss transform, the framework presented also includes the erosion and dilation operators using the mask described at the beginning of this article. This comes in handy when performing operations described in the next section, such as figuring out the perimeter pixel set for an object on an image.

private void ErodeDisk1(ref BigInteger input, out BigInteger output)
{
    byte[] erodeMask = HitOrMissMask.Disk1;
    Erode(erodeMask, 3, 1, 1, ref input, out output);
}
private void Erode(byte[] mask, int width, int xCenter, int yCenter, 
        ref BigInteger input, out BigInteger output)
{
    sbyte[] erodeMask = mask.Select(p => (sbyte)((p == 0) ? 0 : 1)).ToArray();
    HitOrMissMask erodeHmMask = 
          new HitOrMissMask(erodeMask, width, xCenter, yCenter);
    ApplyHitOrMissTransform(erodeHmMask, ref input, out output);
}

Other dilation and image opening and closing operations are implemented in the same fashion as the one I just described.

Here's how you would use these standard operators:

BinaryMorphology morph = new BinaryMorphology(image, 1024, 1);
morph.ErodeDisk1();
bool[] eroded = morph.GetBoolArray();

Bonus methods: perimeter and minority/majority operations

Perimeter image

Here is how you create a perimeter image from the objects in your binary image:

public bool[] GetPerimeterImage()
{
    BigInteger perim, eroded;
    ErodeDisk1(ref PackedImage, out eroded);
    perim = PackedImage & ~eroded;
    return GetBoolArray(perim);
}

As you can see, finding out the perimeter pixel set for an object is as easy as eroding the image and removing this pixel set from the original binary image.

Minority and majority operator

I will describe the minority operator; however, the same reasoning applies to the majority operator, only in the opposite way. The minority operator consists of turning off pixels selectively depending on whether at least a certain number of pixels in the direct 8-connected neighborhood are turned off. In terms of image processing, this has the effect of selectively turning off pixels in areas that are less dense, while at the same time keeping pixel dense areas (like the object that you want to keep) unharmed.

This functionality is implemented in the Erode method that has a count parameter in its signature. I used 8 buffers to contain the data for the pixels in the 8 different possible connected neighborhoods of a pixel. The data is shifted back to the frame of the center pixel, and sorted using a temporary array, such that the bits in the lower ordinal images are set and the bits in the higher ordinal images are cleared. We then perform a logical AND between the original image and the appropriate image in the array of sorted images.

/// <summary>
/// clear pixels for which at least count neighbors are cleared else the pixel is set
/// </summary>
/// <param name="imageIn"></param>
/// <param name="imageOut"></param>
/// <param name="count"></param>
private void Erode(ref BigInteger imageIn, out BigInteger imageOut, int count)
{
    BigInteger[] BitsAround = GetSortedNeighborhoodBits(ref imageIn, count);
 
    imageOut = imageIn & BitsAround[8-count];
}
 
/// <summary>
/// returns a list of images that contain the bits in the 8-connected neighborhood of 
/// every pixel in imageIn. The neighborhood bits are sorted (set and unset bits are
/// seperated). ones go in the images with low ordinals, zeros go in the images with 
/// higher ordinals
/// </summary>
/// <param name="imageIn"></param>
/// <param name="count"></param>
/// <returns></returns>
private BigInteger[] GetSortedNeighborhoodBits(ref BigInteger imageIn, int count)
{
    BigInteger temp;
    BigInteger[] BitsAround = new BigInteger[8];
 
    int index = 0;
    for (int i = -1; i <= 1; i++)
        for (int j = -Width; j <= Width; j += Width)
        {
            if (!(i == 0 && j == 0)) // avoid considering the center pixel
            {
                // negative shifts handled by framework
                BitsAround[index] = imageIn << (j + i);
                index++;
            }
        }
 
    // swap bits such that ones go in the images with lower ordinals
    // and zeros go in the images with higher ordinals
    // need to do more than one pass to bring an on pixel in the highest ordinal
    // down to the low ordinals
    for (int i = 0; i < 7; i++)
        for (int j = 0; j < 7; j++)
        {
            temp = BitsAround[j + 1];
            BitsAround[j + 1] = BitsAround[j + 1] & BitsAround[j];
            BitsAround[j] = BitsAround[j] | temp;
        }
    return BitsAround;
}

Conclusion

In this article, we discovered a surprisingly concise way to use the .NET 4.0 BigInteger class as a means to perform efficient morphology operations on binary images. The full code for the binary morphology framework is available for peering through in the project download file associated with this article. Hopefully, this will stimulate other readers to come up with creative uses of this powerful class.

(Methods for Fast Morphological Image Transforms Using Bitmapped Binary Images. REINVAN DEN BOOMGAARD, RICHARD VAN BALEN)

Known issues

The zoom feature in the demo app is a bit clutchy.
There seems to be a bug the .NET implementation of BigInteger that causes a crash if the BigInteger is not so big (thanks to jdeh.medical for finding that out) - we are still investigating the issue.

History

August 10, 2010: First write-up.