
Simple image comparison in .NET

25 Nov 2014 | CPOL | 8 min read
A set of .NET extension methods to get the difference between images and more...
Introduction   

This started out as a set of simple extension methods for the System.Drawing.Image class which would allow you to:

  • Find out how different two images are as a percentage value based on a threshold of your choosing
  • Get a difference-image which shows where the two are different
  • Get the difference as raw data to use for your own purposes  

With the feedback received here, the solution has been expanded to include:

  • A console version which can take the paths of two images as parameters, and return the difference as an errorlevel
  • A COM callable DLL (still in need of testing - anyone?)
  • The ability to find duplicates in a folder, or among a list of image paths

As a bonus, you get extension methods for resizing or grayscaling an image. 

Image 1

Background

You don't need to read this chapter to use the extension methods, so if you don't care about how I created the software - just skip to the Using the code bit.

I was happily coding away one evening, making a little tool to download images from posts in an RSS feed, when I first got the idea: "If there's a link from the blog post to another post cited as the source, why not have the code go there and check for a higher resolution version of the image?" Then I thought: "But the original image might not have the same name as the one I first met in the post." :(

So I had to find a way of comparing the visual representation of the two images. I am not particularly good at math, so after Googling around and finding assorted algorithms using wavelets, keypoint matching, etc. (which seemed out of my mental reach), I found out that some people were having good results using histograms. So I first went down that path.

Histograms

A histogram is a way of representing what kinds of colors are in a picture. You can create a histogram describing the brightness or the red, green, or blue values in an image. A basic way of creating a histogram is to look at each pixel in a bitmap and find the value of the property you are looking at (for example R, G, or B). For each possible value (typically 0-255), you have a counter which you increment. This way, when you are done with all the pixels, you can iterate over the counters and see how many pixels had low, medium, and high values of brightness, R, G, or B.

I fiddled around with this for a couple of hours, but found that in the end it wasn't good enough for detecting differences in pictures: two totally different pictures depicting almost the same thing (e.g., two corn fields) with the same composition of colors/light can be hard to tell apart using just a histogram. I tried comparing the two using the average and the variance of the colors, but to no avail.
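The counting described above can be sketched roughly like this. It is only an illustration, not the library's actual code: the class name HistogramSketch and the method CountRgb are made up for this example, and pixels are passed as packed 0xRRGGBB integers so the sketch runs without System.Drawing.

```csharp
using System;

// Hypothetical helper for illustration - not part of XnaFan.ImageComparison.
public static class HistogramSketch
{
    // Counts, per channel, how many pixels have each possible value (0-255).
    // Each pixel is a packed integer where bits 16-23 are R, 8-15 are G, 0-7 are B.
    public static (int[] R, int[] G, int[] B) CountRgb(int[] pixels)
    {
        var r = new int[256];
        var g = new int[256];
        var b = new int[256];

        foreach (int p in pixels)
        {
            r[(p >> 16) & 0xFF]++; // one more pixel with this red value
            g[(p >> 8) & 0xFF]++;  // ...this green value
            b[p & 0xFF]++;         // ...this blue value
        }

        return (r, g, b);
    }
}
```

With a real Bitmap, the values returned by Color.ToArgb() should work with the same bit masks, since the alpha byte sits above the red byte and is masked away.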

The good thing is that as a byproduct of my work, now there are histogram extension methods for the Bitmap class: GetRgbHistogram and GetRgbHistogramBitmap.

C#
//get a histogram as an image
Bitmap bmpHist =  img1.GetRgbHistogramBitmap();   
//save it
bmpHist.Save("C:\\bmphist.png");

Image 4

C#
//get a histogram object
Histogram hist =  img1.GetRgbHistogram(); //basically three arrays for RGB values
//show it in the console
Console.WriteLine(hist.ToString());

Image 5

Feel free to use them :)

Note: Based on Svansickle's comments on this article about histograms, I've implemented the Bhattacharyya histogram algorithm, which is a way of comparing two images based on their normalized histograms.
You can play around with this to see whether this way of comparing two images is better for your purposes. I've added the functionality because it seemed interesting, and I'd love to hear if you've found uses for it :)
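For readers who want to see the idea in code, here is a minimal sketch of the Bhattacharyya coefficient for two histograms of a single channel. The names BhattacharyyaSketch, Coefficient and Difference are hypothetical, and reporting the difference as 1 minus the coefficient is an assumption made for this example; the library may normalize or combine channels differently.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration - not part of XnaFan.ImageComparison.
public static class BhattacharyyaSketch
{
    // Returns the Bhattacharyya coefficient of two histograms:
    // 1.0 when the normalized histograms are identical, 0.0 when they do not overlap.
    public static double Coefficient(int[] histA, int[] histB)
    {
        if (histA.Length != histB.Length)
            throw new ArgumentException("Histograms must have the same number of bins.");

        double totalA = histA.Sum(v => (double)v);
        double totalB = histB.Sum(v => (double)v);
        if (totalA == 0 || totalB == 0)
            throw new ArgumentException("Histograms must contain at least one pixel.");

        double sum = 0;
        for (int i = 0; i < histA.Length; i++)
        {
            // Normalize each bin to a probability, then add the geometric mean of the pair.
            sum += Math.Sqrt((histA[i] / totalA) * (histB[i] / totalB));
        }
        return sum;
    }

    // Assumption for this sketch: express the difference as 1 - coefficient (0 = same, 1 = disjoint).
    public static double Difference(int[] histA, int[] histB) => 1.0 - Coefficient(histA, histB);
}
```

Because the histograms are normalized first, two images of different sizes but with the same distribution of values come out as identical.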

Simplifying

Histograms were a dead end for me. But after Googling around some more, I read some forum posts suggesting that if images were reduced to a much smaller size, and even grayscaled, then the differences would be both faster and simpler to find. It was worth a try.

Using the .NET Framework, I could easily resize an image. Then I found some code that would grayscale an image. Now all that was left was to iterate through the pixels on both images and compare the two and then find out how many were different.

Here are the two images I used, with an XBOX controller, a post-it with and without text and two different colored pens.

Image 7 Image 8

I thought I'd start out by using a gray scaled, 16x16 pixel version of each image, and see whether I needed higher resolution for practical use.

Image 9 Image 10

Upscaled versions of the two 16x16 pixel images

For each pixel, I'd then get the difference in brightness compared to the other image's pixel in the same spot and save it in a two-dimensional array of bytes (since R, G and B values in a Bitmap can be between 0 and 255). I would then count all the values in the array which weren't zero, divide this count by the number of pixels in an image (256), and voilà - I would have a difference value in percentages - right? ...not quite :)
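As a rough sketch of this first, naive approach: the code below assumes the two images have already been resized and grayscaled into equally sized byte grids (16x16 in the article, smaller in a quick test). DifferenceSketch and its methods are hypothetical names for illustration, not the library's implementation.

```csharp
using System;

// Hypothetical helper for illustration - not part of XnaFan.ImageComparison.
public static class DifferenceSketch
{
    // Absolute brightness difference per cell between two equally sized grayscale grids.
    public static byte[,] GetDifferences(byte[,] a, byte[,] b)
    {
        int height = a.GetLength(0), width = a.GetLength(1);
        if (b.GetLength(0) != height || b.GetLength(1) != width)
            throw new ArgumentException("Both grids must have the same dimensions.");

        var differences = new byte[height, width];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                differences[y, x] = (byte)Math.Abs(a[y, x] - b[y, x]);

        return differences;
    }

    // Naive measure: the fraction (0-1) of cells that differ at all.
    public static float NaiveDifference(byte[,] differences)
    {
        if (differences.Length == 0) return 0f;

        int count = 0;
        foreach (byte d in differences)
            if (d != 0) count++;

        return (float)count / differences.Length;
    }
}
```

Every non-zero difference counts here, no matter how small it is.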

It works well enough to visualize differences between two images as you can see here:

Image 12 Image 13 Image 14

...and the low resolution seemed to work okay - yay! :D

But...!

There was a slight problem though. My algorithm was also finding differences where none were visible to the naked eye. Re-encoding a JPEG at the same resolution as the original, or comparing identical images at different resolutions, suddenly produced a lot of differences with a value of 1 or 2. This could easily give me fifty pixels with a difference - and fifty pixels out of 256 is about 20%, which makes the algorithm too blunt, since a human could detect no difference visually. So I introduced a threshold (a value that the difference had to exceed to be counted).

Using a threshold

Here you can see the differences between a 200 pixels wide version of an image and a 100 pixels wide version of the same image:

Image 16

You can see what threshold values resulted in the above (the red text). By default, pixel difference values below 4 are now treated as no difference, and it is possible to adjust this by giving the extension method an optional parameter:

C#
//find out how different two images are, with a threshold of 5 in the lightness of pixels
byte threshold = 5; //the threshold parameter is a byte, so an int variable would need a cast
float percentageDifference = img1.PercentageDifference(img2, threshold);

This also makes it possible for you to adjust the sensitivity of the code according to your needs. A default threshold of three works for me, but by all means - play around with it, and adjust this to your heart's desire or depending on the task you need it for.
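To make the effect of the threshold concrete, here is a small sketch that counts only the differences exceeding it. ThresholdSketch and FractionAboveThreshold are hypothetical names, and treating "exceed" as a strict greater-than comparison is an assumption based on the description above; the result is a fraction between 0 and 1.

```csharp
using System;

// Hypothetical helper for illustration - not part of XnaFan.ImageComparison.
public static class ThresholdSketch
{
    // Fraction (0-1) of cells whose difference is strictly greater than the threshold.
    // Assumption: differences equal to the threshold are treated as "no difference".
    public static float FractionAboveThreshold(byte[,] differences, byte threshold = 3)
    {
        if (differences.Length == 0) return 0f;

        int counted = 0;
        foreach (byte d in differences)
            if (d > threshold) counted++;

        return (float)counted / differences.Length;
    }
}
```

With a threshold of 3, a grid full of 1s and 2s (the kind of noise a JPEG re-encoding produces) yields 0, where the naive count would have reported a large difference.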

That's it folks!

This is where my story ends. I now have the ability to detect whether two images are similar, how much they differ, and where, which was all I wanted - yay! XD.

I hope you've had fun reading my little coding-story, and I hope you can use the code somehow.

Kind regards - Jakob "XnaFan" Krarup.

Using the code

Get the DLL

All you need to do is download the DLL or the complete solution (see top of article), add a reference to XnaFan.ImageComparison.dll, and a using XnaFan.ImageComparison statement to your code file - and you should be set to go.

Use the public methods of ImageTool

C#
//use this method to find the difference between two images (returns a float between 0 and 1)
float GetPercentageDifference(string image1Path, string image2Path, byte threshold = 3)

//use this method to find the difference in percent between the Bhattacharyya histograms
//Bhattacharyya histogram is a normalized histogram, see comments...
float GetBhattacharyyaDifference(string image1Path, string image2Path)

//Use these methods to get a list of duplicate images in a specific folder
List<List<string>> GetDuplicateImages(string folderPath, bool checkSubfolders)
List<List<string>> GetDuplicateImages(IEnumerable<string> pathsOfPossibleDuplicateImages)

Use the extension methods for Images and Bitmaps

C#
//get the difference between an image and another
float PercentageDifference(this Image img1, Image img2, byte threshold = 3)

//get a visualization of the difference between an image and another
//  (see "Visualizing the differences" below)
Bitmap GetDifferenceImage(this Image img1, Image img2, 
        bool adjustColorSchemeToMaxDifferenceFound = false, bool absoluteText = false)

//get the differences as a 16x16 array of bytes
byte[,] GetDifferences(this Image img1, Image img2)

//get the grayscale values as a 16x16 array of bytes
byte[,] GetGrayScaleValues(this Image img)

//get a grayscaled version of an image
Image GetGrayScaleVersion(this Image original)

//get a resized version of an image
Image Resize(this Image originalImage, int newWidth, int newHeight)

//get a bitmap containing the histogram for an image (see "Histograms" above)
Bitmap GetRgbHistogramBitmap(this Bitmap bmp)

//get histogram information about bitmap
Histogram GetRgbHistogram(this Bitmap bmp)

Visualizing the differences

I have included the possibility of color coding the difference bitmap either by using a palette of black to pink corresponding to values 0 to 255, or by having the palette map to whatever is the current max. This will enable you to highlight small differences or keep them dark as you wish. See the difference here:

Image 17

This is the difference between using the parameter adjustColorSchemeToMaxDifferenceFound as true or false:

C#
//the method name matches the GetDifferenceImage signature listed above
Bitmap diffNoAdjust = bmp1.GetDifferenceImage(bmp2);
Bitmap diffAdjusted = bmp1.GetDifferenceImage(bmp2, true);
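The two palette modes can be thought of as a choice of scale. The sketch below is an assumption about how such a mapping could work, not the library's code: PaletteSketch and Intensity are made-up names, and the method only computes an intensity from 0 to 255, leaving the actual black-to-pink color mapping aside.

```csharp
using System;

// Hypothetical helper for illustration - not part of XnaFan.ImageComparison.
public static class PaletteSketch
{
    // Maps a difference value to a palette intensity (0 = black, 255 = brightest).
    // When adjustToMax is false, the difference is used directly on the fixed 0-255 scale.
    // When adjustToMax is true, the largest difference found maps to 255.
    public static byte Intensity(byte difference, byte maxDifferenceFound, bool adjustToMax)
    {
        if (!adjustToMax || maxDifferenceFound == 0)
            return difference;

        int scaled = difference * 255 / maxDifferenceFound;
        return (byte)Math.Min(255, scaled); // clamp in case difference > maxDifferenceFound
    }
}
```

On the fixed scale, a difference of 10 stays nearly black. If 20 is the largest difference found, the adjusted scale maps that same 10 to roughly half intensity, which is why small differences become visible.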

Any trouble using the code, or ideas for improvements - let me know.

Sample Console and WPF application included

To get you started, I've included a sample Console application and a WPF app - to show how you can use the code. WPF also has the added benefit of giving you code to transform between System.Drawing.Image ("regular" .NET) and System.Windows.Media.Imaging.BitmapSource (WPF). I am just starting on WPF, so it was a learning-by-failing experience. Don't go looking for any best-practices there ;-D.

Console version 

A reader asked for a command prompt version of the functionality, so there is now also a command-line version, which returns the difference between the images as an error level so you can use it from a batch file or another programming language.

Usage 

ImageComparisonConsole.exe [image1 path] [image2 path]

Here is a sample batch file using it: 

@echo off 
REM saving paths to images 
REM you can also use absolute paths. i.e "C:\something.png"  
set image1="firefox1.png" 
set image2="firefox2.png"  
REM print what is about to happen
echo 'ConsoleImageComparison.exe %image1% %image2%'   
REM execute the program  
call ConsoleImageComparison.exe %image1% %image2% 
REM tell what the detected difference is 
echo The difference is %ERRORLEVEL%%%  

You can just drag two image files onto the console app, and the errorlevel will be set, or drag them onto a batch file which calls the app and displays the errorlevel, etc.

This is only a tool - you decide what to use it for :)

History 

  • April 2012 - Version 1.0
  • December 2012 - Version 1.1 - small clarifications
  • January 2013 - Version 1.2 - added console version which sets error level
  • September 2014 - Version 1.3 - disposed of images - thanks to commenter below :)
  • November 2014 - Version 1.4 - COM compatibility for comparison method
  • November 2014 - Version 1.5 - Find duplicates functionality added
  • November 2013 - Version 1.6 - Bhattacharyya (normalized) histogram implemented, and disposing of Image objects (thanks to commenters! ..you know who you are ;-))
  • June 2021 - published a .NET Core version to GitHub 
    • Just the core functionality to compare two images and find duplicates of an image
    • Parallelized version for faster comparison (using Parallel.ForEach)
    • Cleaned functionality, naming and comments

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)