Click here to Skip to main content
15,867,834 members
Articles / Programming Languages / C#

DeltaScope

Rate me:
Please Sign up or sign in to vote.
4.96/5 (10 votes)
1 Sep 2009CPOL2 min read 37.3K   408   17   17
A reusable delta engine with GDI+ front-end controls.
Image 1

Introduction

This is a candidate entry for The Code Project's Lean and Mean competition to write an efficient text delta tool.

I was checking my emails while on vacation and I came across a Code Project daily mailing with a reminder for the LCS competition. I have enjoyed reading The Code Project articles for many years now, but until this moment I never had the opportunity to contribute.

When I first started with C# I decided to write a diff program in order to really learn the language. I call this program, DeltaScope. What started out as a simple console-based diff tool has gone through many iterations. When I finally stopped work on it, I had a reusable delta engine with GDI+ front-end controls.

Unfortunately, I have not touched this code since 2005. Thankfully my fiancé is very understanding of my “coding time” and I was able to take some time to write this article and bring the code up to C# 3.0 specifications.

Longest Common Subsequence

Many of the entries have excellent explanations of the LCS algorithm. When I wrote my DeltaScope I referred to this paper extensively. http://www.ics.uci.edu/~eppstein/161/960229.html[^]

I find the best way to understand the LCS algorithm, is to think of it like this:
The output of a delta operation is a sequence of instructions to transform file 1 into file 2.

Points of Interest

The core of the system is the DeltaEngine class. The DeltaScope application wraps up the engine in the DeltaScopeEngine class.

C#
public class DeltaScopeEngine 
{
        public bool IgnoreCase { get; set; }
        public bool IgnoreTabs { get; set; }

        public TimeSpan ElapsedTime { get; private set; }
        public long PeakMemoryUsage { get; private set; }

        public List<DeltaBlock> Compare(string tsLeftFile, string tsRightFile)
        {
            var process = Process.GetCurrentProcess();
            process.Refresh();
            var peakWorkingSet64 = process.PeakWorkingSet64;

            var timer = DateTime.Now;
            var deltaEngine = new DeltaEngine();
            var result = deltaEngine.GetDifferences(RipFile(tsLeftFile),
                RipFile(tsRightFile), hashSideDelegate);
            ElapsedTime = DateTime.Now - timer;
            PeakMemoryUsage = process.PeakWorkingSet64 - peakWorkingSet64;
            
            return result;
        }
        private static string RipFile(string tsFileName)
        {
            ...
        }
        private List<DeltaString> hashSideDelegate(string sideValues)
        {
            const string blockLineTerminator = @"\r\n";
            var values = new List<string>(Regex.Split(sideValues, blockLineTerminator));
            var results = new List<DeltaString>();

            for (var lnCnt = 0; lnCnt < values.Count; lnCnt++)
            {
                var lnHashCode = _GenerateHashFromString(values[lnCnt]);
                if (lnHashCode >= 0)
                    results.Add(new DeltaString(values[lnCnt], lnHashCode));
            }
            return results;
        }
        private int _GenerateHashFromString(string tsLine)
        {
            var lnHash = 0;
            var lnRealPtr = 0;
            
            // Handle the upper casing issue...
            var lcaLine = IgnoreCase ? 
                tsLine.ToUpper().ToCharArray() : tsLine.ToCharArray();

            // hash_string(ABCW) = 65*1 + 66*2 + 67*3 + 87*4
            var lnMaxLength = lcaLine.Length;
            for (var lnCnt = 0; lnCnt < lnMaxLength; lnCnt++)
            {
                // Handle the white space characters...
                if (lcaLine[lnCnt] == '\t' && IgnoreTabs)
                    continue;

                lnRealPtr++;
                lnHash += ((byte)lcaLine[lnCnt] * lnRealPtr);
            }

            // return the hash value...
            return (lnHash);
        }
}

The hashAlgorithm delegate lets the developer control how each string is parsed into a list of DeltaStrings. In this simple example we can choose to make the difference case sensative or case insensative, and we can choose to ignore tab characters.

Timings and Memory Consumption

The average run timing is 13ms. I do not include any screen rendering in my timings. As for memory consumption, I always come up with 0 memory used. Which means I am either not using process.PeakWorkingSet64 correctly, or my code is magical. ;)

Future direction

Many diff tools allow the user to merge the 2 files into 1 new file. With the list of delta blocks that the DeltaEngine generates, it should be possible to allow the user to build a new file. This would require some sort of block choosing ability along with perhaps text editing. Unfortunately the current GDI+ controls are display only.

When I initially wrote this, GDI+ was a pretty new technology, and WPF did not yet exist. I think a good next step would be to use WPF as the display technology for the display. With WPF the swirl block control can be rewritten to include editing abilities.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
Questionvery cool UI Pin
dmihailescu4-Aug-15 6:50
dmihailescu4-Aug-15 6:50 
AnswerRe: very cool UI Pin
NYCChris4-Aug-15 9:37
NYCChris4-Aug-15 9:37 
GeneralPrinting Pin
StevePasha14-Jul-10 19:10
StevePasha14-Jul-10 19:10 
GeneralRe: Printing Pin
NYCChris15-Jul-10 8:29
NYCChris15-Jul-10 8:29 
Hi Steve,

I will have to check later tonight when I have more time, but I believe the canvas painting can be redirected to different outputs. How would you expect it to work? ie: Print the whole document, or just the viewport?
GeneralRe: Printing Pin
StevePasha16-Jul-10 8:37
StevePasha16-Jul-10 8:37 
GeneralSource does not build for me due to missing App.Config in the DeltaEngine project Pin
Sacha Barber3-Sep-09 3:51
Sacha Barber3-Sep-09 3:51 
GeneralRe: Source does not build for me due to missing App.Config in the DeltaEngine project Pin
NYCChris3-Sep-09 5:57
NYCChris3-Sep-09 5:57 
GeneralRe: Source does not build for me due to missing App.Config in the DeltaEngine project Pin
Sacha Barber3-Sep-09 6:11
Sacha Barber3-Sep-09 6:11 
GeneralRe: Source does not build for me due to missing App.Config in the DeltaEngine project Pin
NYCChris3-Sep-09 6:40
NYCChris3-Sep-09 6:40 
Generalmetrics Pin
Luc Pattyn2-Sep-09 16:28
sitebuilderLuc Pattyn2-Sep-09 16:28 
GeneralRe: metrics Pin
NYCChris2-Sep-09 16:33
NYCChris2-Sep-09 16:33 
GeneralRe: metrics Pin
Luc Pattyn2-Sep-09 16:38
sitebuilderLuc Pattyn2-Sep-09 16:38 
GeneralRe: metrics Pin
NYCChris2-Sep-09 19:08
NYCChris2-Sep-09 19:08 
GeneralRe: metrics Pin
NYCChris4-Sep-09 9:19
NYCChris4-Sep-09 9:19 
Generalerror Pin
mandimandi2-Sep-09 0:01
mandimandi2-Sep-09 0:01 
GeneralRe: error Pin
NYCChris2-Sep-09 2:45
NYCChris2-Sep-09 2:45 
GeneralRe: error Pin
mandimandi2-Sep-09 6:56
mandimandi2-Sep-09 6:56 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.