
FileDiff2 Optimized

13 Aug 2009, CPOL
A file diff utility.

Introduction

This application is pretty basic: it opens two files as FileStream objects, compares them byte by byte, and prints each stretch of the second file that differs from the first.

Using the code

C#
ASCIIEncoding Encode = new ASCIIEncoding();

// Open the files
//
FileStream streamA = File.OpenRead(args[0]);
FileStream streamB = File.OpenRead(args[1]);

// Get the stream length
// (so we don't have to calculate this a million times)
//
long lenA = streamA.Length - 1;
long lenB = streamB.Length - 1;

// Read the bytes
//
int byteA;
int byteB;

do
{
    // Read the streams
    //
    byteA = streamA.ReadByte();
    byteB = streamB.ReadByte();

    // Are they the same
    //
    if (byteA != byteB)
    {
        // Remember where we parked the car
        //
        long startPos = streamB.Position;

        // Read streamB until we = StreamA
        //
        do
        {
            byteB = streamB.ReadByte();
        }
        while (byteA != byteB && streamB.Position <= lenB);

        // How long is the difference?
        //
        long length = streamB.Position - startPos;

        // Read the bytes
        //
        byte[] theseBytes = new byte[length];
        streamB.Seek(length * -1, SeekOrigin.Current);
        streamB.Read(theseBytes, 0, (int)length);

        Console.WriteLine("Pos:{0}, Len:{1}, Str:{2}", startPos, 
                          length, Encode.GetString(theseBytes));
    }
}
while (streamA.Position <= lenA && streamB.Position <= lenB);

streamA.Close();
streamB.Close();
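
The comparison step at the top of the loop can also be looked at on its own. The following is a minimal sketch, not part of the article's download: the class name DiffSketch and the method FirstDifference are hypothetical, and it takes any Stream so it can be tried on in-memory data. It relies on ReadByte returning -1 at the end of a stream, so a shorter file shows up as a difference at the point where it ends.

```csharp
using System;
using System.IO;
using System.Text;

public static class DiffSketch
{
    // Hypothetical helper, not taken from the article's source.
    // Returns the zero-based offset of the first differing byte,
    // or -1 when both streams hold identical content.
    public static long FirstDifference(Stream a, Stream b)
    {
        long offset = 0;
        while (true)
        {
            int byteA = a.ReadByte(); // -1 once the stream is exhausted
            int byteB = b.ReadByte();

            if (byteA != byteB)
                return offset;        // also fires when one stream ends early
            if (byteA == -1)
                return -1;            // both streams ended at the same point

            offset++;
        }
    }
}
```

Called with two MemoryStream instances holding the ASCII bytes of "abcdef" and "abXdef", the method returns 2. Unlike the loop above, it stops at the first mismatch and does not try to resynchronize the two streams.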

History

  • Aug 12, 2009: Written.
  • Aug 13, 2009: Rewritten to be more explicit. I also modified the file seeking and some variables, which brought the run time down from 57 ms to 27 ms.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Web Developer
United States
I started programming for fun when I was about 10 on a Franklin Ace 1000.

I still do it just for fun, but it has gotten me a few jobs over the years. More than I can say for my Microsoft Certifications. :)

The way I learned was by example; now it's time to give back to the next generation of coders.



Comments and Discussions

 
Works, but that's it... (Rasqual Twilight, 22-Aug-09 0:58)
My solution (Pete Souza IV, 14-Aug-09 6:31)
Re: My solution (Matthew Hazlett, 14-Aug-09 8:51)
Re: My solution (Pete Souza IV, 20-Aug-09 10:33)
Re: My solution (Matthew Hazlett, 21-Aug-09 20:07)
Very interesting. You create a StringBuilder object there, thereby allocating memory and destroying the object over and over.
Not sure I would have gone with this approach, but my program is not complete and yours is. :)

I did write another version of my program that works line by line like you suggested, but I kept all the file handling in the Main() block of code. Instead of passing references and doing a lot of seeking in the file, I extended the string object and added new methods to compare strings. I'm not sure how it will look when it's totally finished, but the code is very simple thus far (fast too).

EX: if (StringA.isApprox(StringB))

The isApprox extension method is simple: it takes every char from StringA and removes a corresponding char from StringB (1 to 1). It then takes the length of the resulting strings and calculates a percentage match, something like:

int difference = stringA.Length - stringB.Length;
int percent = (difference * 100) / stringA.Length;

Then, if percent is over 85, they are most likely a match. This works for 100% matches as well as partial matches, so you need only call a compare one time. It works well: you can basically read the file one time, line by line, and do all the comparisons to satisfy 3 out of the 4 matching requirements.
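
One possible reading of the isApprox idea, as a rough sketch only: the comment does not show the real implementation, so the class name StringApproxSketch, the PercentMatch helper, and the choice to divide by the longer string's length are all assumptions. Only the 85 threshold comes from the text. Each char of the first string is paired off against one unused occurrence of the same char in the second, and the matched count is turned into a percentage.

```csharp
using System;
using System.Collections.Generic;

public static class StringApproxSketch
{
    // Hypothetical helper: percentage (0-100) of chars in 'a' that can be
    // paired 1 to 1 with a char in 'b', relative to the longer string.
    public static int PercentMatch(this string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 100; // two empty strings match trivially

        // Count how many of each char 'b' has available for pairing.
        var available = new Dictionary<char, int>();
        foreach (char c in b)
        {
            available.TryGetValue(c, out int count);
            available[c] = count + 1;
        }

        // Consume one occurrence from 'b' for every char of 'a' that has one left.
        int matched = 0;
        foreach (char c in a)
        {
            if (available.TryGetValue(c, out int count) && count > 0)
            {
                available[c] = count - 1;
                matched++;
            }
        }
        return matched * 100 / longest;
    }

    // Assumed semantics: "over 85" percent is treated as a match.
    public static bool IsApprox(this string a, string b, int threshold = 85)
    {
        return a.PercentMatch(b) > threshold;
    }
}
```

With this sketch, "hello world".PercentMatch("hello wurld") gives 90, so IsApprox returns true, while "abc" against "xyz" gives 0. Because chars are paired without regard to position, anagrams score 100, which may or may not be acceptable for line matching.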

However, the tricky part is matching file deletions. I have some ideas but haven't implemented anything yet, as I have had other things to work on (cough, lazy). I'll try to get my code finished and uploaded by the deadline; if I don't make it and you would like to look at it, just let me know and I'll upload it to my web site (spelling errors and all).

Matthew Hazlett
Fighting the good fight for web usability.

Re: My solution (Pete Souza IV, 25-Aug-09 6:56)
Re: My solution (Matthew Hazlett, 25-Aug-09 8:09)
Re: My solution (Pete Souza IV, 25-Aug-09 8:15)
Re: My solution (Matthew Hazlett, 25-Aug-09 19:38)
Some thoughts... (Pete Souza IV, 13-Aug-09 13:50)
Re: Some thoughts... (Matthew Hazlett, 13-Aug-09 14:52)
Re: Some thoughts... (Pete Souza IV, 13-Aug-09 16:07)
Re: Some thoughts... (Matthew Hazlett, 13-Aug-09 16:41)
Re: Some thoughts... (Pete Souza IV, 14-Aug-09 1:59)
