Hi, I have written a C program to detect an MD5 collision. I am used to C, pointers, etc., so the program is quite small. However, no matter how many times I port it to C#, it doesn't work. My C code is:

C
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <error.h>
#include <openssl/md5.h>

void print_hex(unsigned char *cs, int len)
{
    for (int i = 0; i < len; i++)
        printf("%02x", cs[i]);
    printf("\n");
}

void read_hex_file(char *path, size_t nmemb, unsigned char *buf)
{
    FILE *fd = fopen(path, "r");
    if (!fd) error(EXIT_FAILURE, errno, "read_hex_file()");

    unsigned int x;
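    for (size_t i = 0; i < nmemb; i++) {
        /* %2x skips leading whitespace, then reads up to two hex digits */
        if (fscanf(fd, "%2x", &x) != 1)
            error(EXIT_FAILURE, 0, "read_hex_file(): expected %zu hex bytes in %s", nmemb, path);
        buf[i] = (unsigned char)(0xFF & x);
    }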

    fclose(fd);
}

int main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "Invalid args, %s FILE FILE\n", argv[0]);
        return EXIT_FAILURE;
    }

    unsigned char input[128], r1[MD5_DIGEST_LENGTH], r2[MD5_DIGEST_LENGTH];

    read_hex_file(argv[1], sizeof(input), input);
    MD5(input, sizeof(input), r1);
    read_hex_file(argv[2], sizeof(input), input);
    MD5(input, sizeof(input), r2);

    print_hex(r1, sizeof(r1));
    print_hex(r2, sizeof(r2));

    /* memcmp compares every byte of the raw digests; strncmp would stop at a zero byte */
    if (memcmp(r1, r2, sizeof(r1)) == 0)
        printf("Hashes match\n");
    else
        printf("Hashes differ\n");

    return EXIT_SUCCESS;
}


I am getting absolutely nowhere with C# (which I want to learn), so I am wondering whether it is even possible, or whether it has been solved elsewhere. I have looked. The two files contain different strings that produce the same hash value. I got the data for them here:
http://en.wikipedia.org/wiki/MD5

and they produce the same hash 79054025255fb1a26e4bc422aef54eb4

Any help welcome. Thanks.
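
A side note on comparing raw digests in C, offered as a minimal sketch rather than as part of the original program: string functions such as `strncmp` stop at the first zero byte that both inputs share, so two digests that differ only after such a byte would be reported as equal, whereas `memcmp` always compares every byte. The helper names and the `DIGEST_LEN` constant below are illustrative assumptions, not OpenSSL identifiers.

```c
#include <string.h>

#define DIGEST_LEN 16  /* an MD5 digest is 16 bytes long (hypothetical constant name) */

/* Returns 1 when the two raw digests are byte-for-byte identical, else 0.
   memcmp looks at all DIGEST_LEN bytes, including zero bytes. */
int digests_equal(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, DIGEST_LEN) == 0;
}

/* The same check written with strncmp, for contrast only: the comparison
   ends at the first zero byte both inputs share, so any difference after
   that byte goes unnoticed. */
int digests_equal_strncmp(const unsigned char *a, const unsigned char *b)
{
    return strncmp((const char *)a, (const char *)b, DIGEST_LEN) == 0;
}
```

With two buffers that both start with a zero byte and differ in the second byte, `digests_equal` reports a difference while `digests_equal_strncmp` reports a match.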
Updated 21-Aug-12 10:38am
Comments
OriginalGriff 21-Aug-12 6:38am    
And what have you tried in C#?
BrianHamilton 21-Aug-12 7:09am    
Well, the code below is the umpteenth version:
static void Main(string[] args)
{
    System.IO.StreamReader myFile1 = new System.IO.StreamReader("FileA.txt");
    System.IO.StreamReader myFile2 = new System.IO.StreamReader("FileB.txt");

    string myString1 = myFile1.ReadToEnd();
    string myString2 = myFile2.ReadToEnd();

    using (MD5 md5Hash = MD5.Create())
    {
        string hash1 = GetMd5Hash(md5Hash, myString1);
        string hash2 = GetMd5Hash(md5Hash, myString2);

        Console.WriteLine("The MD5 hash of " + myString1 + " is: " + hash1 + ".");
        Console.WriteLine("The MD5 hash of " + myString2 + " is: " + hash2 + ".");
    }
}

static string GetMd5Hash(MD5 md5Hash, string input)
{
    byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
    StringBuilder sBuilder = new StringBuilder();

    // Format each digest byte as two lowercase hex digits
    for (int i = 0; i < data.Length; i++)
    {
        sBuilder.Append(data[i].ToString("x2"));
    }
    return sBuilder.ToString();
}

You don't need all this code, nor do you even need to read the files yourself and pass the byte arrays in. You can just use the Stream variant of the ComputeHash methods to do this:
static void Main(string[] args)
{
    byte[] file1Hash = GetMd5HashOfFile(filepath1);
    byte[] file2Hash = GetMd5HashOfFile(filepath2);
}

static byte[] GetMd5HashOfFile(string filepath)
{
    using(MD5CryptoServiceProvider hashProvider = new MD5CryptoServiceProvider())
    {
        using(FileStream sr = new FileStream(filepath, FileMode.Open, FileAccess.Read))
        {
            return hashProvider.ComputeHash(sr);
        }
    }
}
 
Comments
BrianHamilton 21-Aug-12 10:31am    
This code works as well when I convert the byte array into a string, but I'm still getting different hash values for the files.
BrianHamilton 21-Aug-12 12:18pm    
Above solution works fine - just used binary files to get the hash collisions working.
You are not reading the files the same way.
Your C code converts the hex strings to bytes while reading the file.
In C# you use Encoding.UTF8.GetBytes(input), which is not the same: it hashes the characters of the hex text rather than the bytes they encode.
Here is the code:
C#
static void Main(string[] args)
{
    byte[] myBytes1 = ReadHexFile("FileA.txt");
    byte[] myBytes2 = ReadHexFile("FileB.txt");

    using (MD5 md5Hash = MD5.Create())
    {
        string hash1 = GetMd5Hash(md5Hash, myBytes1);
        string hash2 = GetMd5Hash(md5Hash, myBytes2);
        Console.WriteLine("The MD5 hash of " + string.Join("", myBytes1.Select(b => b.ToString("x2"))) + " is: " + hash1 + ".");
        Console.WriteLine("The MD5 hash of " + string.Join("", myBytes2.Select(b => b.ToString("x2"))) + " is: " + hash2 + ".");
    }
    Console.ReadLine();
}

static string GetMd5Hash(MD5 md5Hash, byte[] input)
{
    byte[] data = md5Hash.ComputeHash(input);
    return string.Join("", data.Select(b => b.ToString("x2")));
}

static byte[] ReadHexFile(string path)
{
    string hexString;
    // Dispose the reader (and its underlying file handle) once the text is read
    using (System.IO.StreamReader sr1 = new System.IO.StreamReader(path))
    {
        hexString = sr1.ReadToEnd();
    }

    // Strip spaces, new-lines and any other non-word characters, leaving only the hex digits
    hexString = System.Text.RegularExpressions.Regex.Replace(hexString, @"\W", "", System.Text.RegularExpressions.RegexOptions.Singleline);
    int ix = 0;

    byte[] myBytes1 = new byte[hexString.Length / 2];
    while (ix < hexString.Length)
    {
        string hexByte = hexString.Substring(ix, 2);
        myBytes1[ix / 2] = byte.Parse(hexByte, System.Globalization.NumberStyles.HexNumber);
        ix += 2;
    }

    return myBytes1;
}
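
The difference described above can be illustrated in C with a small sketch. It assumes ASCII text, and `text_as_bytes` and `hex_as_bytes` are hypothetical helper names, not part of either posted program. The first keeps the characters of the hex text as they are stored, which is roughly what `Encoding.UTF8.GetBytes` yields for such a string; the second decodes digit pairs in the same way as the `fscanf` loop in the question.

```c
#include <stdio.h>
#include <string.h>

/* Copies the characters of `text` unchanged (one byte per ASCII character).
   Hashing this buffer hashes the hex text itself, not the data it encodes.
   Returns the number of bytes copied (at most cap). */
size_t text_as_bytes(const char *text, unsigned char *out, size_t cap)
{
    size_t n = strlen(text);
    if (n > cap)
        n = cap;
    memcpy(out, text, n);
    return n;
}

/* Decodes pairs of hex digits into bytes, skipping whitespace between groups.
   Stops at the end of the text, at the first non-hex character, or when
   `out` is full. Returns the number of bytes decoded. */
size_t hex_as_bytes(const char *text, unsigned char *out, size_t cap)
{
    size_t count = 0;
    unsigned int value;
    int used;  /* characters consumed by the last conversion, set by %n */

    while (count < cap && sscanf(text, "%2x%n", &value, &used) == 1) {
        out[count++] = (unsigned char)(value & 0xFF);
        text += used;
    }
    return count;
}
```

For the text `d131`, `text_as_bytes` yields the four bytes 64 31 33 31 while `hex_as_bytes` yields the two bytes d1 31, so MD5 is fed different input in each case.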
 
Comments
BrianHamilton 21-Aug-12 9:30am    
The program works in terms of getting hash values, but they are not identical, which they are in the C program. I have changed my text files to be exact copies of the wiki inputs below, and now the program crashes:
http://en.wikipedia.org/wiki/MD5

FileA:
d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89
55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b
d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0
e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70

FileB:
d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89
55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b
d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0
e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70
sjelen 21-Aug-12 12:33pm    
I have used the same example, just removed all spaces and new-lines from file.
I do get the same hash for both files.
See updated code, added regex to remove all whitespace from hex string.
BrianHamilton 21-Aug-12 16:06pm    
Works perfectly...Thanks

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


