C#: Generating Hash String

26 Jun 2019 · CPOL · 3 min read
Create hash and compare

Introduction

Hashing transforms a value into a usually shorter, fixed-length value that represents the original. A few days ago, we had to use hash comparison to sync data between two systems via an API (admittedly, syncing data through an API was not the most efficient approach, but we had no way to change anything at the source end).

Background

What we were doing:

  1. Creating a hash string at our end after deserializing the JSON into an object
  2. Looking up the existing DB row by its unique identifier (primary key) and comparing hash strings
    1. If no row was found for the unique identifier, adding a new row to the DB
    2. If the hash strings differed, updating the existing row with the new values
  3. Running a few other sync logging processes
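
The decision in step 2 can be sketched as follows. This is a minimal illustration, not the code from our project: `SyncAction`, `SyncPlanner`, and the dictionary standing in for the DB table are hypothetical names, and the "same hash, do nothing" branch is implied by the steps rather than stated in them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names, for illustration only.
public enum SyncAction { Insert, Update, Skip }

public static class SyncPlanner
{
    // storedHashes maps a primary key to the hash string saved with that row.
    public static SyncAction Decide(
        IReadOnlyDictionary<long, string> storedHashes, long id, string incomingHash)
    {
        string storedHash;
        if (!storedHashes.TryGetValue(id, out storedHash))
        {
            return SyncAction.Insert;   // no row for this primary key
        }

        // Hash strings are lowercase hex, so an ordinal comparison is enough.
        return string.Equals(storedHash, incomingHash, StringComparison.Ordinal)
            ? SyncAction.Skip           // same hash: values unchanged
            : SyncAction.Update;        // different hash: values changed
    }
}
```

For a stored row `{ 1 -> "aa" }`, `Decide(stored, 1, "bb")` yields `Update`, `Decide(stored, 1, "aa")` yields `Skip`, and `Decide(stored, 2, "aa")` yields `Insert`.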

Everything was working as expected until we refactored the existing code (changed the names of a few models and properties). The hash string was being generated from the entire object (its structure and all of its values) rather than from specific properties, so the hashes changed even though the data had not. The way we were creating the hash string was actually wrong. Let's check a few hash string examples.

Hash Helper Class

This is the utility class that manages hash-related operations.

C#
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using System.Security.Cryptography;
using System.Text;

public class HashHelper
{
    /// <summary>
    /// Serializes a value to bytes with BinaryFormatter.
    /// Custom classes must be marked [Serializable]; to skip a member, see
    /// https://stackoverflow.com/questions/33489930/
    /// ignore-non-serialized-property-in-binaryformatter-serialization
    /// Note: BinaryFormatter is marked obsolete in .NET 5 and later.
    /// </summary>
    /// <param name="value">Value to serialize; null is serialized as the string "null"</param>
    /// <returns>The serialized bytes</returns>
    public byte[] Byte(object value)
    {
        /*https://stackoverflow.com/questions/1446547/
          how-to-convert-an-object-to-a-byte-array-in-c-sharp*/
        using (var ms = new MemoryStream())
        {
            BinaryFormatter bf = new BinaryFormatter();
            bf.Serialize(ms, value ?? "null");
            return ms.ToArray();
        }
    }
    
    public byte[] Hash(byte[] value)
    {
        /*https://support.microsoft.com/en-za/help/307020/
          how-to-compute-and-compare-hash-values-by-using-visual-cs*/
        /*https://andrewlock.net/why-is-string-gethashcode-
          different-each-time-i-run-my-program-in-net-core*/
        using (MD5 md5 = MD5.Create())    /*dispose the algorithm instance after use*/
        {
            return md5.ComputeHash(value);
        }
    }

    public byte[] Combine(params byte[][] values)
    {
        /*https://stackoverflow.com/questions/415291/
          best-way-to-combine-two-or-more-byte-arrays-in-c-sharp*/
        byte[] rv = new byte[values.Sum(a => a.Length)];
        int offset = 0;
        foreach (byte[] array in values)
        {
            System.Buffer.BlockCopy(array, 0, rv, offset, array.Length);
            offset += array.Length;
        }
        return rv;
    }

    public string String(byte[] hash)
    {
        /*https://stackoverflow.com/questions/1300890/
          md5-hash-with-salt-for-keeping-password-in-db-in-c-sharp*/
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hash.Length; i++)
        {
            sb.Append(hash[i].ToString("x2"));     /*keep lowercase "x2"; "X2" gives uppercase hex and a different string*/
        }
        var result = sb.ToString();
        return result;
    }

    public byte[] Hash(params object[] values)
    {
        byte[][] bytes = new byte[values.Length][];
        for(int i=0; i < values.Length; i++)
        {
            bytes[i] = Byte(values[i]);
        }
        byte[] combined = Combine(bytes);
        byte[] combinedHash = Hash(combined);
        return combinedHash;
    }

    /*https://stackoverflow.com/questions/5868438/c-sharp-generate-a-random-md5-hash*/
    public string HashString(string value, Encoding encoding = null)
    {
        if (encoding == null)
        {
            /*ASCII turns every non-ASCII character into '?'; pass Encoding.UTF8 for such text*/
            encoding = Encoding.ASCII;
        }
        byte[] bytes = encoding.GetBytes(value);
        byte[] hash = Hash(bytes);
        string result = String(hash);
        return result;
    }

    public string HashString(params object[] values)
    {
        var hash = Hash(values);    /*combined MD5 hash of all the values, in order*/
        var value = String(hash);
        return value;
    }
}

Consideration

  • MD5 is used for hashing: Hash(byte[] value)
  • Any null value is treated as the string 'null', so a null and the literal string "null" produce the same hash: Byte(object value)

Object to Hash String Process

  1. Create bytes of the object: Byte(object value)
  2. Create hash bytes from the object bytes: Hash(byte[] value)
  3. Create a string from the hash bytes: String(byte[] hash)

A Combined Hash of Multiple Objects

  1. Create bytes of each object: Byte(object value)
  2. Concatenate the byte arrays: Combine(params byte[][] values)
  3. Create hash bytes from the concatenated bytes: Hash(byte[] value)
  4. Create a string from the hash bytes: String(byte[] hash)

Alternatively:

  1. Create the combined hash bytes: Hash(params object[] values)
  2. Create a string from the hash bytes: String(byte[] hash)
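
The combine-then-hash steps can be tried in isolation with the minimal sketch below. It assumes UTF-8 string bytes in place of the helper's BinaryFormatter output, so its hash values will not match the ones shown in this article, and `CombineDemo` is a hypothetical name.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical demo class; it uses UTF-8 bytes instead of BinaryFormatter output.
public static class CombineDemo
{
    // Concatenates the byte arrays in the order given.
    public static byte[] Combine(params byte[][] values)
    {
        byte[] result = new byte[values.Sum(a => a.Length)];
        int offset = 0;
        foreach (byte[] array in values)
        {
            Buffer.BlockCopy(array, 0, result, offset, array.Length);
            offset += array.Length;
        }
        return result;
    }

    // MD5 over the concatenated bytes, returned as lowercase hex.
    public static string HashString(params string[] values)
    {
        byte[][] parts = values.Select(v => Encoding.UTF8.GetBytes(v)).ToArray();
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Combine(parts));
            return string.Concat(hash.Select(b => b.ToString("x2")));
        }
    }
}
```

With raw string bytes, `CombineDemo.HashString("a", "bc")` and `CombineDemo.HashString("abc")` both return `900150983cd24fb0d6963f7d28e17f72`, because plain concatenation keeps no boundary between the parts. If that boundary matters for your data, put a separator between the values before hashing.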

Methods We Are Going to Use More Frequently

  • Create a hash string of any string: HashString(string value, Encoding encoding = null)
  • Create a hash string of a single object, or a combined hash string of a group of objects: HashString(params object[] values)

Hash of Entire Object

The data class or model:

C#
using System;

[Serializable]
class PeopleModel
{
    public long? Id { get; set; }
    public string Name { get; set; }
    public bool? IsActive { get; set; }
    public DateTime? CreatedDateTime { get; set; }
}

Creating a hash of the model:

C#
/*9105d073ad276d742c56a049abd4ddef
 * will change if we change 
 *      1. class name
 *      2. property name
 *      3. property data type
 *      4. add/remove new property etc
 */
var peopleModelHashString = hashHelper.HashString(new PeopleModel()
{
    Id = 1,
    Name = "Anders Hejlsberg",
    IsActive = true,
    CreatedDateTime = new DateTime(1960, 12, 2)
});

Important to Remember

This hash depends on both the object structure and the assigned values. Even if we assign the same values to the properties, the generated hash will be different after any of these changes to the model:

  • Class/Model name change
  • Property name change
  • Namespace name change
  • Property count change (adding or removing a property)

And in a development environment, refactoring can take place at any time.
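
To see this effect without BinaryFormatter, the sketch below serializes two classes with System.Text.Json and hashes the resulting text. That serializer is an assumption made for the demo; the helper above does not use it, and `PersonV1`, `PersonV2`, and `StructureDemo` are hypothetical names. It demonstrates only the property-rename case: the property names end up in the serialized output, so a rename changes the hash even though the value is identical.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical models: the same data before and after a property rename.
public class PersonV1 { public string Name { get; set; } }
public class PersonV2 { public string FullName { get; set; } }

public static class StructureDemo
{
    // Hashes the JSON text of the whole object, property names included.
    public static string HashOf<T>(T value)
    {
        string json = JsonSerializer.Serialize(value);
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(json));
            return string.Concat(hash.Select(b => b.ToString("x2")));
        }
    }
}
```

`StructureDemo.HashOf(new PersonV1 { Name = "Anders Hejlsberg" })` and `StructureDemo.HashOf(new PersonV2 { FullName = "Anders Hejlsberg" })` return different strings although the stored value is the same.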

Hash of Data Values

Let's make a hash using only values. First, create an interface IHash:

C#
public interface IHash
{
    string HashString();
}

Then implement IHash on the model and call the hash helper inside the HashString() method:

C#
class People : IHash
{
    public long? Id { get; set; }         /*unique identifier; do not include it
                                            in the hash calculation*/
    public string Name { get; set; }
    public bool? IsActive { get; set; }
    public DateTime? CreatedDateTime { get; set; }

    public string HashString()
    {
        /*add more non-constant properties as needed*/
        var value = new HashHelper().HashString(Name, IsActive, CreatedDateTime);
        return value;
    }
}

This way, the model structure takes no part in the hash generation process; only the selected property values (Name, IsActive, CreatedDateTime) are considered.

The hash will remain the same until a new value is set on one of those properties. Any structural change (name change, property add/remove, etc.) to the model will not affect the hash string.
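
On runtimes where BinaryFormatter is not available, the same values-only idea can be sketched by formatting each value as culture-independent text before hashing. This is an alternative built on assumptions, not the helper above: `ValueHash` is a hypothetical name, the U+001F separator is an arbitrary choice, and the resulting hashes will not match the ones shown in this article.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical alternative to HashHelper.HashString(params object[]).
public static class ValueHash
{
    // Formats one value as culture-independent text; null becomes "null",
    // mirroring the convention used by the helper.
    private static string Format(object value)
    {
        if (value == null) return "null";
        if (value is DateTime)
            return ((DateTime)value).ToString("O", CultureInfo.InvariantCulture);
        if (value is IFormattable)
            return ((IFormattable)value).ToString(null, CultureInfo.InvariantCulture);
        return value.ToString();
    }

    public static string HashString(params object[] values)
    {
        // The U+001F separator keeps neighbouring values from running together.
        string joined = string.Join("\u001f", values.Select(v => Format(v)));
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(joined));
            return string.Concat(hash.Select(b => b.ToString("x2")));
        }
    }
}
```

Inside `People.HashString()`, the call would become `ValueHash.HashString(Name, IsActive, CreatedDateTime)`. Renaming the class or its properties still leaves the hash unchanged, because only the formatted values are hashed.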

Hash Result

C#
/*constant: 3953fbec5b81ccca72c98655c0c4b069*/
var people = new People()
{
    Id = 1,
    Name = "Dennis Ritchie",
    IsActive = false,
    CreatedDateTime = new DateTime(1941, 9, 9)
};
string hashString = people.HashString();

Other Tests

It works fine with null property values:

C#
string hashString;
/*constant: 47ccecfc14f9ed9eff5de591b8614077*/
var people = new People();
hashString = people.HashString();

We cannot create a hash of the entire People object, because the class is not marked [Serializable]:

C#
var hashHelper = new HashHelper(); 
/*throws SerializationException because People is not marked [Serializable]*/
//var peopleHashString = hashHelper.HashString(people);

BONUS: String Hash

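It is quite common to need the hash of a plain string, for example as a checksum or lookup key. So here we have it. Note that a plain MD5 hash is not a safe way to store passwords.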

C#
/*constant: e6fb7af54c39f39507c28a86ad98a1fd*/
string name = "Dipon Roy";
string value = new HashHelper().HashString(name);
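
One thing to watch with the default Encoding.ASCII: characters outside the ASCII range are encoded as '?', so different strings can end up with the same hash. The self-contained sketch below shows the effect; `StringHashDemo` is a hypothetical name and not part of the helper.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical demo class: MD5 hex string of a string under a given encoding.
public static class StringHashDemo
{
    public static string HashString(string value, Encoding encoding)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(encoding.GetBytes(value));
            return string.Concat(hash.Select(b => b.ToString("x2")));
        }
    }
}
```

With `Encoding.ASCII`, `"caf\u00e9"` and `"caf?"` hash to the same string, while with `Encoding.UTF8` they differ. For pure ASCII input such as "Dipon Roy", both encodings give the same bytes and therefore the same hash.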

Conclusion

  • If we only need to compare values, or a specific subset of values, Hash of Data Values is the best option.
  • But if we need to compare object structure and values together, go for Hash of Entire Object.

Limitations

I haven't considered every worst-case scenario, so the code may throw unexpected errors for untested inputs. If you find any, just let me know.

The sample code is attached as a Visual Studio 2017 console application.

History

  • 26th June, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

