Hi,

Why do a text, DOC, or PDF file with the same contents produce different hash results? In my application, I want to compute the hash value of files that have the same contents but different file extensions, using the MD5 algorithm. Although the contents are the same, MD5 produces a different hash code for each file, whereas I expected the same hash code. FYI, I am reading the files with BinaryReader. I also tried StreamReader, but the result is the same.

I would appreciate your response.
Comments
Kenneth Haugland 2-Aug-12 9:59am    
They may store the same values differently. I can take a picture of the document, but it will not have the same hash code as the text document :)

1 solution

Because they don't have the same contents.
They may contain the same text, but the text is not the only information in a DOC or PDF file, whereas it is the only information in a TXT file. DOC and PDF files (and most other file formats) add "packing" information such as font, size, boldness, and position. A hash is computed over every byte in the file, so that extra data changes the result even when the visible text is identical.
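As a rough illustration of the point above, here is a minimal Python sketch (the question uses .NET, so treat this as language-neutral pseudocode that happens to run). The `<doc ...>` wrapper is a made-up stand-in for the formatting data a real DOC or PDF carries, and `normalized_text_hash` is a hypothetical helper, not part of any library. It assumes you have already extracted the plain text from each file by some other means.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of raw bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


def normalized_text_hash(extracted_text: str) -> str:
    """Hash only the text, after collapsing whitespace differences.

    Hypothetical helper: assumes the text was already extracted
    from the DOC/PDF/TXT file by a format-aware tool.
    """
    normalized = " ".join(extracted_text.split())
    return md5_hex(normalized.encode("utf-8"))


text = "Hello, world"

# A plain-text file stores only the encoded characters.
txt_bytes = text.encode("utf-8")

# Made-up stand-in for a richer format: same text plus formatting data.
wrapped_bytes = b"<doc font='Arial' size='12'>" + txt_bytes + b"</doc>"

# Hashing the raw bytes compares the whole file, not just the text.
print(md5_hex(txt_bytes) == md5_hex(wrapped_bytes))  # False

# Hashing the extracted, normalized text compares only the text.
print(normalized_text_hash("Hello,  world\r\n") == normalized_text_hash(text))  # True
```

So if the goal is to detect "same text, different format", hash the extracted text rather than the file bytes. Note that even two TXT files with the same text can hash differently if they use different encodings or line endings.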
 
 
Comments
tiggerc 3-Aug-12 2:53am    
Well answered that programmer.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


