Hi all, I've just been asked to port some legacy C++ code to C#. The program is pretty simple: it parses some data from a text file and writes it out to another. All works OK, but as part of my testing whenever I do stuff like this I use the file compare function in devenv.exe (I think it used to be called WinDiff):

devenv /diff file1 file2 

and all is good. But if I do a byte-by-byte comparison it fails, as the files are different sizes (the C# one is 128 bytes larger).
Any ideas, guys? It's not a show stopper, I'm just curious.

What I have tried:

Googling and posting on here!
Updated 21-Jun-17 7:25am
Comments
F-ES Sitecore 21-Jun-17 10:47am    
Do it on a small file, then examine the bytes to see what the difference is. It could be a difference in line endings: maybe the C++ writes \n and the C# writes \r\n. We can't see these files and we don't know what the code is doing, so there's not a lot of help we can give.
[no name] 21-Jun-17 10:48am    
Unicode vs. ANSI vs. MBCS?
Peter_in_2780 21-Jun-17 11:32am    
Old DOS-style file with an EOF mark (0x1A, ^Z) and padding?
pkfox 21-Jun-17 13:22pm    
Good thought, I'll give it a go.
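
The suggestions in the comments above (line endings, encoding, EOF padding) can all be checked the same way: compare the raw bytes and look at where the files first diverge. The following is only a minimal sketch of that idea, not anything from the original program; the class name FileDiff and the file names in the usage line are made up for illustration.

```csharp
using System;
using System.IO;

public static class FileDiff
{
    // Returns the offset of the first differing byte within the shared
    // length, or -1 if the shorter array is a prefix of the longer one
    // (which includes the case where both are identical).
    public static int FirstDifference(byte[] a, byte[] b)
    {
        int common = Math.Min(a.Length, b.Length);
        for (int i = 0; i < common; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return -1;
    }

    // Prints both file sizes and a short hex dump starting at the first mismatch.
    public static void Report(string pathA, string pathB)
    {
        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);
        Console.WriteLine($"{pathA}: {a.Length} bytes, {pathB}: {b.Length} bytes");

        int at = FirstDifference(a, b);
        if (at < 0)
        {
            Console.WriteLine(a.Length == b.Length
                ? "Files are identical."
                : "One file is a prefix of the other; the extra bytes are at the end.");
            return;
        }

        // Dump at most 16 bytes from each file, starting at the mismatch.
        int count = Math.Min(16, Math.Min(a.Length, b.Length) - at);
        Console.WriteLine($"First difference at offset {at}");
        Console.WriteLine("A: " + BitConverter.ToString(a, at, count));
        Console.WriteLine("B: " + BitConverter.ToString(b, at, count));
    }
}
```

Calling FileDiff.Report("old.txt", "new.txt") (hypothetical file names) on a small input shows at a glance whether the extra bytes are 0D bytes from \r\n, 00 bytes from a two-byte encoding, 20 bytes from space padding, or a trailing 1A.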

1 solution

Open your files in the Notepad++ editor, and look at the menu option "Encoding".
I think your C++ file is in ANSI and your C# file is in Unicode. In .NET, Encoding.Unicode means UTF-16, which uses 2 bytes per character for most text, whereas ANSI and ASCII use 1. See "Unicode" on Wikipedia and "C# in Depth: Unicode and .NET".
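In C#, use Encoding.ASCII to write a file in ASCII format.
You can also convert existing bytes to ASCII:
C#
using System.Text;
Encoding ascii = Encoding.ASCII;
Encoding unicode = Encoding.Unicode;
// Sample UTF-16 input standing in for the real data; non-ASCII characters become '?'.
byte[] unicodeBytes = unicode.GetBytes("MYCOMPANYNAME20170622");
byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);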
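
To see how much the encoding alone changes a file's size, one option is to write the same text with different encodings and compare the resulting lengths. This is a rough sketch rather than a fix for the program in the question; the class and method names (EncodingSizes, WrittenSize) are invented for the example, and it writes to a temporary file that it deletes afterwards.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingSizes
{
    // Writes the text to a temporary file with the given encoding and
    // returns the size of that file in bytes (including any byte order mark).
    public static long WrittenSize(string text, Encoding encoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text, encoding);
            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

For plain ASCII text, EncodingSizes.WrittenSize(text, Encoding.Unicode) comes out at roughly twice EncodingSizes.WrittenSize(text, Encoding.ASCII), plus a couple of bytes for the byte order mark. An encoding difference therefore usually scales with the size of the file, rather than adding a fixed number of bytes.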
 
Comments
pkfox 21-Jun-17 13:40pm    
OK, I'll give it a go, thanks.
RickZeeland 21-Jun-17 13:42pm    
Good luck. By the way, Notepad++ has a toolbar button, "Show All Characters", which is useful for seeing non-printing characters such as line endings.
[no name] 21-Jun-17 14:09pm    
A 5, great link: "C# in Depth: Unicode and .NET".
pkfox 22-Jun-17 5:28am    
Hi Rick, enabling Show All Characters with both files loaded showed me that the header record in my file was somehow padded to the width of the rest of the file. The header in the old file is simply MYCOMPANYNAME20170622, but in mine it was written as "MYCOMPANYNAME20170622" followed by 128 spaces. Thanks for your help.
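
The thread does not say how the header came to be padded, so the following is only an illustration of the effect and of one possible remedy. It assumes the padding is trailing spaces in an otherwise ASCII header; the class name HeaderPadding and both method names are hypothetical.

```csharp
using System;
using System.Text;

public static class HeaderPadding
{
    // Removes trailing spaces only; leading and embedded spaces are kept.
    public static string TrimHeader(string header)
    {
        return header.TrimEnd(' ');
    }

    // Number of bytes that trailing spaces add when the header is written as ASCII.
    public static int ExtraBytes(string header)
    {
        return Encoding.ASCII.GetByteCount(header)
             - Encoding.ASCII.GetByteCount(TrimHeader(header));
    }
}
```

As a usage example, a header reproduced with "MYCOMPANYNAME20170622".PadRight(21 + 128) carries 128 trailing spaces, and ExtraBytes reports exactly that surplus, which matches the size difference described in the question. Note that PadRight is used here only to recreate the symptom; the original code may have padded the header some other way, for instance through a fixed-width format or buffer.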

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


