Managed C++/CLI Discussion Boards

Re: Questions about reading binary files

GentooGuy, 29-Sep-07 8:05

Okay thanks for the advice.
Currently, I'm having another small (I hope) problem:

#include "stdafx.h"
#include <iostream>
#include <fstream>

using namespace std;
ifstream::pos_type size;
char * memblock;

void swapByteOrder();
int readFile(char *filename);
void printFile(int nr); // just for diag purposes only

int main ()
{
  if(!readFile("y:\\EXIF\\sanyo-vpcg250.jpg"))
  {
    cout <<"Some error occurred while opening the file"<< endl;
    return 1;
  }
  printFile(10);
  cout << endl;
  swapByteOrder();
  printFile(10);
  return 0;
}

int readFile(char *filename)
{
  ifstream file (filename, ios::in|ios::binary);
  if (file.is_open())
  {
    file.seekg(0, ios_base::end);
    size = file.tellg();
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();

    cout << "the complete file content is in memory\n";

    cout << "Size : "<< size  << endl;
    for(int i=0;i<100;i++)
    {
      char x = memblock[i];
      cout << hex << (int)memblock[i]<<endl;
    }

    delete[] memblock;
  }
  else cout << "Unable to open file";
  return -1;
}

void swapByteOrder()
{
  long max = size;
  char temp;
  for(int i=0 ;i<max-2; i+=2)
  {
    temp=memblock[i];
    memblock[i]=memblock[i+1];
    memblock[i+1]=temp;
  }
}

void printFile(int nr)
{
  for(int i=0;i<nr ;i++)
  {
    cout << hex << memblock[i] << endl;
  }
}

The swapByteOrder function hits an access violation when it reaches i==3992. This is strange, because it should be able to run to 62096 (the length of the file, as indicated by size).

What's going wrong here?

Re: Questions about reading binary files

Mark Salsbery, 29-Sep-07 8:22

I'm surprised it gets that far, since you delete memblock in readFile() :)

I'm curious....why are you reading bytes from a jpeg file as ints
cout << hex << (int)memblock[i]<<endl;
and why would you be swapping byte order? Are you trying to make the jpeg unreadable?

Actually, this whole loop doesn't make sense

for(int i=0;i<100;i++)
{
    char x = memblock[i];
    cout << hex << (int)memblock[i]<<endl;
}

You're indexing the array by bytes but casting to int (4 bytes)???

Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy, 29-Sep-07 8:52

Well, I got confused too. I'm a Java developer (BSc in CS), but I'm getting quite stuck on this one.
Manual GC isn't exactly my cup of tea ;)

Well, I want to retrieve some EXIF information from the file, and when not swapping the bytes (hey, I do NOT write the array to disk) I get the same output as when using od (UNIX tool for displaying files).

So I thought, I had a byte-order related problem.
od output:

0000000      d8ff    e1ff    8711    7845    6669    0000    4949    002a

When running my own app, I found a ff first, then the d8, an ff, the e1, the 11, the 87. etc...

That's my reason to swap these bytes.

Re: Questions about reading binary files

Luc Pattyn, 29-Sep-07 9:15

Hi,

A typical JPEG hex dump starts like this:
000000 FF D8 FF E0 00 10 4A 46 49 46 00 01 02 01 00 87
000010 00 87 00 00 FF ED 08 9E 50 68 6F 74 6F 73 68 6F

i.e. the very first byte is FF.

If you interpret that as a number of 16-bit words (as your od command seems to do)
then you would get D8FF E0FF etc. but that does not mean this is how you should look at it.
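
To make the word grouping concrete, here is a minimal C++ sketch that formats the same bytes two ways. The helper names dumpBytes and dumpWordsLE are made up for illustration, and the word view assumes the little-endian grouping that od -x produces on an Intel machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: format bytes one at a time, in file order.
std::string dumpBytes(const std::vector<std::uint8_t>& data) {
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i != 0) out << ' ';
        out << std::setw(2) << static_cast<unsigned>(data[i]);
    }
    return out.str();
}

// Hypothetical helper: format bytes as 16-bit little-endian words.
// The second byte of each pair becomes the high half of the word,
// so every pair looks swapped on screen although the file is unchanged.
std::string dumpWordsLE(const std::vector<std::uint8_t>& data) {
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i + 1 < data.size(); i += 2) {
        if (i != 0) out << ' ';
        const unsigned word = static_cast<unsigned>(data[i]) |
                              (static_cast<unsigned>(data[i + 1]) << 8);
        out << std::setw(4) << word;
    }
    return out.str();
}
```

Fed the bytes FF D8 FF E1 11 87, dumpBytes gives "ff d8 ff e1 11 87" while dumpWordsLE gives "d8ff e1ff 8711", which matches the start of the od listing quoted above.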

In fact JPEG coding is byte oriented, each FF XX pair of bytes marks the start of something
and may be preceded by an arbitrary number of FF bytes.

I suggest you:
- start by reading the JPEG standard, you can find it on the web;
- look at JPEG files with an unbiased tool, one that shows bytes, not larger integers.

BTW: if you read a JPEG file with Image.FromFile() the Image class will offer access
to a lot of metadata as well (e.g. GetPropertyItem() method)

:)

Luc Pattyn [Forum Guidelines] [My Articles]

this week's tips:
- make Visual Studio display line numbers: Tools/Options/TextEditor/...
- show exceptions with ToString() to see all information
- before you ask a question here, search CodeProject, then Google

Re: Questions about reading binary files

GentooGuy, 29-Sep-07 9:20

Thanks for the info.
But the Image class is .NET based, and I just want plain C++ without MS-specific stuff.

Re: Questions about reading binary files

Mark Salsbery, 29-Sep-07 9:16

It's your binary viewer utility that's swapping the bytes.

If you go through and swap bytes, you won't have a JPEG anymore.

If you want to see the actual bytes in order, change your byte viewer loop to

for(int i=0;i<100;i++)
{
    cout << hex << (int)(unsigned char)memblock[i] << endl;
}
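
Why the extra cast matters: a rough sketch, assuming a platform where plain char is signed (as it typically is with Visual C++ on x86) and int is 32 bits. The function names hexSigned and hexUnsigned are hypothetical; they only isolate the two conversions.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: converts a signed 8-bit value straight to int.
// A byte such as 0xFF is then the value -1, and the sign is carried
// into the wider int before it is printed in hex.
std::string hexSigned(signed char c) {
    std::ostringstream out;
    out << std::hex << static_cast<int>(c);
    return out.str();
}

// Hypothetical helper: goes through unsigned char first, so the value
// stays in the range 0..255 and prints as at most two hex digits.
std::string hexUnsigned(char c) {
    std::ostringstream out;
    out << std::hex << static_cast<int>(static_cast<unsigned char>(c));
    return out.str();
}
```

Under those assumptions the byte 0xFF comes out as "ffffffff" from hexSigned and as "ff" from hexUnsigned.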

And for your non-GC related issue - you don't want to use your array after you delete it :)
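
One possible way to avoid the manual delete altogether is to let a std::vector own the buffer. This is only a sketch of that idea, not the code from the original post: the signature (a bool result and an output vector) is an assumption chosen for illustration. Note also that the posted readFile returns -1 on every path, so the `!readFile(...)` check in main can never report a failure; the sketch returns true only when the whole file was read.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads the whole file into 'out'. Returns true only if every byte was read.
// The vector owns the memory, so the data stays valid after this function
// returns and is released automatically when the vector goes out of scope.
bool readFile(const std::string& filename, std::vector<unsigned char>& out) {
    std::ifstream file(filename, std::ios::in | std::ios::binary);
    if (!file.is_open()) return false;

    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();
    if (size < 0) return false;
    file.seekg(0, std::ios::beg);

    out.resize(static_cast<std::size_t>(size));
    if (size > 0) {
        file.read(reinterpret_cast<char*>(out.data()), size);
    }
    return static_cast<std::streamoff>(file.gcount()) == size;
}
```

Storing the bytes as unsigned char also means they can be printed or compared against values such as 0xFF without an extra cast at every use.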

Mark


Re: Questions about reading binary files

GentooGuy, 29-Sep-07 9:20

okay that's strange.

Thank you for the help, I really appreciate it :)

Re: Questions about reading binary files

GentooGuy, 29-Sep-07 10:27

it works fine now :)

thanks a lot!

Re: Questions about reading binary files

GentooGuy, 29-Sep-07 10:52

I'm sorry, but I still think they're swapped.
My reason for this is the fact that I'm looking up 0x9003, which is a tag in a JPEG file indicating the date of the picture.
When using 'od', I see this:

$ od -x file.jpg  | grep 9003
0000520      0004    0000    3230    3030    9003    0002    0014    0000

This is the only occurrence of '9003' in a file which does contain the information (so this must be the instance I'm looking for).
But, when running my program, and printing some lines, I get this output:

4
0
0
0
30
32
30
30
3
90
2
0
14
0

This is produced by the loop you've proposed. When comparing both outputs, I see this one has the bytes swapped when compared to od and (!) the exif standard. So I guess, od isn't wrong. Or am I indeed wrong?

Re: Questions about reading binary files [modified*2]

Mark Salsbery, 29-Sep-07 11:08

I think we're confusing two different issues here...

First, your "od" is lumping 2 byte pairs and is assuming little-endian
byte order so, as you can see from the sample listings you've posted,
each pair of bytes appears swapped in the od-generated listing.

Second, you have to parse your file properly, depending on the byte order.
exif has some kind of tag to indicate whether multi-byte integers are stored
in "motorola" or "intel" order. This doesn't mean you can just go through the
entire file and swap every pair of bytes. This means when you encounter
multibyte-integer data in the file, you may need to swap bytes to work with
the data on your platform.

You need to parse the file bytes following its type and format. I don't have the jpeg/jfif
format memorized but it's well documented all over the internet :)

Again, the only swapping going on here is by your "od" utility. In your code you simply
have the bytes in the same order they occur in the file.

Mark

*edit* LOL I really meant "sample", not "ample"....ample sounded snotty LMAO

Last modified: 17mins after originally posted --


Re: Questions about reading binary files [modified*2]

GentooGuy, 30-Sep-07 2:40

okay that sounds possible.

But when having a look at http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/EXIF.html
I can see the tags about the creation date are 0x9003 and 0x9004 (still have to decide which one to take).

When having a look at the jpg itself (which has a date and time of 01-01-1998) I see the strings about the date in a proper sequence.... but there's no 9003. Just a 0390 before it.

I've looked it up in the exif documentation, and I haven't found anything about inverting the bytes on such markers. What am I missing?

Re: Questions about reading binary files

Mark Salsbery, 30-Sep-07 6:31

GentooGuy wrote:
but there's no 9003. Just a 0390 before it.

Where are you seeing that? In your od results? If so, then that IS 9003
because od is swapping every pair of bytes.


Re: Questions about reading binary files

GentooGuy, 30-Sep-07 6:37

Nope, in the VS binary editor

Re: Questions about reading binary files

Mark Salsbery, 30-Sep-07 6:47

I don't know what to tell you....

There are only three possibilities here:

1) The file was written incorrectly (not following specs)
2) There's a tag somewhere that indicates the byte order and you need to use it
3) You're interpreting the binary bytes incorrectly.

Which is it? :)

Mark


Re: Questions about reading binary files

GentooGuy, 30-Sep-07 6:55

option 2.

I've read something about it yesterday, currently trying to find the document which described it.

Re: Questions about reading binary files

Mark Salsbery, 30-Sep-07 7:10

JPEG is big-endian.

EXIF follows TIFF specs which can be big or little endian.
This is usually determined by the first two bytes of the file:
"II" (0x49 0x49) for little endian, "MM" (0x4D 0x4D) for big endian.

For a file with "MM" byte order, tags (and all other multi-byte fields
of tags) will need to be swapped on Intel machines.

For the tag 0x9003, I would expect the following storage in the file:

Big endian: 0x90 0x03
Little endian: 0x03 0x90
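
As a rough illustration of using that marker, the sketch below picks the byte order from the "II"/"MM" bytes and then assembles 16-bit values accordingly. The names ByteOrder, detectByteOrder and readU16 are hypothetical. The caller has to supply the position of the marker: in the od listing posted earlier in this thread, the bytes 49 49 sit at offset 12, after the "Exif" identifier, rather than at the very start of the JPEG.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class ByteOrder { Little, Big };

// Interprets the two marker bytes at 'pos':
// "II" (0x49 0x49) means little-endian, "MM" (0x4D 0x4D) means big-endian.
ByteOrder detectByteOrder(const std::vector<std::uint8_t>& data, std::size_t pos) {
    if (pos + 1 >= data.size()) throw std::out_of_range("byte-order marker out of range");
    if (data[pos] == 0x49 && data[pos + 1] == 0x49) return ByteOrder::Little;
    if (data[pos] == 0x4D && data[pos + 1] == 0x4D) return ByteOrder::Big;
    throw std::runtime_error("unknown byte-order marker");
}

// Assembles a 16-bit value from two bytes with shifts, so the result
// depends only on the file's byte order, not on the CPU running the code.
std::uint16_t readU16(const std::vector<std::uint8_t>& data, std::size_t pos, ByteOrder order) {
    if (pos + 1 >= data.size()) throw std::out_of_range("read past end of buffer");
    const unsigned first = data[pos];
    const unsigned second = data[pos + 1];
    return static_cast<std::uint16_t>(order == ByteOrder::Big ? (first << 8) | second
                                                               : (second << 8) | first);
}
```

With this approach the bytes 0x03 0x90 read as 0x9003 under ByteOrder::Little, and no buffer-wide swapping is needed because each field is decoded only when it is read.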


Re: Questions about reading binary files

GentooGuy, 30-Sep-07 14:51

thanks, I've got it working :)

Re: Questions about reading binary files

Mark Salsbery, 29-Sep-07 11:11

BTW are you using Visual Studio? If so, open the jpeg file in the binary editor window.
It won't swap any bytes in the display.

File menu -> Open/File...
Select the file
Click the little drop arrow on the "Open" button and choose "Open with..."
Choose binary editor


Re: Questions about reading binary files

Mark Salsbery, 29-Sep-07 11:22

I looked up od

You should be using something like "-t x1" (or is it "-t xC") instead of "-x" on your command line since
you want single-byte hex integers, not double-byte integers.

Mark


Re: Questions about reading binary files

Luc Pattyn, 29-Sep-07 11:49

right! forget od in 16-bit mode.


Re: Questions about reading binary files

Luc Pattyn, 30-Sep-07 4:31

FYI:

I checked some JPEG documents and can confirm JPEG is "big-endian" which means multi-byte
quantities need their byte order reversed on a "little-endian" machine such as Intel's x86.

From the doc: "For parameters which are 2 bytes (16 bits) in length, the most significant
byte shall come first in the compressed data’s ordered sequence of bytes."

As already stated by Mark, this does NOT mean you should swap every pair of bytes; it does
mean if two bytes are to represent a 16-bit integer, then you should swap both bytes; if 4 bytes
represent a 32-bit integer, then you should swap all four bytes.

Since 2B and 4B integers will not always be properly aligned in memory, you will need
a method that interprets 2 bytes in the right order, wherever they are, as in the following
C# code snippet:

// unsigned !
private byte getByte() {
    return bytes[p++];
}

// unsigned and big-endian !
private ushort getShort() {
    ushort bHi=getByte();
    ushort bLo=getByte();
    return (ushort)((bHi<<8)+bLo);
}

where bytes[] holds the entire file content, and p is the "current position".
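
Since the original poster asked for plain C++, here is a possible port of the same idea. It is a sketch only: the class name ByteReader, the bounds check, and the extra getLong method for 32-bit values are additions for illustration and are not part of the C# snippet above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical cursor over the file content; p_ is the "current position".
class ByteReader {
public:
    explicit ByteReader(std::vector<std::uint8_t> bytes)
        : bytes_(std::move(bytes)), p_(0) {}

    // Unsigned single byte; throws instead of reading past the end.
    std::uint8_t getByte() {
        if (p_ >= bytes_.size()) throw std::out_of_range("read past end of buffer");
        return bytes_[p_++];
    }

    // Unsigned and big-endian: the first byte read is the high byte.
    std::uint16_t getShort() {
        const unsigned hi = getByte();
        const unsigned lo = getByte();
        return static_cast<std::uint16_t>((hi << 8) | lo);
    }

    // Unsigned and big-endian 32-bit value, built from two 16-bit reads.
    std::uint32_t getLong() {
        const std::uint32_t hi = getShort();
        const std::uint32_t lo = getShort();
        return (hi << 16) | lo;
    }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t p_;
};
```

Each read goes into its own statement on purpose, so the high byte is always fetched before the low byte regardless of how the compiler orders subexpressions.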

:)


C++ Byte Level Code.. [modified]

spalanivel, 27-Sep-07 21:26

Hi,
What is "Byte Level Code" in C++? When and how is it used? Is there any tutorial on this topic? Thanks

-- modified at 9:25 Friday 28th September, 2007

Re: C++ Byte Level Code..

led mike, 28-Sep-07 5:19

spalanivel wrote:
What is "Byte Level Code" in C++??

Byte level code is a term mostly associated with Java. Java compilers do not produce "machine code", they produce a machine independent code that was often called "byte code". To execute the byte code you need an interpreter specific to the machine (processor / architecture) because the processor can only execute machine code.

The original intent of C/C++ is to produce machine code. If there are non machine code C++ compilers then they would produce something that might be called byte level code meaning that it needs an interpreter because the processor can't execute it.

Re: C++ Byte Level Code..

spalanivel, 2-Oct-07 5:24

Thanks Mike..

Simple ESC Key

Michael101, 27-Sep-07 18:18

Hey everyone,

I've got a C++ console application and it goes into a while loop that never ends except when the user presses the ESC key! Right now my loop is coded like this:

while (!kbhit())
{ code }

I have to change the kbhit bit to make it so it only ends the loop when I press the ESC key!

Thanks for your help in advance, I appreciate it!

Michael ;)
