I am trying to read the first 3 bytes of a file.

I am using Unicode.
My GCC 5.1 compiler is set to UTF-8.

While trying to read the bytes from the file one at a time (I can already read the file and get its text), I ran into the following error. I have not found enough documentation to understand it and work with it:

C++
const std::string SomeFile = "xyz.txt";
std::ifstream SomeStreamOfTheFile(SomeFile, std::ios_base::binary);

std::vector<char> SomeVectorBuffer = ( std::istreambuf_iterator<char>(SomeStreamOfTheFile),   std::istreambuf_iterator<char>() );


I get the following:

error: conversion from 'std::istreambuf_iterator<char, std::char_traits<char> >' to non-scalar type 'std::vector<char>' requested

I have been studying this but I do not yet know what it means.

Please help, with detailed explanations of your use of std::istreambuf_iterator.

Thank you.

What I have tried:

Researching std::istreambuf_iterator and still not understanding how to fix this error.

I know that Template class istreambuf_iterator provides input iterator semantics for streambufs. But how to fix the current error?

I am using char because it is a basic character type with a size of 1 byte, and I want to read directly from the file one byte at a time.

And, I can't do this:
C++
char <char, std::char_traits<char> > SomeVectorBuffer = ( std::istreambuf_iterator<char>(SomeStreamOfTheFile) );
because char is not a template.

So, if I try this:
C++
std::basic_string<char,char_traits<char>> SomeBuffer = ( std::istreambuf_iterator<char>(SomeStreamOfTheFile));
Even that does not work since I get the error:

conversion from 'std::istreambuf_iterator<char, std::char_traits<char> >' to non-scalar type 'std::__cxx11::basic_string<char>' requested

Some detailed help please.


Thank you.
Updated 20-Jul-22 0:22am
Comments
Richard MacCutchan 19-Jul-22 16:11pm    
You declared the read mode as binary, so you should be reading bytes, not chars. And an istreambuf_iterator cannot be converted to a vector in that way.
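To expand on why the conversion fails: in `= ( first, last )` the parentheses hold a single expression built with the comma operator, which evaluates to the second iterator only. The compiler is then asked to convert one istreambuf_iterator into a vector, and no such conversion exists. A minimal sketch of one way around it, passing both iterators as constructor arguments (the helper name `ReadAllBytes` is hypothetical, not from the original post):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file into a vector of raw bytes.
// Returns an empty vector if the file cannot be opened.
std::vector<char> ReadAllBytes(const std::string& fileName)
{
    std::ifstream stream(fileName, std::ios_base::binary);

    // Two iterators passed as constructor arguments select the
    // iterator-range constructor of std::vector. Writing
    // "= ( first, last )" instead forms a comma expression that
    // yields only "last", which is what triggers the error.
    std::istreambuf_iterator<char> first(stream);
    std::istreambuf_iterator<char> last; // default-constructed = end of stream
    return std::vector<char>(first, last);
}
```

Naming the two iterators before constructing the vector also avoids the "most vexing parse" that can appear when both temporaries are written inline.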
Member 15078716 19-Jul-22 16:14pm    
I tried to do that but somehow I did not get it to work. I will try again.

I have tried the following but none of them work. I get errors for each.

BYTE SomeVectorBuffer = ( std::istreambuf_iterator<BYTE>(SomeStreamOfTheFile));

BYTE SomeVectorBuffer = ( std::istreambuf_iterator<byte>(SomeStreamOfTheFile));

byte SomeVectorBuffer = ( std::istreambuf_iterator<BYTE>(SomeStreamOfTheFile));

byte SomeVectorBuffer = ( std::istreambuf_iterator<byte>(SomeStreamOfTheFile));



Thank you.
Richard MacCutchan 20-Jul-22 3:23am    
Use the read method to read the first three bytes to see if they are BOM bytes. If so then read characters from that point on. If there are no BOM markers then rewind the file and just read in text format.
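A rough sketch of that suggestion, assuming the file was opened in binary mode and that only the UTF-8 BOM (EF BB BF) is of interest. The helper name `SkipUtf8Bom` is hypothetical, and the sketch keeps using the same stream after rewinding rather than reopening it in text mode:

```cpp
#include <cstring>
#include <fstream>

// Hypothetical helper: returns true if the stream starts with the UTF-8
// byte order mark EF BB BF. On return the stream is positioned just
// after the BOM if one was found, otherwise at the start of the file.
bool SkipUtf8Bom(std::ifstream& stream)
{
    char head[3] = {0, 0, 0};
    stream.read(head, 3);

    if (stream.gcount() == 3 && std::memcmp(head, "\xEF\xBB\xBF", 3) == 0)
    {
        return true;
    }

    // No BOM, or a file shorter than three bytes: clear any EOF/fail
    // flags left by a short read, then rewind to the beginning.
    stream.clear();
    stream.seekg(0);
    return false;
}
```

Checking `gcount()` matters for very short files, where `read` can return fewer than three bytes and leave the stream in a failed state.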
Member 15078716 20-Jul-22 15:51pm    
Thank you. Please see my answer to Dave Kreskowiak.
Dave Kreskowiak 19-Jul-22 18:30pm    
char does not equal byte. A single character can be multiple bytes in length.

Something like this? My C++ is extremely rusty, so this is probably not the best solution.

You have to allocate a vector of the size you want and pass it into the function, which fills it with up to that many bytes.
C++
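#include <cstddef>
#include <fstream>
#include <vector>

// BYTE normally comes from <windows.h>; this typedef is an assumption
// so that the function also compiles without that header.
typedef unsigned char BYTE;

// Fills the pre-sized buffer from the start of the file.
// Returns the number of bytes read, or -1 if the file could not be opened.
int GetFileContent(const char* filename, std::vector<BYTE> &buffer)
{
	// Open the file (combine the flags with bitwise |, not logical ||)
	std::ifstream file(filename, std::ios::binary | std::ios::in);

	// Make sure the file opened!
	if (!file.is_open())
	{
		return -1;
	}

	// Read the content of the file, one byte at a time
	std::size_t count = 0;
	for (; count < buffer.size(); ++count)
	{
		if (!file.read(reinterpret_cast<char*>(&buffer[count]), sizeof(buffer[count])))
		{
			break; // end of file or read error: stop early
		}
	}

	file.close();

	return static_cast<int>(count);
}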
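A possible variant of the same idea, sketched here with a single `read` call and a vector that is returned instead of filled in place. The name `ReadFirstBytes` and its signature are hypothetical, chosen only for illustration:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns up to maxBytes bytes from the start of
// the file, or an empty vector if the file cannot be opened.
std::vector<unsigned char> ReadFirstBytes(const std::string& fileName, std::size_t maxBytes)
{
    std::ifstream file(fileName, std::ios::binary);
    if (!file.is_open() || maxBytes == 0)
    {
        return std::vector<unsigned char>();
    }

    std::vector<unsigned char> buffer(maxBytes);
    file.read(reinterpret_cast<char*>(buffer.data()),
              static_cast<std::streamsize>(maxBytes));

    // gcount() reports how many bytes the last read actually delivered,
    // which may be fewer than requested for a short file.
    buffer.resize(static_cast<std::size_t>(file.gcount()));
    return buffer;
}
```

For the question as asked, a call such as `ReadFirstBytes("xyz.txt", 3)` would give the first three bytes, each of which can then be inspected separately.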
 
 
Comments
Member 15078716 20-Jul-22 18:37pm    
I am trying one solution at a time.

I have been getting compiler errors with this. I have been trying to work through them for about 4 hours. It is getting late and I might try this again tomorrow. Your code has given me a lot to re-study. I think that I tried your way before and could not get it to work. But, I will be trying again.

Thank you.
Dave Kreskowiak 20-Jul-22 19:08pm    
I said my C++ was really rusty, like I haven't touched it in 15 years kind of rusty.
Member 15078716 20-Jul-22 19:50pm    
I did not mean that as a negative. I still have great confidence that you at least were pointing me in the right direction. I shall study again what you wrote.

Thank you.
Member 15078716 25-Jul-22 22:27pm    
I tried over many hours and did not get this to work.

I want to read the first three bytes of a file. I do not care whether it is a BOM or not. I want to read the first byte and see the bits of that byte, then separately the second byte, then separately the third byte. Your code looks nice, and I have studied it until I am exhausted. I have tried many different ways to adjust it to do what I am asking, and it does not work: I get error after error.

Help ! Please.
Dave Kreskowiak 26-Jul-22 0:31am    
Works fine for me. I have no idea what you're doing because you're not showing your code and not saying what the error messages are.
I have some basic code that I wrote a few years ago that seems to work:
C++
void DoFileRead(
	PCWSTR		pszFilename
)
{
	std::wcout << L"Reading file: " << pszFilename << std::endl;
	std::ifstream* iStream = new std::ifstream(pszFilename, std::ios::in);
	if (iStream->is_open())
	{
		char	szBuffer[1024];

		iStream->read(szBuffer, 3);
		if (memcmp(szBuffer, "\xFF\xFE", 2) == 0)
		{
			iStream->seekg(2);
			// Looks like Unicode
			// ifstream does not correctly interpret Unicode streams
			// nor, for that matter, does wifstream.
			wchar_t* wszBuffer = reinterpret_cast<wchar_t*>(szBuffer);
			std::wifstream* wiStream = reinterpret_cast<std::wifstream*>(iStream);
			while (!wiStream->eof())
			{
				wiStream->getline(wszBuffer, sizeof(szBuffer) / sizeof(wchar_t));
				std::wcout << "  " << wszBuffer << std::endl;
			}
			iStream->close();
		}
		else if (memcmp(szBuffer, "\xEF\xBB\xBF", 3) != 0)
		{
			// Not UTF-8, so probably normal ASCII - rewind the file
			iStream->seekg(0);
		}
		while (!iStream->eof())
		{
			iStream->getline(szBuffer, sizeof(szBuffer));
			std::cout << "  " << szBuffer << std::endl;
		}
		iStream->close();
	}
	std::wcout << std::endl;
}
 
 
Comments
Member 15078716 25-Jul-22 22:18pm    
It does not compile. I get errors, and no matter how I adjusted this, it did not compile.

This line:

	std::ifstream* iStream = new std::ifstream(pszFilename, std::ios::in);


gives me this error:

error: no matching function for call to 'std::basic_ifstream<char>::basic_ifstream(const WCHAR*&, const openmode&)'

I have studied it and tried changing it lots of ways but it does not compile. Just gives me some error. I study that and try what I read then get another error.

I am just trying to read the first three bytes in a file one at a time from that file. I want to see the bits of those bytes, one byte at a time and one bit at a time from those bytes.

Help ! Please.
Richard MacCutchan 26-Jul-22 3:54am    
You are trying to pass a WCHAR* to a function that requires a char*. Please read the documentation: std::basic_ifstream - cppreference.com.
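As far as I know, the constructor that accepts a wide-character file name is a Microsoft extension, which is why GCC rejects it. A minimal sketch that sticks to a narrow `std::string` file name and reads up to three bytes, one `get()` call per byte (the helper name `ReadThreeBytesOneAtATime` is hypothetical):

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to three bytes from the start of the
// file, one byte per get() call. Returns fewer bytes for a short or
// unreadable file.
std::vector<unsigned char> ReadThreeBytesOneAtATime(const std::string& narrowFileName)
{
    std::vector<unsigned char> bytes;

    // The standard constructors take const char* or std::string names.
    std::ifstream stream(narrowFileName, std::ios::in | std::ios::binary);

    char c = 0;
    while (bytes.size() < 3 && stream.get(c))
    {
        // Store as unsigned so values such as 0xEF are not negative.
        bytes.push_back(static_cast<unsigned char>(c));
    }
    return bytes;
}
```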
Member 15078716 27-Jul-22 22:45pm    
I got it to work now.

Here is my result:

I often do testing and create a CLI report of results. Then I copy that and keep a final report.



Start of report.

Reading VerifiedToHave_utf8_ByteOrderMark03.txt

Contents are: [Byte Order Mark BOM] then [hello - こんにちは - abc]

The BOM is an unsigned char BOM01[3]{ 0xef, 0xbb, 0xbf }

So, I should see as the first byte: [0xef]

Using printf ---------- data = ["∩"]
Using cout ---------- data = [∩]

If data = [∩] then that is the correct ''symbol'' for here.

The first byte ''symbol'' of the file is [∩]

Now testing to get the Hex of the first byte in two different ways:
Verified that data= [∩] as a symbol.

test 1 Hex = [ef] as a Hexidecimal.
test 2 Hex = [ef] as a Hexidecimal.

Attempting to convert the first byte to binary:

test 3 Binary = [11110111]
Or nibble [1111] and then nibble [0111]

This is correct.

End of report.

It works now.

Thank you.

You both helped. I had to make some changes, but you pointed me in the right direction, and I saw some syntax errors that I had (I guess) been making over and over again without noticing. You both get Accepted.
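One note on test 3 in the report: written most significant bit first, 0xEF is 11101111 (nibbles 1110 and 1111). The reported 11110111 is the same eight bits listed least significant bit first, which is presumably the order in which the test produced them. A small sketch using std::bitset that shows both orders (the function names are hypothetical):

```cpp
#include <algorithm>
#include <bitset>
#include <string>

// Bits of one byte, most significant bit first (the usual written form).
std::string BitsMsbFirst(unsigned char byte)
{
    return std::bitset<8>(byte).to_string();
}

// The same bits, least significant bit first.
std::string BitsLsbFirst(unsigned char byte)
{
    std::string bits = BitsMsbFirst(byte);
    std::reverse(bits.begin(), bits.end());
    return bits;
}
```

With these, `BitsMsbFirst(0xEF)` gives "11101111" and `BitsLsbFirst(0xEF)` gives "11110111", so either result can be correct as long as the bit order is stated.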

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


