|
One, two, three, four, five
Once I caught a muon live
Six, seven, eight, nine, ten
Then I let it go again
Why did you let it go?
Because it made my fingers glow.
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|
|
good one.
a poet
and we know it..
"A little time, a little trouble, your better day"
Badfinger
|
|
|
|
|
apparently your fingers you can count on to count the muon
|
|
|
|
|
Having nothing better to do at the moment, I'm re-writing some of my file-reading code.
Previously, I would wrap a FileStream in a StreamReader and call it good -- that's good enough for many simple tasks, but it has some issues when used with some more difficult tasks.
I'm still using a FileStream, but now I'm wrapping it in a family of classes which add some features and eliminate some shortcomings.
What irks me -- and has for a while -- is how FileStream.ReadByte() reports END-OF-FILE:
"The byte, cast to an Int32, or -1 if the end of the stream has been reached."
I do realize that this matches the behavior of C's fgetc function.
Basically, it means that you have to test every return value for -1 (EOF) -- YUCK! so inefficient.
But this isn't C; I would much rather have FileStream.ReadByte() throw an Exception when it hits EOF.
Assuming that FileStream.ReadByte() uses the ReadFile Windows API function, I'm sure that the FileStream class could have a method with the ability to raise an Exception when it hits the EOF -- allow the user to choose how EOF is reported.
And, no, testing for EOF and then throwing is not an option, as it doesn't remove the test, which is the whole issue.
So do I now need to look into writing my own version of FileStream and have it throw an Exception?
Note:
I read a blog post about file handling improvements in .NET 6, but it didn't seem to address this.
It should be easy enough for them to add a ThrowOnEOF option.
P.S. File length is reported in bytes, not characters, so comparing file length to characters read (as you get with StreamReader when reading a file which contains multi-byte characters) is not a solution.
And also, in some cases (hopefully rare today), a file may be logically terminated with char 26 (Ctrl-Z) even if the file size in bytes is reported as larger. To support that, a reader must still test every character returned.
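The Ctrl-Z case above can be sketched roughly as follows. This is only an illustration, assuming a plain byte-oriented Stream; the class name CtrlZReader and the method CopyUntilLogicalEof are hypothetical names, not part of .NET. It shows why every value still has to be tested: the loop must stop on either the real EOF (-1) or the logical one (char 26).

```csharp
using System.IO;

// Hypothetical helper, not part of .NET: treats char 26 (Ctrl-Z) as a logical EOF.
public static class CtrlZReader
{
    public const int CtrlZ = 0x1A; // char 26

    // Copies bytes from source to destination until the physical end of the
    // stream (-1) or a Ctrl-Z byte is seen. Returns the number of bytes copied.
    public static long CopyUntilLogicalEof(Stream source, Stream destination)
    {
        long count = 0;
        int b;
        while ((b = source.ReadByte()) != -1 && b != CtrlZ)
        {
            destination.WriteByte((byte)b);
            count++;
        }
        return count;
    }
}
```

Anything after the Ctrl-Z byte is simply ignored by this sketch, even though the file size in bytes would say there is more data.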
Edit: I just read the following on a Wikipedia page about Unicode:
"Control-D has been used to signal "end of file" for text typed in at the terminal on Unix / Linux systems. Windows, DOS, and older minicomputers used Control-Z for this purpose."
"Control-Z has commonly been used on minicomputers, Windows and DOS systems to indicate "end of file" either on a terminal or in a text file. Unix / Linux systems use Control-D to indicate end-of-file at a terminal."
modified 19-Aug-23 0:06am.
|
|
|
|
|
I have written several specialized C++ stream classes. This may be obvious, but I'll mention it anyway: I simply rely on the file size. I have no experience with .NET, as I have only now begun learning it. Having looked at the documentation you quoted ("The byte, cast to an Int32, or -1 if the end of the stream has been reached."), I was STUNNED: how the heck can one distinguish a legitimate 0xff byte value from EOF? 😵
modified 17-Aug-23 19:58pm.
|
|
|
|
|
EOF is 0xFFFFFFFF (-1 as an Int32)
Edit: vs 0x000000FF for a data byte of 0xff
modified 19-Aug-23 15:37pm.
|
|
|
|
|
Thank you for the kind reply. Please pardon my stupidity, but will a legitimate byte value of 0xff not be converted to 0xFFFFFFFF as the return value and thus be confused with EOF? Furthermore, may I ask why file size is not being used in your code for this purpose? I assume there is a logical reason. I would be thankful to learn from my betters. Thank you kindly.
|
|
|
|
|
The legitimate values for bytes from the stream are indeed 0x00 through 0xff. That is why the return value is an int: it allows (int)-1 for EOF to be distinguished from 0xff (255) data.
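A small sketch of that point, using a MemoryStream as a stand-in for a file (the class name ReadByteDemo and the method ReadRawValues are made up for this illustration). It just records the raw int that ReadByte() hands back on each call, including the final EOF value.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical demo class: collects the raw int results of ReadByte().
public static class ReadByteDemo
{
    // Calls ReadByte() until it reports EOF and returns every value it
    // returned, with the terminating -1 as the last element.
    public static List<int> ReadRawValues(Stream stream)
    {
        var values = new List<int>();
        int value;
        do
        {
            value = stream.ReadByte();
            values.Add(value);
        } while (value != -1);
        return values;
    }
}
```

For a stream holding the two bytes 0xFF and 0x00, the list comes back as 255, 0, -1: the data byte 0xFF arrives as 0x000000FF, never as -1.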
Many many moons ago in an embedded system we wrote a serial comms reader that returned 0x00 thru 0xff for received data, and a number of negative ints indicating different error conditions.
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
|
|
|
|
|
Thank you. I finally realized that the type for a byte must be unsigned. In fact, unsigned is what I use as well.
|
|
|
|
|
I don't. And anyway, it would mean testing the byte count or position before or after every read which doesn't resolve the issue.
|
|
|
|
|
|
Interesting.
So it still requires a test after each read?
Can it rewind?
|
|
|
|
|
PIEBALDconsult wrote: So it still requires a test after each read?
It's a boolean flag, so yes.
PIEBALDconsult wrote: Can it rewind?
It's forward only - ref: PipeReader Class (System.IO.Pipelines) | Microsoft Learn. You would need to close the reader, reset the stream position (if the stream supports it), then create a new reader.
Graeme
"I fear not the man who has practiced ten thousand kicks one time, but I fear the man that has practiced one kick ten thousand times!" - Bruce Lee
|
|
|
|
|
Yes, it would be more 'C#-style' if FileStream had two methods: byte FileStream.ReadByte(), which throws an exception, and bool FileStream.TryReadByte(out byteVar), which returns true for success and false for failure. Bear in mind, however, that ReadByte() is a low-level API. I can think of many use cases where one just loops on the ReadByte() method until -1 is returned. Why burden the low-level reader with the overhead of a try/catch block, and possibly rethrowing a new exception wrapping the one thrown from the ReadByte() method?
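For illustration only, the throwing variant could be approximated today as an extension method. The name ReadByteOrThrow is hypothetical and not part of .NET, and note that it still performs the -1 test internally, so it does not remove the check the original poster objects to; it only changes how EOF is reported to the caller.

```csharp
using System.IO;

// Hypothetical extension, not part of .NET: reports EOF by throwing.
public static class ThrowingStreamExtensions
{
    // Returns the next byte, or throws EndOfStreamException at end of stream.
    // The -1 comparison is still made here on every call.
    public static byte ReadByteOrThrow(this Stream stream)
    {
        int result = stream.ReadByte();
        if (result == -1)
        {
            throw new EndOfStreamException("Attempted to read past the end of the stream.");
        }
        return (byte)result;
    }
}
```

A caller would then wrap the whole read loop in a single try/catch for EndOfStreamException rather than testing each return value.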
High-level constructs are undoubtedly useful, but a 'good' language will also provide access to lower-level constructs, to be used with care.
EDIT: typo
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
modified 17-Aug-23 23:51pm.
|
|
|
|
|
PIEBALDconsult wrote: I would much rather have FileStream.ReadByte() throw an Exception when it hits EOF.
Using exceptions for non-exceptional, totally expected situations, just to control program flow, is a bad idea.
If you really need to read a single byte at a time, then a TryReadByte method would be preferable:
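using System.IO;

public static class StreamExtensions
{
    // Reads one byte; returns false at end of stream instead of exposing -1.
    public static bool TryReadByte(this Stream stream, out byte value)
    {
        int result = stream.ReadByte();
        if (result == -1)
        {
            value = default;
            return false;
        }

        value = (byte)result;
        return true;
    }
}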
Otherwise, use Read(byte[], int, int) or Read(Span<byte>) and check how many bytes were read.
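A minimal sketch of the buffered alternative, assuming you can process the data one block at a time. The class BufferedReadExample, the method CountOccurrences, and the 4096 default are illustrative choices only. The EOF check happens once per Read() call, when it returns 0, rather than once per byte.

```csharp
using System.IO;

// Hypothetical example class: reads in blocks and checks the byte count.
public static class BufferedReadExample
{
    // Counts how many times 'target' occurs in the stream.
    // Read() returns the number of bytes actually read; 0 means end of stream.
    public static long CountOccurrences(Stream stream, byte target, int bufferSize = 4096)
    {
        byte[] buffer = new byte[bufferSize];
        long count = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Only the first 'read' bytes of the buffer are valid.
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == target)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

The inner loop still has its own bound check on i, but there is no per-byte comparison against an EOF sentinel.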
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
PIEBALDconsult wrote: I would much rather have FileStream.ReadByte() throw an Exception when it hits EOF.
No, because exception handling is so much more costly than a simple check.
Also keep in mind that ReadByte internally does a check for Length == Position and returns -1 if they're equal. Reference Source
So if you want to halve the number of comparisons you should simply check for Length == Position before calling ReadByte.
<edit>Doh, doesn't halve anything except when you have reached the EOF.</edit>
But the question is rather, why don't you use Read() with a buffer instead? That uses one check for every Read().
|
|
|
|
|
Jörgen Andersson wrote: exception handling is so much more costly than a simple check
Prove it -- and not with a debugger attached. I think a lot of the early complaints (twenty years ago) about .NET Exception performance were because people were testing debug versions with the debugger attached, which seems to make Exceptions slow. But I have not seen any issues with Exception performance in real life.
Jörgen Andersson wrote: use Read() with a buffer instead
Maybe, but then there would be two buffers, because the FileStream holds one (4096 bytes by default). I see no reason to duplicate the buffer when I can process only one byte at a time; there would be no IO performance benefit.
|
|
|
|
|
|
You can't test "the performance of Exceptions" by simply throwing millions of them; that's not how they're used.
Experiment and test I will... eventually. I need to think more about what I don't like about FileStream:
-1 on EOF
No Peek
No Reverse (the Position can be set, but does that affect only the buffer?)
and what I want in a replacement for it.
Can I implement a "better" buffer than Microsoft? How smart or simple is the buffer in FileStream? How does it work?
The ability to Reverse is probably better implemented at a higher level, but it can be done at this level too.
I occasionally need to read sections of files in reverse and it would be good for it to perform better than it does -- as long as forward-reading performance isn't impacted. I recognize that it is not a common requirement.
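As a rough sketch of the missing Peek, assuming a seekable stream: read one byte and step back. The name PeekByte is hypothetical and not part of .NET. Whether stepping back one byte stays inside FileStream's internal buffer or costs more is exactly the open question above, and this sketch does not answer it.

```csharp
using System;
using System.IO;

// Hypothetical extension, not part of .NET: peeks by reading and seeking back.
public static class PeekExtensions
{
    // Returns the next byte as an int without consuming it, or -1 at EOF.
    // Requires a seekable stream.
    public static int PeekByte(this Stream stream)
    {
        if (!stream.CanSeek)
        {
            throw new NotSupportedException("PeekByte requires a seekable stream.");
        }

        int result = stream.ReadByte();
        if (result != -1)
        {
            // Undo the read so the next ReadByte() returns the same byte.
            stream.Seek(-1, SeekOrigin.Current);
        }
        return result;
    }
}
```

The same Seek call with a negative offset is the building block for reading a section in reverse, one byte at a time, though that is unlikely to perform well without a smarter buffer.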
Oddball said (Kelly's Heroes): You see, man, we like to feel we can get out of trouble, quicker than we got into it.
|
|
|
|
|
Long ago, many OSes certainly used an EOF marker at the end of files. It was a specific sequence of bytes. I can't speak to why in general that is no longer the case, but I do know it was possible to maliciously overwrite that marker, and thus the file would no longer have an end.
I would not be surprised however if the answer is related to buffers. Buffers exist all the way down the chain. Right down to the hardware.
And that means one can never 'read' to the end. Although some files would be an exact multiple of the buffer lengths most would not. So some other comparison must always be done.
Nothing stops you from replacing something in the current IO stream hierarchy. You probably can't replace the hardware buffering but I am rather certain that you can replace the IO libraries in the OS. Or inject something into it. I think something like that happens with USB drives.
You can also replace libraries in your language of choice. This is not all that difficult in Java or in C++. I suspect it is more difficult with C# but I haven't looked.
|
|
|
|
|
Yes, on MS-DOS there was char 26 (Ctrl-Z) which was an end-of-file marker in at least some cases.
Even in the past ten years I've had to deal with char 26 appearing in string values and causing trouble.
At most I would use P/Invoke to call the Windows API routines to read the file. And I would implement my own buffering system. Then add layers from there. But that's still just theoretical.
|
|
|
|
|
Heavy Indian accent: "Hello, I am John from EE".
"No you're not, not coming in on that line."
*click*
I assume he thought I was a hacker and disconnected before I got to his servers.
Of course, it helps that I've never been with EE ... and it's a whole load quieter than the air horn!
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
I don't have a land line anymore, only cell phone.
I just hang up on scam calls. If they call back, I block the number. I don't even speak to them.
I get very few telemarketing or scam calls these days.
|
|
|
|
|
I never answer any calls that are from an unknown phone number.
If it is anything real they will leave a voice mail.
If it is real, I call the person back.
It is never real. Real people text me before they ever call me and I already have their phone number in my phone.
|
|
|
|
|
Same here
In a closed society where everybody's guilty, the only crime is getting caught. In a world of thieves, the only final sin is stupidity. - Hunter S Thompson - RIP
|
|
|
|