What is the file's record length, fixed or variable length records?
The concept of records does not apply. The file is a series of 1s and 0s organized into groups of eight bits called bytes. You determine how the bytes relate to each other as a "record".
Each byte in the file can be addressed directly, in any order (random access).
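As a rough sketch of what byte-level addressing looks like in practice, the helper below seeks to an offset and reads a fixed-size chunk that the caller chooses to treat as a "record". The class name, the method name, and the idea of a caller-supplied record length are assumptions made up for illustration; nothing in the file itself defines them.

```csharp
using System;
using System.IO;

// Hypothetical helper: the file has no notion of records, so the caller
// supplies the record length and the code works out the byte offset.
public static class RandomAccessSketch
{
    public static byte[] ReadRecord(string path, int recordLength, long recordIndex)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            long offset = recordIndex * recordLength;
            if (offset + recordLength > stream.Length)
                throw new EndOfStreamException("Record lies beyond the end of the file.");

            // Jump straight to the record; the earlier bytes are never read.
            stream.Seek(offset, SeekOrigin.Begin);

            byte[] buffer = new byte[recordLength];
            int total = 0;
            while (total < recordLength)
            {
                // Read may return fewer bytes than requested, so loop until full.
                int read = stream.Read(buffer, total, recordLength - total);
                if (read == 0)
                    throw new EndOfStreamException("Unexpected end of file.");
                total += read;
            }
            return buffer;
        }
    }
}
```

Calling `RandomAccessSketch.ReadRecord(path, 16, 2)` would return bytes 32 through 47 of the file, assuming you have decided that your records are 16 bytes long.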
To further complicate things, even characters may have different representations. There are single-byte character sets, double-byte character sets and quad-byte character sets. Code pages map characters to numeric values, and there are many standard code pages.
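To make the size difference concrete, here is a small sketch that asks several .NET encodings how many bytes a piece of text needs. The class and method names are hypothetical; `Encoding.GetByteCount` is the real framework call, and `Encoding.Unicode` is .NET's name for UTF-16 little-endian.

```csharp
using System;
using System.Text;

// Hypothetical helper for comparing encoded sizes.
public static class EncodingSizeSketch
{
    // Number of bytes the text occupies in the given encoding,
    // not counting any byte order mark written at the start of a file.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    public static void Show(string text)
    {
        Console.WriteLine("ASCII:   " + ByteCount(text, Encoding.ASCII));
        Console.WriteLine("UTF-8:   " + ByteCount(text, Encoding.UTF8));
        Console.WriteLine("UTF-16:  " + ByteCount(text, Encoding.Unicode));
        Console.WriteLine("UTF-32:  " + ByteCount(text, Encoding.UTF32));
    }
}
```

For the single letter "A" the counts are 1, 1, 2 and 4 bytes respectively. A character outside the ASCII range, such as "\u00E9" (e with an acute accent), takes 2 bytes in UTF-8.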
The way a line of text ends varies by operating system.
Windows uses a carriage return followed by a line feed as a line ending ("\r\n"; carriage return \r is 0x0D in hexadecimal, 13 in decimal, and line feed \n is 0x0A in hexadecimal, 10 in decimal).
Linux uses only a line feed (0x0A in hexadecimal, 10 in decimal) as a line ending.
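If you need to handle files from either system, one possible approach is sketched below: look at the raw bytes to see which ending appears first, and split text on both endings. The class and method names are hypothetical, and `Detect` assumes a single-byte encoding or UTF-8 (in UTF-16 or UTF-32 the 0x0D and 0x0A bytes are separated by zero bytes).

```csharp
using System;

// Hypothetical helper for dealing with mixed line endings.
public static class LineEndingSketch
{
    // Reports the first line ending found in the raw bytes.
    // Assumption: ASCII, a single-byte code page, or UTF-8 data.
    public static string Detect(byte[] data)
    {
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] == 0x0A) // line feed
                return (i > 0 && data[i - 1] == 0x0D) ? "CRLF" : "LF";
        }
        return "none";
    }

    // Splits on "\r\n" or a bare "\n", so both conventions are handled.
    public static string[] SplitLines(string text)
    {
        return text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
    }
}
```

With this sketch, `LineEndingSketch.SplitLines("a\r\nb\nc")` yields the three strings "a", "b" and "c".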
File versus FileInfo
The File class is static and does not need a constructor. I typically use it to determine if a file exists but you need to be cautious when checking for existence.
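<pre>if (!File.Exists(@"C:\Zones.Txt"))
    Console.WriteLine("File not found or not accessible");</pre>
This check returns false, rather than throwing an exception, if you are not authorized to access the file, so a false result does not prove that the file is missing.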
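Because an existence check cannot tell you whether the file can actually be opened, one cautious alternative is to attempt the read and handle the failure. This is only a sketch: the class name, the method name and the out-parameter shape are my own choices for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper: try the operation instead of checking first.
public static class SafeOpenSketch
{
    public static bool TryReadAllText(string path, out string contents, out string error)
    {
        try
        {
            contents = File.ReadAllText(path);
            error = null;
            return true;
        }
        catch (IOException ex)
        {
            // Covers missing files and directories, sharing violations, and so on.
            contents = null;
            error = ex.Message;
            return false;
        }
        catch (UnauthorizedAccessException ex)
        {
            // The file exists but the caller may not read it.
            contents = null;
            error = ex.Message;
            return false;
        }
    }
}
```

The design choice here is that the error message says why the read failed, which a plain true/false existence check cannot do.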
The FileInfo class must be constructed, but subsequent requests (file access times and file attributes) are very fast because the information is cached in the in-memory object (call Refresh() if the file may have changed since). Place the FileInfo constructor in a try/catch to handle exceptions.
<pre>FileInfo fileInfo = null;
try { fileInfo = new FileInfo(@"C:\Zones.Txt"); }
catch (Exception ex)
{ Console.WriteLine(ex.Message); }</pre>
The following program creates files using various character sets. You can see the CR/LF and the number of bytes used by each character. After the demonstration files are created, view the contents of the files using Notepad. Windows will recognize the various encoding techniques and correctly display the lines.
<pre>using System;
using System.Globalization;
using System.IO;
using System.Text;

namespace EncodingDemo
{
    class Program
    {
        private static string[] _lines = new string[]
            {"Line 01", "Line 02", "Line 03"};
        private static string _outputPath = "";

        private static void Main(string[] args)
        {
            Console.WriteLine("EncodingDemo - Text formats");
            Console.WriteLine("Please provide an output path (for example I:/TextDemo)");
            string documentsFolder = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
            Console.WriteLine("Default is: " + documentsFolder);
            string outputPath = Console.ReadLine();
            if (String.IsNullOrEmpty(outputPath))
                _outputPath = documentsFolder;
            else
                _outputPath = outputPath;

            // Write the same three lines once per encoding.
            CreateFile("TextDemo-" + "ASCII", Encoding.ASCII);
            CreateFile("TextDemo-" + "UTF8", Encoding.UTF8);
            CreateFile("TextDemo-" + "Unicode", Encoding.Unicode);
            CreateFile("TextDemo-" + "UTF32", Encoding.UTF32);
            CreateStringText("TextDemo-" + "Strings");
            Console.WriteLine("");
            Console.WriteLine("Press any key to terminate this program");
            Console.ReadKey();
        }
        private static void CreateFile(string outputFileName, Encoding encoding)
        {
            string fileName = Path.Combine(_outputPath, outputFileName + ".txt");
            Console.WriteLine("");
            Console.WriteLine("Creating " + fileName);
            try
            {
                // FileMode.Create truncates an existing file; OpenOrCreate would
                // leave stale bytes behind if the old file was longer.
                using (FileStream fileStream = new FileStream(fileName, FileMode.Create))
                using (StreamWriter streamWriter = new StreamWriter(fileStream, encoding))
                {
                    foreach (string line in _lines)
                        streamWriter.WriteLine(line);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error on file: " + fileName);
                Console.WriteLine(" " + ex.Message);
                return;
            }
            FileInfo fileInfo = new FileInfo(fileName);
            if (fileInfo.Exists)
            {
                Console.WriteLine(" File length: " + fileInfo.Length.ToString("n0", CultureInfo.CurrentCulture));
                ShowHex(fileName);
            }
            else
                Console.WriteLine("File not found");
        }
        private static void CreateStringText(string outputFileName)
        {
            string fileName = Path.Combine(_outputPath, outputFileName + ".txt");
            Console.WriteLine("");
            Console.WriteLine("Creating " + fileName);
            using (StreamWriter streamWriter = new StreamWriter(fileName))
            {
                // Write adds no line ending; WriteLine appends one.
                streamWriter.Write(_lines[0]);
                streamWriter.WriteLine(_lines[1]);
                streamWriter.Write(_lines[2]);
            }
            FileInfo fileInfo = new FileInfo(fileName);
            if (fileInfo.Exists)
            {
                Console.WriteLine(" File length: " + fileInfo.Length.ToString("n0", CultureInfo.CurrentCulture));
                ShowHex(fileName);
            }
            else
                Console.WriteLine("File not found");
        }
        private static void ShowHex(string fileName)
        {
            Console.WriteLine("Offset Hex values");
            using (FileStream fileStream = File.OpenRead(fileName))
            using (BinaryReader fileReader = new BinaryReader(fileStream))
            {
                StringBuilder sb = new StringBuilder();
                // Each output row starts with the offset of its first byte.
                sb.Append(ToHex(fileStream.Position));
                sb.Append(" ");
                int
                    counter = 0,    // bytes in the current group of four
                    groupSets = 0;  // groups on the current row
                while (fileStream.Position < fileStream.Length)
                {
                    byte singleByte = fileReader.ReadByte();
                    sb.Append(ToHex(singleByte));
                    if (++counter >= 4)
                    {
                        sb.Append(" ");
                        if (++groupSets >= 8)
                        {
                            // Row is full (8 groups of 4 bytes): print it and start the next.
                            Console.WriteLine(sb.ToString());
                            sb.Clear();
                            sb.Append(ToHex(fileStream.Position));
                            sb.Append(" ");
                            groupSets = 0;
                        }
                        counter = 0;
                    }
                }
                Console.WriteLine(sb.ToString());
            }
        }
        private static string ToHex(byte byteValue)
        {
            // Two hex digits, zero-padded.
            return byteValue.ToString("X2");
        }

        private static string ToHex(UInt32 integer32)
        {
            // Eight hex digits, zero-padded.
            return integer32.ToString("X8");
        }

        private static string ToHex(Int64 longInt)
        {
            UInt32 upper32 = (UInt32)(longInt >> 32);
            UInt32 lower32 = (UInt32)(longInt & 0x00000000ffffffff);
            return ToHex(upper32) + " " + ToHex(lower32);
        }
    }
}</pre>
I hope this helps to clarify the structure of a file.