|
You do not need to use CachedCsvReader; use CsvReader instead.
|
|
|
|
|
Nice work, thanks.
Would you like to release a fast writer?
We can work it out! (Beatles)
|
|
|
|
|
|
Hi Sebastien,
We are looking at using your Csv reader in our application. One piece of functionality we would like is the ability to skip the first X rows. The problem at the moment is that your implementation reads the structure from the first non-blank row. So if we had a file with a couple of columns of supplementary data at the top, and then the regular content with more columns starting 10 or so rows further down, it would assume there are only two columns of data. So it would be really useful to be able to skip the first rows.
I haven't yet spent any time looking at your source code, but could you advise whether this is something that would be quite trivial to add, or a larger piece of work than I anticipate? Intuitively it doesn't seem tricky to me if it is based on a StreamReader. But perhaps I'm wrong? It would be really nice if the constructor gave us the ability to skip the first X rows.
thanks
Tom
|
|
|
|
|
If the records are not multiline, you can call ReadLine() on the StreamReader instance prior to using it with the CsvReader.
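For example, a minimal sketch of that approach (the file name and the row count of 10 are placeholders, and it assumes the skipped rows contain no embedded line breaks inside quoted fields):

```csharp
using System.IO;
using LumenWorks.Framework.IO.Csv;

using (StreamReader reader = new StreamReader("data.csv"))
{
    // Throw away the first 10 supplementary rows before the real header.
    for (int i = 0; i < 10; i++)
        reader.ReadLine();

    // CsvReader now sees the row with the full column structure first.
    using (CsvReader csv = new CsvReader(reader, true))
    {
        while (csv.ReadNextRecord())
        {
            // Process each record here...
        }
    }
}
```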
|
|
|
|
|
Thanks Sebastien. That is quite straightforward.
Also just noticed that someone else on this forum has posted a possible solution within the Csv file parser code:
Skipping rows[^]
But I think your solution is simpler and saves me having to modify your code.
|
|
|
|
|
I usually prefer using what's already there when possible and not too cumbersome.
|
|
|
|
|
I use the Fast CSV Reader for a little project. Now I have a problem with some .csv files. It happens when the same field name is used in the header more than once. When I use this piece of code, a System.ArgumentException occurs ("An item with the same key has already been added"):
using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), true))
{
    int fieldCount = csv.FieldCount;
    string[] headers = csv.GetFieldHeaders();
}
I am a beginner in programming, and I don't understand how the csv object gets all the information (like the field count, all the records, etc.) when you only call the CsvReader constructor.
|
|
|
|
|
Just as a database will not allow multiple columns with the same name in a table, the CsvReader throws an exception for multiple columns with the same name in a file. If you cannot avoid having duplicates, simply turn off the header functionality of the CsvReader. You can do that via the hasHeaders constructor argument:
using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), false))
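With headers turned off, fields are accessed by zero-based index instead of by name; a short sketch (the file name is a placeholder):

```csharp
using System.IO;
using LumenWorks.Framework.IO.Csv;

using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), false))
{
    while (csv.ReadNextRecord())
    {
        // No header lookup, so duplicate column names can no longer collide.
        string firstField = csv[0];
        string secondField = csv[1];
    }
}
```

Note that with headers disabled, the first line of the file is returned as a regular record.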
|
|
|
|
|
Hello,
When I try to instantiate the CsvReader like this:
using (CsvReader csv = new CsvReader(new StreamReader(TBFileName.Text), true, ',', '"', '\\'))
I get the following error:
'LumenWorks.Framework.IO.Csv.CsvReader' does not contain a constructor that takes '5' arguments
Do you have any insight as to what I might be doing wrong?
Thanks,
Mitch
|
|
|
|
|
Yes, actually I do. There is, believe it or not, not a single constructor that takes 5 arguments in the CsvReader class.
|
|
|
|
|
I do not know if it is just my problem; however, I have been struggling.
When I have a parsing error, for example:
xx, "xx"", x, 1, 1
dd, d, 1, 1,1
it does successfully throw an error and advance to the next line. However, the parsed result looks like the following:
xx, xx", dd, d, 1
It somehow concatenates the two lines. How can I simply discard the current record and start with the next one?
Thanks so much
|
|
|
|
|
You need to specify the default parse error action as follows:
using (var csv = new CsvReader(...))
{
    csv.DefaultParseErrorAction = ParseErrorAction.AdvanceToNextLine;
    while (csv.ReadNextRecord())
    {
    }
}
|
|
|
|
|
Thanks for the quick reply.
I did exactly the same thing:
csv.DefaultParseErrorAction = ParseErrorAction.RaiseEvent;
csv.ParseError += csv_ParseError;
while (csv.ReadNextRecord())
{
    string[] drs = new String[150];
    if (!csv.ParseErrorFlag && csv.FieldCount == 150)
    {
        csv.CopyCurrentRecordTo(drs);
        if (drs.Length == 150)
        {
            for (int i = 0; i < drs.Length; i++)
            {
                HttpContext.Current.Response.Write(drs[i] + " ");
            }
        }
    }
}
|
|
|
|
|
Just raising this question again. In the latest version:
"Is it possible to configure the reader so consecutive delimiters are read as one delimiter?"
For example, the data set looks like:
male 23 student
male 36 teacher
The delimiter is a space, but there is one additional space between columns #2 and #3.
|
|
|
|
|
My question is: how can the reader distinguish between real empty fields and mere unwanted delimiters?
I'm sorry to tell you, but the only option I could realistically offer is a quirk mode for Excel. Seriously, how hard is it for the Office team to follow a 5-line standard which is probably older than I am?
Yet I am willing to do this for my new project at http://nlight.codeplex.com[^], but not for the library offered here: it's a lost cause. I wrote that code assuming well-formed data (how naive of me!) and it's a real PITA when I try to work around Excel/SQL Server idiocies. For that reason, the NLight project will become the 4.0 version of this article rather sooner than later.
|
|
|
|
|
Not sure whether this is a perfect solution, but at least it works in my case.
In CsvReader.cs, inside
private string ReadField(int field, bool initializing, bool discardValue)
just before
_nextFieldIndex = Math.Max(index + 1, _nextFieldIndex);
add:
if (_skipEmptyField && string.IsNullOrEmpty(value))
    continue;
|
|
|
|
|
Have you run the unit tests?
|
|
|
|
|
|
Ah, I see, you added your own flag. Well, that's basically part of the quirk mode I was talking about, but there is much more to it. Your request is one of many ... It happens that this particular one is simple, but on the other hand, as far as I remember, you are only the second person to ask for it. The most requested ones are non-escaped quote characters in fields and extra commas at the end of records.
So I'm glad you went ahead and added code to handle your scenario. It will surely become part of the larger Excel "fix" ...
|
|
|
|
|
I have blank fields in my header row, courtesy of converting from Excel to csv.
I get something like this: field1, field2, field3,,,
I only want to include the 3 fields. However, if any of the records contain a null or empty value, I want to include them, up to the 3 fields only.
i.e. 3,,3,,, would just include 3,,3
Does the MissingFieldAction apply to headers and records, or just headers?
|
|
|
|
|
I am not sure exactly what your csv looks like. Can you post a short sample which includes the headers?
|
|
|
|
|
Here is my csv file. It gets converted from Excel. According to your reader there are 25 fields. However, I only want the headings that have a value (there should be 19); DocumentDate is the last field in the Excel file. The conversion from Excel to csv seems to add empty fields.
Also, I don't want the malformed or empty rows at the end.
This should be the last record:
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6373,Susaki,1,MARKETING FEE,64.3,8.04,USD,0.6713,10160-1,6/3/2010,6/3/2010
===============
csv file below
===============
SailDate,SourceCompanyName,Voyage,Vessel,Port,CountryDestination,SupplierName,SalesPoint,SalesContrNumber,DiscPort,LotNumber,CostType,NetCost,GST,Currency,ExchangeRate,SupplierInvoiceNumber,DueDate,DocumentDate,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6370,Susaki,4,MARKETING FEE,245.21,30.65,USD,0.6713,10160-4,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6370,Susaki,4,MARKETING FEE,519.86,64.98,USD,0.6713,10160-4,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6371,Susaki,3,MARKETING FEE,395.63,49.45,USD,0.6713,10160-3,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6371,Susaki,3,MARKETING FEE,291.74,36.47,USD,0.6713,10160-3,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6371,Susaki,3,MARKETING FEE,144.21,18.03,USD,0.6713,10160-3,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6372,Susaki,2,MARKETING FEE,684.26,85.53,USD,0.6713,10160-2,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6372,Susaki,2,MARKETING FEE,815.42,101.93,USD,0.6713,10160-2,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6372,Susaki,2,MARKETING FEE,21.12,2.64,USD,0.6713,10160-2,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6373,Susaki,1,MARKETING FEE,855.85,106.98,USD,0.6713,10160-1,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6373,Susaki,1,MARKETING FEE,300.48,37.56,USD,0.6713,10160-1,6/3/2010,6/3/2010,,,,,,
6/3/2010,TAUM,10160,SUSAKI WING V95,Tauranga,JAPAN,TPT Forests Limited - USD,FOB,6373,Susaki,1,MARKETING FEE,64.3,8.04,USD,0.6713,10160-1,6/3/2010,6/3/2010,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
|
|
|
|
|
I think you should rather concentrate your efforts on making Excel output the file without the extra empty fields/records. I loaded your file in Excel, saved it as .xlsx, and saved it back to csv, and the extra commas were gone.
|
|
|
|
|
Well, the problem is that the initial export is from Excel to csv, and it is done automatically; even when I do it manually I still get the extra commas.
Yeah, if I open it in Excel again and then save it as csv again it works; however, converting a document twice seems redundant.
Is there a place in your code where I can add logic to ignore/stop counting/adding values to the field collection when it encounters an empty/null field?
Any suggestions would be helpful. Long live Excel...
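One possible workaround outside the library (a sketch only, with placeholder file names) is to rewrite the file once, trimming the trailing commas and dropping the all-empty records, before handing the cleaned copy to the CsvReader. This assumes no quoted field legitimately ends a line with trailing commas:

```csharp
using System.IO;

using (StreamWriter writer = new StreamWriter("clean.csv"))
{
    foreach (string line in File.ReadLines("export.csv"))
    {
        // Remove the empty fields Excel appends at the end of each row.
        string trimmed = line.TrimEnd(',');

        // Skip the records that were nothing but delimiters.
        if (trimmed.Length > 0)
            writer.WriteLine(trimmed);
    }
}
```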
|
|
|
|
|