I have a CSV file with 75 columns and 400 rows. I am only interested in the first 12 columns. How can I read the first 12 columns of each line and append them to a new string, so that I wind up with a new CSV file containing only the first 12 columns of the original? My idea, shown in the code below, was to count commas and move on to the next line once I reach 12, but the code is flawed and I don't know how to proceed. Maybe this approach will work, but I am not sure.

What I have tried:

Using readFile As New StreamReader(phrfListFile)
    Do While readFile.Peek() <> -1
        Str = readFile.ReadLine()
        For Each c As Char In Str
            If c = "," Then
                cnt += 1
            End If
        Next
        If cnt = 12 Then
            ShortStr = ShortStr + Str
        End If
        cnt = 0
    Loop
End Using
File.WriteAllText("C:\Race\ShortStr", ShortStr)
Updated 31-Dec-18 16:22pm

Your "parser" is very naive. If there's a field with a comma in it, like "lastname, firstname", your parser will incorrectly parse the line.
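
To make the failure concrete, here is a minimal sketch. The sample line, the module name NaiveSplitDemo, and the helper CountNaiveFields are all made up for illustration; none of them come from the original file. The line has three fields, but the quoted one contains a comma, so a plain comma split produces four pieces.

```vbnet
Imports System

Module NaiveSplitDemo
    ' Hypothetical helper: counts the pieces produced by a plain comma split.
    Function CountNaiveFields(line As String) As Integer
        Return line.Split(","c).Length
    End Function

    Sub Main()
        ' Hypothetical sample line: three fields, the second quoted and containing a comma.
        Dim line As String = "42,""Smith, John"",Toronto"
        Console.WriteLine(CountNaiveFields(line)) ' prints 4, although the line has 3 fields
    End Sub
End Module
```

Any code that counts commas or indexes into the split result will be off by one from that field onward.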

Use a CSV library to parse the file, and then you can write out the fields you need into a new CSV file.

There are a number of CSV libraries on CodeProject, such as:
Simple and Fast CSV Library in C#
Why to build your own CSV parser (or maybe not)
 
Comments
Member 10628309 31-Dec-18 12:41pm    
Your comment regarding what I would call extraneous commas does not apply. The CSV file is created on a computer with a deactivated comma key; only the program can insert commas at the end of each field. However, your point is well taken. The articles you referenced only address corrupt or missing fields. At least that is the way I read them. I see nothing about removing fields.
Dave Kreskowiak 31-Dec-18 12:51pm    
Sigh...

First, it doesn't matter if the comma key is disabled or not. If the data is available to do something LIKE export "lastname, firstname", the computer can create a CSV file that your code cannot parse.

Second, NOTHING is going to reference removing fields. You MUST parse each line of the CSV to get to the end of the current line and know where the beginning of the next line is.

Every CSV parser is going to give you the entire list of fields for every line. Some will do it in an array of returned strings, others will fill in the properties of a class that represents all of the data in each line. It's then going to be up to YOU to determine what data, from all that's available for each line, to write to your own CSV file.

Your parser is making the mistake of reading a text file character-by-character, instead of line-by-line. It's FAR easier to parse the line than it will be to parse characters.
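
Once a parser has handed back the fields of a line as a string array, choosing which ones to keep is a small step. A minimal sketch, assuming the fields arrive as a String array and the wanted positions are zero-based and non-negative; the module name FieldSelection and the helper PickColumns are hypothetical, not part of any library mentioned here.

```vbnet
Imports System
Imports System.Linq

Module FieldSelection
    ' Hypothetical helper: returns the fields at the requested zero-based positions.
    ' A position past the end of a short row yields an empty string instead of an exception.
    Function PickColumns(fields As String(), positions As Integer()) As String()
        Return positions.Select(Function(i) If(i < fields.Length, fields(i), "")).ToArray()
    End Function
End Module
```

For example, PickColumns(fields, {0, 1, 2, 3, 9, 10, 11, 12}) would return the same eight columns that are concatenated by hand further down this thread. The result still needs to be joined with commas, and any field that itself contains a comma or a quote has to be re-quoted before it is written out.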
Member 10628309 19-Jan-19 14:09pm    
You were totally right. I had a misconception of the purpose of a parser. I was thinking along the lines of how the dictionary defines parser. The CSV file that I needed to process was totally rewritten and tripled in size. It still has no inadvertent commas, but there is other stuff that is not handled properly, as you pointed out. I am now using the TextFieldParser, which requires Imports Microsoft.VisualBasic.FileIO. It handles all of the messy stuff that the previous parser would not. It also allows me to write at least 2 fewer lines of code. If this message sounds like an apology, it is.
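
For readers landing here, this is roughly what a TextFieldParser version of the original task could look like. It is a sketch under assumptions, not the code the poster ended up with: the module name CsvTrimmer, the helper CopyFirstColumns, and its parameters are hypothetical, and the re-quoting rule only covers embedded commas and double quotes.

```vbnet
Imports System
Imports System.IO
Imports System.Linq
Imports Microsoft.VisualBasic.FileIO

Module CsvTrimmer
    ' Hypothetical helper: copies the first `count` columns of every row to a new file.
    Sub CopyFirstColumns(sourcePath As String, targetPath As String, count As Integer)
        Using parser As New TextFieldParser(sourcePath),
              writer As New StreamWriter(targetPath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True

            Do While Not parser.EndOfData
                ' ReadFields returns the fields of the current line with enclosing quotes removed.
                Dim fields As String() = parser.ReadFields()
                ' Keep the first `count` fields, re-quoting any that contain a comma or a quote.
                Dim kept = fields.Take(count).Select(
                    Function(f) If(f.Contains(",") OrElse f.Contains(""""),
                                   """" & f.Replace("""", """""") & """", f))
                writer.WriteLine(String.Join(",", kept))
            Loop
        End Using
    End Sub
End Module
```

A call such as CsvTrimmer.CopyFirstColumns(phrfListFile, "C:\Race\NewphrfListFile.CSV", 12) would then produce the 12-column file asked about in the question. Rows with fewer than 12 fields are written with however many fields they have.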
Member 10628309 31-Dec-18 22:23pm    
I am sure you will sigh more after reading this. I should have informed you that the source CSV file was first created more than 15 years ago, has been updated hundreds of times, and has never had an inadvertent comma even once. I don't have to accept anyone's CSV file. However, as I said before, my solution blows up if there is an inadvertent comma, as you warned me. I thank you for your advice. It is good advice, and when I become more experienced I will pursue the parsers you suggested. I did read all of the material you referenced. I will try at least one of the parsers, maybe more. The solution I show below works perfectly but, as you point out, is vulnerable.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        ' Set the filter before showing the dialog so that it takes effect.
        OpenFileDialog1.Filter = "CSV File|*.CSV"
        If OpenFileDialog1.ShowDialog() <> Windows.Forms.DialogResult.OK Then
            Return ' No file was chosen, so there is nothing to process.
        End If
        phrfListFile = OpenFileDialog1.FileName

        ' Using closes both files even if an exception is thrown part way through.
        Using SReader As New StreamReader(phrfListFile),
              SWriter As New StreamWriter("C:\Race\NewphrfListFile.CSV")
            Do While SReader.Peek <> -1
                ' Naive split: this breaks if any field contains a comma.
                Dim strData() As String = Split(SReader.ReadLine, ",")
                ' Write columns 0-3 and 9-12 of the current line.
                SWriter.WriteLine(strData(0) & "," & strData(1) & "," & strData(2) & "," & strData(3) _
                 & "," & strData(9) & "," & strData(10) & "," & strData(11) & "," & strData(12))
            Loop
        End Using
    End Sub
End Class
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


