C++: Minimalistic CSV Streams

10 Mar 2023 | MIT | 4 min read
Read and write CSV in a few lines of code!

Introduction

MiniCSV is a small, single-header library based on C++ file streams that is comparatively easy to use. Without further ado, let us see some code in action.

Writing

The following example writes tab-separated values to a file using the csv::ofstream class. Since version 1.7, you can also specify the escape string when calling set_delimiter.

C++
#include "minicsv.h"

struct Product
{
    Product() : name(""), qty(0), price(0.0f) {}
    Product(std::string name_, int qty_, float price_) 
        : name(name_), qty(qty_), price(price_) {}
    std::string name;
    int qty;
    float price;
};

int main()
{
    csv::ofstream os("products.txt");
    os.set_delimiter('\t', "##");
    if(os.is_open())
    {
        Product product("Shampoo", 200, 15.0f);
        os << product.name << product.qty << product.price << NEWLINE;
        Product product2("Soap", 300, 6.0f);
        os << product2.name << product2.qty << product2.price << NEWLINE;
    }
    os.flush();
    return 0;
}

NEWLINE is defined as '\n'. We cannot use std::endl here because csv::ofstream is not derived from std::ofstream.
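To illustrate why std::endl is a problem for a class that wraps a stream instead of deriving from one, here is a minimal sketch. The names tiny_writer and kNewline are hypothetical and exist only for this illustration; this is not MiniCSV's implementation, just one plausible shape of a writer with a templated operator<<.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a CSV writer that owns a stream
// but does not inherit from std::ostream.
class tiny_writer
{
public:
    explicit tiny_writer(char delimiter)
        : delimiter_(delimiter), at_line_start_(true) {}

    // Generic column output: put the delimiter between columns.
    template <typename T>
    tiny_writer& operator<<(const T& value)
    {
        if (!at_line_start_) buffer_ << delimiter_;
        buffer_ << value;
        at_line_start_ = false;
        return *this;
    }

    // Plain char overload: '\n' ends the row instead of being a column.
    tiny_writer& operator<<(char ch)
    {
        if (ch == '\n')
        {
            buffer_ << '\n';
            at_line_start_ = true;
            return *this;
        }
        if (!at_line_start_) buffer_ << delimiter_;
        buffer_ << ch;
        at_line_start_ = false;
        return *this;
    }

    std::string text() const { return buffer_.str(); }

private:
    std::ostringstream buffer_;
    char delimiter_;
    bool at_line_start_;
};

const char kNewline = '\n'; // plays the role of NEWLINE in this sketch

void demo_tiny_writer()
{
    tiny_writer w('\t');
    w << "Shampoo" << 200 << 15.0f << kNewline;
    assert(w.text() == "Shampoo\t200\t15\n");
    // w << std::endl;  // would not compile: std::endl is a function template,
    //                  // so T cannot be deduced, and it does not convert to char.
}
```

std::endl only works with classes derived from std::basic_ostream because the standard stream classes provide a dedicated operator<< overload that accepts manipulator functions. A wrapper class would have to add such an overload itself, so a plain character constant is the simpler choice.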

Reading

To read the same file back, use csv::ifstream; std::cout displays the items read on the console. Product is the struct defined in the Writing example.

C++
#include "minicsv.h"
#include <iostream>

int main()
{
    csv::ifstream is("products.txt");
    is.set_delimiter('\t', "##");
    if(is.is_open())
    {
        Product temp;
        while(is.read_line())
        {
            is >> temp.name >> temp.qty >> temp.price;
            // display the read items
            std::cout << temp.name << "," << temp.qty << "," << temp.price << std::endl;
        }
    }
    return 0;
}

The output in console is as follows:

C++
Shampoo,200,15
Soap,300,6
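For comparison, here is a rough sketch of what reading one such tab-separated line looks like with only the standard library. The function parse_product_line is a hypothetical helper written for this illustration and is not part of MiniCSV; it does no escaping or quote handling.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct Product
{
    std::string name;
    int qty = 0;
    float price = 0.0f;
};

// Splits one line on `delimiter` and converts the three columns.
// std::stoi / std::stof throw std::invalid_argument or std::out_of_range
// when a numeric column cannot be converted.
Product parse_product_line(const std::string& line, char delimiter)
{
    std::istringstream fields(line);
    std::string qty_text;
    std::string price_text;
    Product p;
    std::getline(fields, p.name, delimiter);
    std::getline(fields, qty_text, delimiter);
    std::getline(fields, price_text, delimiter);
    p.qty = std::stoi(qty_text);
    p.price = std::stof(price_text);
    return p;
}

void demo_parse_product_line()
{
    const Product p = parse_product_line("Shampoo\t200\t15", '\t');
    assert(p.name == "Shampoo");
    assert(p.qty == 200);
    assert(p.price == 15.0f);
}
```

Every column needs its own getline and conversion call here, which is the repetition that the read_line() and operator>> pair in the example above hides.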

Overloaded Stream Operators

String streams were introduced in v1.6. Here is an example of how to overload the string stream operators for the Product class. The concept is the same for file streams.

C++
#include "minicsv.h"
#include <iostream>

struct Product
{
    Product() : name(""), qty(0), price(0.0f) {}
    Product(std::string name_, int qty_, float price_) : name(name_), 
                               qty(qty_), price(price_) {}
    std::string name;
    int qty;
    float price;
};

template<>
inline csv::istringstream& operator >> (csv::istringstream& istm, Product& val)
{
    return istm >> val.name >> val.qty >> val.price;
}

template<>
inline csv::ostringstream& operator << (csv::ostringstream& ostm, const Product& val)
{
    return ostm << val.name << val.qty << val.price;
}

int main()
{
    // test string streams using overloaded stream operators for Product
    {
        csv::ostringstream os;
        os.set_delimiter(',', "$$");
        Product product("Shampoo", 200, 15.0f);
        os << product << NEWLINE;
        Product product2("Towel, Soap, Shower Foam", 300, 6.0f);
        os << product2 << NEWLINE;

        csv::istringstream is(os.get_text().c_str());
        is.set_delimiter(',', "$$");
        Product prod;
        while (is.read_line())
        {
            is >> prod;
            // display the read items
            std::cout << prod.name << "|" << prod.qty << "|" << prod.price << std::endl;
        }
    }
    return 0;
}

This is what is displayed on the console.

C++
Shampoo|200|15
Towel, Soap, Shower Foam|300|6
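Notice that the commas inside "Towel, Soap, Shower Foam" survive the round trip even though the comma is the delimiter. The sketch below shows one way an escape string such as "$$" can make that possible: replace the delimiter inside a field before writing and reverse the replacement after reading. This is an assumption about the general technique, not a copy of MiniCSV's code, and the helper names replace_all, escape_field and unescape_field are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replaces every occurrence of `from` in `text` with `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to)
{
    if (from.empty()) return text;
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.size(), to);
        pos += to.size(); // continue after the inserted text
    }
    return text;
}

// Before writing: hide the delimiter inside a field behind the escape string.
std::string escape_field(const std::string& field, char delimiter, const std::string& escape)
{
    return replace_all(field, std::string(1, delimiter), escape);
}

// After reading: turn the escape string back into the delimiter.
std::string unescape_field(const std::string& field, char delimiter, const std::string& escape)
{
    return replace_all(field, escape, std::string(1, delimiter));
}

void demo_escape_round_trip()
{
    const std::string name = "Towel, Soap, Shower Foam";
    const std::string stored = escape_field(name, ',', "$$");
    assert(stored == "Towel$$ Soap$$ Shower Foam");
    assert(unescape_field(stored, ',', "$$") == name);
}
```

With this scheme, a field that already contains the escape string would come back altered, so the escape string should be something that never appears in the data.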

What if the type has private members? Create a public member function that takes the stream object, and call it from the overloaded operator.

C++
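class Product
{
public:
    // Reads the private fields from the stream in column order.
    void read(csv::istringstream& istm)
    {
        istm >> this->name >> this->qty >> this->price;
    }
private:
    std::string name;
    int qty;
    float price;
};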

template<>
inline csv::istringstream& operator >> (csv::istringstream& istm, Product& prod)
{
    prod.read(istm);
    return istm;
}

Conclusion

MiniCSV is a small CSV library based on C++ file streams. Because the delimiter can be changed on the fly, I have used this library to write file parsers for the MTL and Wavefront OBJ formats in a relatively short time compared to writing them by hand with no library help. MiniCSV is now hosted on GitHub. Thank you for reading!

History

  • 2014-03-09: Initial release
  • 2014-08-20: Remove the use of smart ptr
  • 2015-03-23: 75% performance increase on writing by removing the flush on every line; fixed the LNK2005 multiple-redefinition link error; read_line replaces eof on ifstream.
  • 2015-09-22: v1.7: Escape/unescape and surround/trim quotes on text
  • 2015-09-24: Added overloaded stringstream operators example.
  • 2015-09-27: Stream operator overload for const char* in v1.7.2.
  • 2015-10-04: Fixed G++ and Clang++ compilation errors in v1.7.3.
  • 2015-10-20: v1.7.6: Delimiters within quotes are ignored during reading when enable_trim_quote_on_str is enabled. Example: 10.0,"Bottle,Cup,Teaspoon",123.0 is read as 3 tokens: <10.0><Bottle,Cup,Teaspoon><123.0>
  • 2016-05-05: Quotes inside a quoted string are now escaped. The default escape string is "&quot;", which can be changed through os.enable_surround_quote_on_str() and is.enable_trim_quote_on_str().
  • 2016-07-10: Version 1.7.9: Reading UTF-8 BOM
  • 2016-08-02: Version 1.7.10: Added a separator class for the stream, so there is no need to call set_delimiter repeatedly when the delimiter keeps changing. See the code example below:
    C++
    // demo sep class usage
    csv::istringstream is("vt:33,44,66");
    is.set_delimiter(',', "$$");
    csv::sep colon(':', "<colon>");
    csv::sep comma(',', "<comma>");
    while (is.read_line())
    {
        std::string type;
        int r = 0, b = 0, g = 0;
        is >> colon >> type >> comma >> r >> b >> g;
        // display the read items
        std::cout << type << "|" << r << "|" << b << "|" << g << std::endl;
    }
  • 2016-08-23: Version 1.7.11: Fixed num_of_delimiter function: do not count delimiter within quotes
  • 2016-08-26: Version 1.8.0: Added better error messages for data conversion during reading. Before this version, data conversion errors with std::istringstream went undetected.

    Before change:
    C++
    template<typename T>
    csv::ifstream& operator >> (csv::ifstream& istm, T& val)
    {
        std::string str = istm.get_delimited_str();
        
    #ifdef USE_BOOST_LEXICAL_CAST
        val = boost::lexical_cast<T>(str);
    #else
        std::istringstream is(str);
        is >> val;
    #endif
    
        return istm;
    }

    After change:

    C++
    template<typename T>
    csv::ifstream& operator >> (csv::ifstream& istm, T& val)
    {
        std::string str = istm.get_delimited_str();
    
    #ifdef USE_BOOST_LEXICAL_CAST
        try 
        {
            val = boost::lexical_cast<T>(str);
        }
        catch (boost::bad_lexical_cast& e)
        {
            throw std::runtime_error(istm.error_line(str).c_str());
        }
    #else
        std::istringstream is(str);
        is >> val;
        if (!(bool)is)
        {
            throw std::runtime_error(istm.error_line(str).c_str());
        }
    #endif
    
        return istm;
    }

    Breaking change: old user code that catches boost::bad_lexical_cast must be changed to catch std::runtime_error. The same applies to csv::istringstream. Beware that std::istringstream is not as good as boost::lexical_cast at catching errors. For example, "4a" gets converted to the integer 4 without error.

    An example of the csv::ifstream error log follows:

    C++
    csv::ifstream conversion error at line no.:2, 
    filename:products.txt, token position:3, token:aa

    The csv::istringstream error is similar, except there is no filename.

    C++
    csv::istringstream conversion error at line no.:2, token position:3, token:aa
  • 2017-01-08: Version 1.8.2 with better input stream performance. Run the benchmark to see (note: you need to update the drive/folder location first).

    Benchmark results against version 1.8.0:

    C++
         mini_180::csv::ofstream:  348ms
         mini_180::csv::ifstream:  339ms <<< v1.8.0
             mini::csv::ofstream:  347ms
             mini::csv::ifstream:  308ms <<< v1.8.2
    mini_180::csv::ostringstream:  324ms
    mini_180::csv::istringstream:  332ms <<< v1.8.0
        mini::csv::ostringstream:  325ms
        mini::csv::istringstream:  301ms <<< v1.8.2
    
  • 2017-01-23: Version 1.8.3 added unit tests and allows 2 quotes to escape 1 quote, in line with the CSV specification.
  • 2017-02-07: Version 1.8.3b added more unit tests and removed the CPOL license file.
  • 2017-03-12: Version 1.8.4 fixed some char output problems and added the NChar (char wrapper) class to write numeric values [-128..127] to char variables.
    C++
    bool test_nchar(bool enable_quote)
    {
        csv::ostringstream os;
        os.set_delimiter(',', "$$");
        os.enable_surround_quote_on_str(enable_quote, '\"');
    
        os << "Wallet" << 56 << NEWLINE;
    
        csv::istringstream is(os.get_text().c_str());
        is.set_delimiter(',', "$$");
        is.enable_trim_quote_on_str(enable_quote, '\"');
    
        while (is.read_line())
        {
            try
            {
                std::string dest_name = "";
                char dest_char = 0;
    
                is >> dest_name >> csv::NChar(dest_char);
    
                std::cout << dest_name << ", " 
                    << (int)dest_char << std::endl;
            }
            catch (std::runtime_error& e)
            {
                std::cerr << __FUNCTION__ << e.what() << std::endl;
            }
        }
        return true;
    }

    Display Output:

    C++
    Wallet, 56
  • 2017-09-18: Version 1.8.5:

    If the escape parameter in set_delimiter() is empty, text containing the delimiter is automatically enclosed in quotes (to be compliant with Microsoft Excel and general CSV practice).

    "Hello,World",600

    Microsoft Excel and MiniCSV read this as "Hello,World" and 600.

  • 2021-02-21: Version 1.8.5d: Fixed infinite loop in quote_unescape.
  • 2021-05-06: MiniCSV detects the end of a line by the presence of a newline, so a newline inside the string input inevitably breaks the parsing. The new version 1.8.6 takes care of this by escaping the newline.
  • 2023-03-11: v1.8.7 added set_precision(), reset_precision() and get_precision() to ostream_base for setting float/double/long double precision in the output.
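
The 1.8.0 entry above warns that std::istringstream converts "4a" to 4 without reporting an error. The sketch below reproduces that behavior with the standard library alone and shows one possible stricter check that also rejects trailing characters. The function names lenient_parse_int and strict_parse_int are hypothetical; they are not part of MiniCSV.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Mirrors the simple check: only fails when no number could be read at all.
bool lenient_parse_int(const std::string& token, int& out)
{
    std::istringstream is(token);
    is >> out;
    return !is.fail();
}

// Stricter variant: after the number, nothing but whitespace may remain.
bool strict_parse_int(const std::string& token, int& out)
{
    std::istringstream is(token);
    is >> out;
    if (is.fail()) return false;
    char leftover;
    // If another non-whitespace character can be extracted, the token had garbage.
    return !(is >> leftover);
}

void demo_parse_int()
{
    int value = 0;
    assert(lenient_parse_int("4a", value) && value == 4); // accepted silently
    assert(!strict_parse_int("4a", value));               // rejected
    assert(strict_parse_int("42", value) && value == 42);
    assert(!strict_parse_int("aa", value));
}
```

If silent truncation like this matters for your data, defining USE_BOOST_LEXICAL_CAST as shown in the 1.8.0 code above selects the boost::lexical_cast path.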

FAQ

Why does the reader stream encounter errors for CSV with text not enclosed within quotes?

Answer: To resolve it, please remember to call enable_trim_quote_on_str with false.
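
To make the quote handling described in the History section more concrete (delimiters inside quotes are ignored, and two quotes stand for one), here is a simplified sketch of quote-aware splitting. It is only an illustration of the idea under those two rules; split_quoted is a hypothetical function and MiniCSV's real parsing code may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits `line` on `delimiter`, ignoring delimiters inside quoted text.
// Surrounding quotes are dropped and a doubled quote inside a quoted
// field becomes a single literal quote.
std::vector<std::string> split_quoted(const std::string& line,
                                      char delimiter = ',',
                                      char quote = '"')
{
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;
    for (std::size_t i = 0; i < line.size(); ++i)
    {
        const char ch = line[i];
        if (ch == quote)
        {
            if (in_quotes && i + 1 < line.size() && line[i + 1] == quote)
            {
                current += quote; // "" inside quotes -> one literal quote
                ++i;
            }
            else
            {
                in_quotes = !in_quotes; // opening or closing quote
            }
        }
        else if (ch == delimiter && !in_quotes)
        {
            tokens.push_back(current); // end of one token
            current.clear();
        }
        else
        {
            current += ch;
        }
    }
    tokens.push_back(current); // last token on the line
    return tokens;
}

void demo_split_quoted()
{
    const std::vector<std::string> t =
        split_quoted("10.0,\"Bottle,Cup,Teaspoon\",123.0");
    assert(t.size() == 3);
    assert(t[0] == "10.0");
    assert(t[1] == "Bottle,Cup,Teaspoon");
    assert(t[2] == "123.0");
}
```

In this sketch, a stray quote character in unquoted text flips the in_quotes state and merges the following columns, which hints at why text with unmatched quotes can confuse a quote-trimming reader.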

Points of Interest

Recently, I encountered an interesting benchmark result when reading a 5MB file, up against a string_view CSV parser by Vincent La. You can see the effects of Short String Optimization (SSO).

Benchmark where every column is 12 chars in length

The length is within the SSO limit (24 bytes), which avoids heap allocation.

C++
csv_parser timing:113ms
MiniCSV timing:71ms
CSV Stream timing:187ms

Benchmark where every column is 30 chars in length

The length is outside the SSO limit, so memory has to be allocated on the heap! Now the string_view csv_parser wins.

C++
csv_parser timing:147ms
MiniCSV timing:175ms
CSV Stream timing:434ms
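
To check whether a given column length falls inside the SSO buffer on your own compiler, a small probe like the one below can help. It is a sketch that relies on an implementation detail (the character buffer of a short string living inside the std::string object itself), and stored_inline is a hypothetical helper name. The exact SSO capacity differs between standard library implementations, so treat the threshold as something to measure rather than assume.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns true when the character buffer lies inside the std::string
// object itself, i.e., no separate heap block is used for the text.
bool stored_inline(const std::string& s)
{
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end = object_begin + sizeof(std::string);
    const auto buffer = reinterpret_cast<std::uintptr_t>(s.data());
    return buffer >= object_begin && buffer < object_end;
}

void demo_sso()
{
    // 30 characters plus the terminator do not fit inside the string
    // object on the mainstream implementations, so the text is on the heap.
    const std::string long_column(30, 'x');
    assert(!stored_inline(long_column));
}
```

Calling stored_inline(std::string(12, 'x')) is expected to return true on current libstdc++, libc++ and MSVC, which matches the 12-char benchmark avoiding heap allocation.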

Note: I am not sure why CSV Stream is so slow in the VC++ 15.9 update.

Note: Benchmark results could be different with other C++ compilers like G++ and Clang++, which I do not have access to now.

License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Software Developer (Senior)
Singapore Singapore
Shao Voon is from Singapore. His interest lies primarily in computer graphics, software optimization, concurrency, security, and Agile methodologies.

In recent years, he shifted focus to software safety research. His hobby is writing a free C++ DirectX photo slideshow application which can be viewed here.

Comments and Discussions

 
Question: Very useful!
Member 9414740, 3-Jun-15 1:40
Hi.
I just tried it with a .csv file that uses a comma separator. I just had to define a new structure that fits my format and modify the separator type in your example code, and it worked out of the box.
You will save me a lot of time, and I will learn a lot from looking at your code.
I'm an indie dev for video games; your code will work inside a dynamic economics engine for RPG/strategy games.

Thank you a lot, Shao.