FlatMapper

9 Feb 2016 · CPOL · 4 min read

Introduction

FlatMapper is a library to import and export data from and to plain text files.

Plain text files are still widely used on legacy systems and remain a favorite format for human-to-system interfaces. This project was born from a need to read and write data from plain text files. I needed a lightweight module that would do just that, with no needless extra weight. Since most of the libraries I found at the time were either code-intrusive or had extra dependencies, I decided to write my own.

My goal was a library with a nice fluent API that is simple to use, fast, and has minimal dependencies and dependents. It must work with any POCO and should not be bloated with features that are not part of the core. I had a great time developing it (I still do).

Since the project that gave rise to this library, I have used it on two other projects.

Key Features

  • Fast - Uses Static Reflection and Dynamic methods
  • LINQ Compatible
  • Supports character-delimited and fixed-length files
  • Non intrusive - You don't have to change your code. Any POCO will work
  • No external dependencies
  • Iterative reads - Doesn't need to load the entire file into memory
  • Multi-line support (Only on character delimited and quoted)
  • Nullables support
  • Enums support
  • Virtually any type is supported through FieldValueConverters
  • Fluent Interface
  • Per line/record Error handling
  • Simple to use

How to Install

To install FlatMapper, run the following command in the Package Manager Console:

Install-Package Flatmapper

How to Use

Before we start reading or writing files, we need to specify the layout of the file. We only need to do this once.
Imagine the following scenario: we need to read text files into, and write text files from, instances of this class:

C#
public class TestObject
{
    public int Id { get; set; }
    public string Description { get; set; }
    public int? NullableInt { get; set; }
    public Gender? NullableEnum { get; set; }
    public DateTime Date { get; set; }
}

The following sections show how to set up the file layout for both fixed-length and character-delimited files.

Fixed Length Layout

C#
    // CultureInfo requires: using System.Globalization;
    var layout = new Layout<TestObject>.FixedLengthLayout()
        .HeaderLines(1)
        .WithMember(o => o.Id, set => set.WithLength(5).WithLeftPadding('0'))
        .WithMember(o => o.Description, set => set.WithLength(25).WithRightPadding(' '))
        .WithMember(o => o.NullableInt, set => set.WithLength(5).AllowNull("=Null").WithLeftPadding('0'))
        .WithMember(o => o.NullableEnum, set => set.WithLength(10).AllowNull("======NULL").WithLeftPadding(' '))
        // pt-PT default dates are always 19 characters wide, e.g. "13-12-2015 23:41:41"
        .WithMember(o => o.Date, set => set.WithLength(19).WithFormat(new CultureInfo("pt-PT")));
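To make the column widths above concrete, here is a minimal sketch of how one record could be laid out by hand under those widths (5 + 25 + 5 + 10 + 19 = 64 characters). It does not use FlatMapper, and the library's exact output is not confirmed here. The `Fit` helper and the sample values are hypothetical, and an explicit date format stands in for the pt-PT culture so the result does not depend on the runtime's culture data. It is written as a C# 9+ top-level program.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of FlatMapper): pads or truncates a value
// so that it occupies exactly `length` characters.
static string Fit(string value, int length, char padding, bool padLeft)
{
    if (value.Length > length) value = value.Substring(0, length);
    return padLeft ? value.PadLeft(length, padding) : value.PadRight(length, padding);
}

int id = 7;
var date = new DateTime(2015, 12, 13, 23, 41, 41);

string line =
    Fit(id.ToString(CultureInfo.InvariantCulture), 5, '0', padLeft: true) + // Id, zero-padded on the left
    Fit("Sample item", 25, ' ', padLeft: false) +                           // Description, space-padded on the right
    "=Null" +                                                               // NullableInt null marker (5 chars)
    "======NULL" +                                                          // NullableEnum null marker (10 chars)
    date.ToString("dd-MM-yyyy HH:mm:ss", CultureInfo.InvariantCulture);     // Date (19 chars)

Console.WriteLine(line);
Console.WriteLine(line.Length); // 64
```

Every record has the same total width, which is why each member in a fixed-length layout needs an explicit length and why the null markers are exactly as wide as their columns.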

Delimited Layout

C#
    var layout = new Layout<TestObject>.DelimitedLayout()
                    .WithDelimiter(";")
                    .WithQuote("\"")
                    .HeaderLines(1)
                    .WithMember(o => o.Id, set => set.WithLength(5))
                    .WithMember(o => o.Description, set => set.WithLength(25))
                    .WithMember(o => o.NullableInt, set => set.AllowNull("NULL"))
                    .WithMember(o => o.NullableEnum, set => set.AllowNull("NULL"))
                    .WithMember(o => o.Date, set => set.WithFormat(new CultureInfo("pt-PT")));

With this setup, it is also possible to have multi-line fields, as long as they are quoted.
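As an illustration of why quoting makes multi-line fields possible, the sketch below splits one record on `;` while treating everything between double quotes as field content. This is a hypothetical, simplified splitter, not FlatMapper's parser: `SplitRecord` is an invented name, and escaped quotes inside a field are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical splitter (not FlatMapper's implementation).
// Assumption: quotes are never escaped inside a field.
static List<string> SplitRecord(string record, char delimiter = ';', char quote = '"')
{
    var result = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;

    foreach (char c in record)
    {
        if (c == quote)
        {
            inQuotes = !inQuotes;           // quotes mark field boundaries and are dropped
        }
        else if (c == delimiter && !inQuotes)
        {
            result.Add(current.ToString()); // an unquoted delimiter ends the field
            current.Clear();
        }
        else
        {
            current.Append(c);              // keeps newlines and delimiters found inside quotes
        }
    }

    result.Add(current.ToString());         // last field has no trailing delimiter
    return result;
}

var parts = SplitRecord("1;\"first line\nsecond; line\";NULL");
Console.WriteLine(parts.Count); // 3
Console.WriteLine(parts[1]);    // the quoted field, newline and ';' included
```

The newline and the `;` inside the quoted field stay part of the second value, so the record still yields three fields.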

Reading and Writing

Reading is iterative: data is only read when a new item is requested. This avoids loading the entire file into memory before parsing it. Data is parsed on demand.
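The on-demand behavior described above is the same idea as a C# iterator. The following sketch is a stand-in written for illustration, not FlatMapper's code: `ReadIds` is a hypothetical function that parses one integer per line, and the counter shows that only the requested lines are ever read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

int linesRead = 0;

// Hypothetical lazy reader: touches the underlying TextReader only when
// the caller asks for the next item.
IEnumerable<int> ReadIds(TextReader reader)
{
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
        linesRead++;
        yield return int.Parse(line);
    }
}

var source = new StringReader("1\n2\n3\n4\n5");
var firstTwo = ReadIds(source).Take(2).ToList();

Console.WriteLine(string.Join(",", firstTwo)); // 1,2
Console.WriteLine(linesRead);                  // 2
```

Because the sequence is lazy, LINQ operators such as `Take` or `Where` can be chained onto it without the remaining lines being read first.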

This library builds on the `Stream` class of the core framework. As a result, there are no restrictions on the encoding, and freeing resources is outside the scope of the library: the caller owns and disposes the stream.

C#
//Reading data
using (var fileStream = File.OpenRead(@"c:\temp\data.txt")) // verbatim string, so backslashes are not escape sequences
{
    var flatfile = new FlatFile<TestObject>(layout, fileStream);
    foreach(var objectInstance in flatfile.Read())
    {
        //Do Something....
    }
}

//Writing data
using (var fileStream = File.OpenWrite(@"c:\temp\data.txt")) // verbatim string, so backslashes are not escape sequences
{
    var flatfile = new FlatFile<TestObject>(layout, fileStream);
    flatfile.Write(listOfObjects);
}

Error Handling

Optionally, you can control, per line/object instance, what happens when an error is thrown because of an unexpected format or for any other reason.
Pass a `Func<string, Exception, bool>` as the `handleEntryReadError` parameter of the `FlatFile<T>` constructor. Every time an input error occurs, that function is called with the offending line and the `Exception` that was thrown. If the function returns `true`, the `Exception` is ignored and the import continues. Otherwise, the original `Exception` is re-thrown.

C#
private bool HandleEntryReadError(string line, Exception exception)
{
    Log.LogError("Error reading line :" + line, exception);
    return true;
}

/* ... */
var flatfile = new FlatFile<TestObject>(layout, fileStream, HandleEntryReadError);
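As a variation on the handler above, here is a minimal sketch of a handler that collects the failing lines and gives up after a fixed number of errors. The names `failedLines`, `maxErrors` and `tolerantHandler` are invented for this example, and the handler is called directly to simulate read errors, so the snippet runs without the library.

```csharp
using System;
using System.Collections.Generic;

var failedLines = new List<string>();
const int maxErrors = 2;

// Same shape as the handleEntryReadError parameter: return true to skip
// the bad line, false to let the original exception propagate.
Func<string, Exception, bool> tolerantHandler = (line, exception) =>
{
    failedLines.Add(line);
    return failedLines.Count <= maxErrors;
};

// Simulated read errors. With the library, the delegate would be passed as
// the third constructor argument instead:
//   new FlatFile<TestObject>(layout, fileStream, tolerantHandler)
bool first = tolerantHandler("bad;line;1", new FormatException());
bool second = tolerantHandler("bad;line;2", new FormatException());
bool third = tolerantHandler("bad;line;3", new FormatException());

Console.WriteLine($"{first} {second} {third}"); // True True False
Console.WriteLine(failedLines.Count);           // 3
```

Capping the number of tolerated errors keeps a badly malformed file from being imported silently while still allowing the occasional bad line to be skipped.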

Plays Well With Others

One of the core philosophies behind this library is that other libraries already excel at their own goals, so this should not be a do-everything library but a single-purpose one that follows standards and plays well with others.

An example of that is how this library can work with Dapper in order to import and export data from the database.

C#
public void ImportData(IDbConnection connection)
{
    using (var file = File.OpenRead(ImportFile))
    {
        var flatfile = new FlatFile<TestObject>(layout, file);
        var itemsEnumerable = flatfile.Read();
        connection.Execute("spInsertItem", itemsEnumerable);
    }
}

public void ExportData(IDbConnection connection)
{
    using (var file = File.OpenWrite(ImportFile))
    {
        var flatfile = new FlatFile<TestObject>(layout, file);
        var query = "select Id, Description, NullableInt from TestDataTable";
        var items = connection.Query<TestObject>(query);
        flatfile.Write(items);
    }
}

Since it works with any POCO and there is no code intrusion of any kind, there is no obvious reason why it would not work just as well with Entity Framework or any other micro or full-fledged ORM.

Performance

Basic performance tests are available at https://github.com/kappy/FlatMapper.PerformanceTests.
For the test, I used a record with 3 fields (very similar to the previous examples), repeated 1,000,000 times.
The generated files were about 22 MB, and these were the results:

[Image: Performance test results]

The test was run on an i7 Q740 @ 1.73 GHz with an SSD and 6 GB of RAM.
This test used the NuGet package. In more recent tests, I found that a local build from source with VS 2015 is noticeably faster, possibly because of a different compiler version.

The Road to 1.0

Before the 1.0 release, there are a few things I would like to have in the library.

  • Async (for this to actually work, it will need to work in blocks, dropping the iterative reads feature)

Ideas

Some other ideas for after 1.0.

  • Per line/record layout, using a discriminator as a base.

License and Contributions

Contributions are welcome, whether bug fixes, new features, or just filing issues with ideas and suggestions. Any feedback is appreciated.

This library is open source and licensed under Apache License 2.0.

Thanks

  • I need to thank my wife for all those nights when my attention was on the computer and not on her;
  • My company, Mindbus, which not only allowed me to use my library on live projects but also helped me fine-tune it;
  • My colleague Nuno Santos, who originally wrote the multi-line parser.

This article was originally posted at https://github.com/kappy/FlatMapper/wiki

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer, Mindbus
Portugal
João Matos Silva is an enthusiastic software developer at Mindbus, with 10 years of experience in .NET as well as experience in team leadership and software architecture, mainly focused on the .NET and ASP.NET platforms.
An open source enthusiast, he has a public profile on GitHub (https://github.com/kappy) with some projects of his own and contributions to other major open source projects.
