Posted 18 Feb 2017

Cinchoo ETL - Xml Reader

8 Jan 2018 | CPOL | 28 min read
Simple Xml file reader for .NET
1. Introduction

ChoETL is an open source ETL (extract, transform and load) framework for .NET. It is a code-based library for extracting data from multiple sources, transforming it, and loading it into your own data warehouse in a .NET environment. You can have data in your data warehouse in no time.

This article covers the ChoXmlReader component offered by the ChoETL framework. It is a simple utility class that extracts Xml data from a file or other source into objects.

UPDATE: Corresponding XmlWriter article can be found here.

Features:

  • Uses Xml Reader to parse Xml files in the fastest manner.
  • Stream-based parsing allows for high performance and low resource usage, and scales to data files of any size, even tens or hundreds of gigabytes.
  • Event-based data manipulation and validation allows total control over the flow of data during the bulk insert process.
  • Exposes an IEnumerable list of objects, which is often used with LINQ queries for projection, aggregation, filtration, etc.
  • Supports deferred reading.
  • Supports processing files with culture specific date, currency and number formats.
  • Supports different character encoding.
  • Recognizes a wide variety of date, currency, enum, boolean and number formats when reading files.
  • Provides fine control of date, currency, enum, boolean, number formats when writing files.
  • Detailed and robust error handling, allowing you to quickly find and fix problems.
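
To make the stream-based and deferred-reading points concrete, here is a minimal sketch of how element-at-a-time Xml streaming can be done with the standard System.Xml and System.Xml.Linq classes. It is not ChoXmlReader's implementation; the class and method names (XmlStreamingSketch, StreamElements) are hypothetical and exist only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Illustrative only: a hypothetical helper, not part of ChoETL.
public static class XmlStreamingSketch
{
    // Lazily yields each element named elementName, one at a time,
    // so only the current element is held in memory.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Calling XmlStreamingSketch.StreamElements(File.OpenText("emp.xml"), "Employee") in a foreach loop would visit each Employee element in turn without loading the whole document.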

2. Requirement

This framework library is written in C# using .NET 4.x Framework / .NET Core 2.x.

3. "Hello World!" Sample

  • Open VS.NET 2017 or higher
  • Create a sample VS.NET (.NET Framework 4.x / .NET Core 2.x) Console Application project
  • Install ChoETL via Package Manager Console using Nuget Command:
    • Install-Package ChoETL
    • Install-Package ChoETL.NETStandard
  • Use the ChoETL namespace

Let's begin with a simple example of reading an Xml file that has two columns.

Listing 3.1 Sample Xml data file

XML
<Employees>
    <Employee Id='1'>
        <Name>Tom</Name>
    </Employee>
    <Employee Id='2'>
        <Name>Mark</Name>
    </Employee>
</Employees>

There are a number of ways to get Xml file parsing started with minimal setup.

3.1. Quick load - Data First Approach

This is the zero-config, quick way to load an Xml file. No POCO object is required. The sample code below shows how to load the file.

Listing 3.1.1 Load Xml file using iterator

foreach (dynamic e in new ChoXmlReader("emp.xml"))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

Listing 3.1.2 Load Xml file using loop

var reader = new ChoXmlReader("emp.xml");
dynamic rec;
 
while ((rec = reader.Read()) != null)
{
    Console.WriteLine(rec.Id);
    Console.WriteLine(rec.Name);
}

3.2. Code First Approach

This is another zero-config way to parse and load an Xml file, this time using a POCO class. First define a simple data class to match the underlying Xml file layout.

Listing 3.2.1 Simple POCO entity class

public partial class EmployeeRec
{
    public int Id { get; set; }
    public string Name { get; set; }
}

The class above defines two properties matching the sample Xml file template.

Listing 3.2.2 Load Xml file

foreach (var e in new ChoXmlReader<EmployeeRec>("emp.xml"))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

3.3. Configuration First Approach

In this model, we define the Xml configuration with all the necessary parsing parameters along with Xml columns matching with the underlying Xml file. 

Listing 3.3.1 Define Xml configuration

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));

The configuration above defines two fields matching the sample Xml file template.

Listing 3.3.2 Load Xml file without POCO object

foreach (dynamic e in new ChoXmlReader("emp.xml", config))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

Listing 3.3.3 Load Xml file with POCO object

foreach (var e in new ChoXmlReader<EmployeeRec>("emp.xml", config))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

3.4. Code First with declarative configuration

This is the combined approach: define the POCO entity class and decorate it declaratively with the Xml configuration parameters. Id is a required column and Name is an optional column with the default value "XXXX". If Name is not present, it takes the default value.

Listing 3.4.1 Define POCO Object

public class EmployeeRec
{
    [ChoXmlNodeRecordField(XPath = "//@Id")]
    [Required]
    public int Id
    {
        get;
        set;
    }
    [ChoXmlNodeRecordField(XPath = "//Name")]
    [DefaultValue("XXXX")]
    public string Name
    {
        get;
        set;
    }
 
    public override string ToString()
    {
        return "{0}. {1}".FormatString(Id, Name);
    }
}

The code above illustrates how to define a POCO object to carry the values of each record in the input file. First, define a property for each record field and decorate it with ChoXmlNodeRecordFieldAttribute to qualify it for Xml record mapping. XPath is an optional property. If it is not specified, the framework automatically discovers and loads the value from either an Xml attribute or an Xml element. Id is decorated with RequiredAttribute, so an exception is thrown if the value is missing. Name is given a default value using DefaultValueAttribute. This means that if the Name Xml column contains an empty value in the file, it is defaulted to 'XXXX'.
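
As a rough illustration of what this mapping amounts to, the sketch below resolves an attribute and a child element relative to each Employee element using the standard System.Xml.XPath extension methods, then applies the required and default-value rules by hand. It is an assumption-laden stand-in, not how ChoXmlReader is implemented; XPathMappingSketch, Map and the relative XPath strings "@Id" and "Name" are chosen for illustration only.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Illustrative only: a hypothetical helper, not part of ChoETL.
public static class XPathMappingSketch
{
    // Returns (Id, Name) pairs. Id is required; Name falls back to
    // defaultName when the element is missing or empty.
    public static List<KeyValuePair<int, string>> Map(string xml, string defaultName)
    {
        var result = new List<KeyValuePair<int, string>>();
        foreach (XElement emp in XDocument.Parse(xml).Root.Elements("Employee"))
        {
            // "@Id" is evaluated relative to the current <Employee> element.
            XAttribute idAttr = ((IEnumerable)emp.XPathEvaluate("@Id"))
                .Cast<XAttribute>()
                .FirstOrDefault();
            if (idAttr == null)
                throw new InvalidOperationException("Id is required.");

            // "Name" selects the child element of the current <Employee>.
            XElement nameElem = emp.XPathSelectElement("Name");
            string name = (nameElem == null || string.IsNullOrWhiteSpace(nameElem.Value))
                ? defaultName
                : nameElem.Value;

            result.Add(new KeyValuePair<int, string>((int)idAttr, name));
        }
        return result;
    }
}
```

Passing the sample file content and "XXXX" as the default would yield one pair per Employee element, with "XXXX" substituted wherever Name is absent or blank.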

It is very simple and ready to extract Xml data in no time.

Listing 3.4.2 Main Method

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("emp.xml"))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

We start by creating a new instance of ChoXmlReader object. That's all. All the heavy lifting of parsing and loading Xml data stream into the objects is done by the parser under the hood.

By default, ChoXmlReader discovers and uses default configuration parameters while loading an Xml file. These can be overridden according to your needs. The following sections give details about each configuration attribute.

4. Reading All Records

Once the POCO object is set up to match the Xml file structure, you can read the whole file as an enumerable. Execution is deferred, so records are read only as you iterate. Take care with aggregate operations on the result, because they load all the records of the file into memory.

Listing 4.1 Read Xml File

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("emp.xml"))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

or:

Listing 4.2 Read Xml file stream

C#
foreach (var e in new ChoXmlReader<EmployeeRec>(textReader))
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}

This model keeps your code elegant, clean, and easy to read and maintain. It also lets you use LINQ extension methods to perform grouping, joining, projection, aggregation, etc.

Listing 4.3 Using LINQ

var list = (from o in new ChoXmlReader<EmployeeRec>("emp.xml")
           where o.Name != null && o.Name.StartsWith("R")
           select o).ToArray();
 
foreach (var e in list)
{
    Console.WriteLine(e.Id);
    Console.WriteLine(e.Name);
}
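
The difference between deferred iteration and materializing calls such as ToArray can be seen with a small stand-in source. The sketch below uses a plain C# iterator in place of the reader; DeferredReadSketch, ReadIds and RecordsProduced are hypothetical names used only to count how many records are actually pulled.

```csharp
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a stand-in for a streaming reader, not part of ChoETL.
public static class DeferredReadSketch
{
    // Counts how many records the source has actually produced so far.
    public static int RecordsProduced;

    // Yields ids 1..count lazily; nothing runs until the caller iterates.
    public static IEnumerable<int> ReadIds(int count)
    {
        for (int id = 1; id <= count; id++)
        {
            RecordsProduced++;
            yield return id;
        }
    }

    // Pulls only as many records as needed to find the first two even ids.
    public static int[] FirstTwoEven(int count)
    {
        return ReadIds(count).Where(id => id % 2 == 0).Take(2).ToArray();
    }
}
```

Calling FirstTwoEven(1000) stops pulling from the source as soon as two matches are found, whereas ReadIds(1000).ToArray() forces all 1000 records into memory at once. The same trade-off applies to any aggregate over a large file.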

5. Read Records Manually

Once the POCO object is set up to match the Xml file structure, you can also read the file one record at a time by calling Read() until it returns null.

Listing 5.1 Read Xml file

C#
var reader = new ChoXmlReader<EmployeeRec>("emp.xml");
EmployeeRec rec;
 
// Read() returns null once there are no more records.
while ((rec = reader.Read()) != null)
{
    Console.WriteLine(rec.Id);
    Console.WriteLine(rec.Name);
}

6. Customize Xml Record

Using ChoXmlRecordObjectAttribute, you can customize the POCO entity object declaratively.

Listing 6.1 Customizing POCO object for each record

C#
[ChoXmlRecordObject]
public class EmployeeRec
{
    [ChoXmlNodeRecordField(XPath = "/@Id")] 
    public int Id { get; set; }
    [ChoXmlNodeRecordField(XPath = "/Name")] 
    [Required]
    [DefaultValue("ZZZ")]
    public string Name { get; set; }
}

Here are the available attributes to carry out customization of Xml load operation on a file.

  • XPath - Optional. XPath expression used to pick the elements to load. If not specified, the member value will be discovered and loaded automatically. 
  • ColumnCountStrict - This flag indicates whether an exception should be thrown if an expected field is missing while reading.
  • ErrorMode - This flag indicates whether an exception should be thrown if an expected field fails to load while reading. This can be overridden per property. Possible values are:
    • IgnoreAndContinue - Ignore the error, record will be skipped and continue with next.
    • ReportAndContinue - Report the error to POCO entity if it is of IChoNotifyRecordRead type
    • ThrowAndStop - Throw the error and stop the execution
  • IgnoreFieldValueMode - A flag to let the reader know if a record should be skipped when reading if it's empty / null. This can be overridden per property. Possible values are:
    • Null - N/A
    • DBNull - N/A
    • Empty - skipped if the record value is empty
    • WhiteSpace - skipped if the record value contains only whitespaces
  • ObjectValidationMode - A flag to let the reader know about the type of validation to be performed with record object. Possible values are:
    • Off - No object validation performed. (Default)
    • MemberLevel - Validation performed at the time of each Xml property gets loaded with value.
    • ObjectLevel - Validation performed after all the properties are loaded to the POCO object.

8. Customize Xml Fields

For each Xml column, you can specify the mapping in the POCO entity property using ChoXmlNodeRecordFieldAttribute. Only use this attribute if you want to use a custom XPath to map to this column. Otherwise use the simpler attributes ChoXmlElementRecordFieldAttribute / ChoXmlAttributeRecordFieldAttribute.

Listing 8.1 Customizing POCO object for Xml columns

C#
public class EmployeeRec
{
    [ChoXmlNodeRecordField(XPath="/@Id")]
    public int Id { get; set; }
    [ChoXmlNodeRecordField(XPath="/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    public string Name { get; set; }
}

Here are the available members to add some customization to it for each property:

  • XPath - Optional. An XPath expression uses a path notation, like that used in URLs, to address parts of an XML document. If not specified, ChoXmlReader automatically discovers and loads the value from the XElement or XAttribute matching the field name.
  • ErrorMode - This flag indicates whether an exception should be thrown if an expected field fails to load while reading. Possible values are:
    • IgnoreAndContinue - Ignore the error and continue to load other properties of the record.
    • ReportAndContinue - Report the error to POCO entity if it is of IChoRecord type.
    • ThrowAndStop - Throw the error and stop the execution.
  • IgnoreFieldValueMode - A flag to let the reader know if a record should be skipped when reading if it's empty / null. Possible values are:
    • Null - N/A
    • DBNull - N/A
    • Empty - skipped if the record value is empty.
    • WhiteSpace - skipped if the record value contains only whitespaces. 

8.1. DefaultValue

It is the value used and set to the property when the Xml value is empty or whitespace (controlled via IgnoreFieldValueMode).

Any POCO entity property can be specified with default value using System.ComponentModel.DefaultValueAttribute.

8.2. ChoFallbackValue

It is the value used and set to the property when the Xml value failed to set. Fallback value only set when ErrorMode is either IgnoreAndContinue or ReportAndContinue.

Any POCO entity property can be specified with fallback value using ChoETL.ChoFallbackValueAttribute.

8.3. Type Converters

Most primitive types are automatically converted and set on the properties. If the value of an Xml field can't be converted automatically into the type of the property, you can specify custom or built-in .NET converters to convert the value. These can be either IValueConverter or TypeConverter converters.

There are a couple of ways to specify the converters for each field:

  • Declarative Approach
  • Configuration Approach

8.3.1. Declarative Approach

This model applies to POCO entity objects only. If you have a POCO class, you can attach converters to each property to carry out the necessary conversion. The samples below show how to do it.

Listing 8.3.1.1 Specifying type converters

C#
public class EmployeeRec
{
    [ChoXmlNodeRecordField(XPath = "/@Id")]
    [ChoTypeConverter(typeof(IntConverter))]
    public int Id { get; set; }
    [ChoXmlNodeRecordField(XPath = "/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    public string Name { get; set; }
}

Listing 8.3.1.2 IntConverter implementation

C#
public class IntConverter : IValueConverter
{
    public object Convert(object value, Type targetType, object parameter, CultureInfo culture)
    {
        return value;
    }
 
    public object ConvertBack(object value, Type targetType, object parameter, CultureInfo culture)
    {
        return value;
    }
}

The example above defines a custom IntConverter class and shows how to use it with the 'Id' Xml property.
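
The IntConverter above passes the value through unchanged. For comparison, here is a hedged sketch of a converter that does real work, written against the standard System.ComponentModel.TypeConverter base class, which the article names as the other supported converter kind. ThousandsIntConverter is a hypothetical name, and the choice to parse with the invariant culture is an assumption made for this example.

```csharp
using System;
using System.ComponentModel;
using System.Globalization;

// Illustrative only: converts text such as "1,234" to the int 1234.
public class ThousandsIntConverter : TypeConverter
{
    public override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType)
    {
        // Accept strings in addition to whatever the base class supports.
        return sourceType == typeof(string) || base.CanConvertFrom(context, sourceType);
    }

    public override object ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, object value)
    {
        var text = value as string;
        if (text == null)
            return base.ConvertFrom(context, culture, value);

        // Assumption: input always uses invariant-culture digit grouping.
        return int.Parse(text,
            NumberStyles.Integer | NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture);
    }
}
```

On its own, new ThousandsIntConverter().ConvertFrom("1,234") returns the boxed integer 1234; attaching it to a property would follow the same ChoTypeConverter pattern shown in Listing 8.3.1.1.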

8.3.2. Configuration Approach

This model is applicable to both dynamic and POCO entity object. This gives freedom to attach the converters to each property at runtime. This takes the precedence over the declarative converters on POCO classes. 

Listing 8.3.2.2 Specifying TypeConverters

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();

ChoXmlNodeRecordFieldConfiguration idConfig = new ChoXmlNodeRecordFieldConfiguration("Id", XPath: "/@Id");
idConfig.AddConverter(new IntConverter());
config.XmlRecordFieldConfigurations.Add(idConfig);

config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name", XPath: "/Name"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name1", XPath: "/Name1"));

In above, we construct and attach the IntConverter to 'Id' field using AddConverter helper method in ChoXmlNodeRecordFieldConfiguration object.

Likewise, if you want to remove any converter from it, you can use RemoveConverter on ChoXmlNodeRecordFieldConfiguration object.

8.4. Validations

ChoXmlReader leverages both System.ComponentModel.DataAnnotations and Validation Block validation attributes to specify validation rules for individual fields of POCO entity. Refer to the MSDN site for a list of available DataAnnotations validation attributes.

Listing 8.4.1 Using validation attributes in POCO entity

C#
[ChoXmlRecordObject]
public partial class EmployeeRec
{
    [ChoXmlNodeRecordField(FieldName = "id", XPath = "/@Id")]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    [ChoFallbackValue(1)]
    public int Id { get; set; }
 
    [ChoXmlNodeRecordField(FieldName = "Name", XPath = "/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
}

The example above uses the Range validation attribute on the Id property and the Required validation attribute on the Name property. ChoXmlReader validates them during load when Configuration.ObjectValidationMode is set to ChoObjectValidationMode.MemberLevel or ChoObjectValidationMode.ObjectLevel.
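
To show what member-level and object-level validation mean in terms of the standard DataAnnotations API, the sketch below calls System.ComponentModel.DataAnnotations.Validator directly. It only illustrates the two granularities; how ChoXmlReader invokes validation internally is not shown in this article. ValidatedEmployee and ValidationSketch are hypothetical names.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Illustrative only: a plain class carrying the same two validation attributes.
public class ValidatedEmployee
{
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    public int Id { get; set; }

    [Required]
    public string Name { get; set; }
}

public static class ValidationSketch
{
    // Object level: validate every annotated property after the object is populated.
    public static List<ValidationResult> ValidateObject(ValidatedEmployee rec)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(rec, new ValidationContext(rec, null, null), results, true);
        return results;
    }

    // Member level: validate one candidate value against one property's attributes.
    public static List<ValidationResult> ValidateMember(ValidatedEmployee rec, string memberName, object value)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(rec, null, null) { MemberName = memberName };
        Validator.TryValidateProperty(value, context, results);
        return results;
    }
}
```

ValidateObject reports every failing property of a fully loaded record in one pass, while ValidateMember checks a single value as it is about to be assigned, which corresponds to the distinction the two ObjectValidationMode values describe.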

Sometimes you may want to override the declarative validation behavior that comes with the POCO class. Cinchoo ETL lets you do this via the configuration approach. The sample below shows how to override it.

static void ValidationOverridePOCOTest()
{
    ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
    var idConfig = new ChoXmlNodeRecordFieldConfiguration("Id", XPath: "/@Id");
    idConfig.Validators = new ValidationAttribute[] { new RequiredAttribute() };
    config.XmlRecordFieldConfigurations.Add(idConfig);
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name", XPath: "/Name"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Salary", XPath: "/Salary"));
 
    using (var parser = new ChoXmlReader<EmployeeRecWithCurrency>("emp.xml", config))
    {
        object rec;
        while ((rec = parser.Read()) != null)
        {
            Console.WriteLine(rec.ToStringEx());
        }
    }
}
 
public class EmployeeRecWithCurrency
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public ChoCurrency Salary { get; set; }
}

In some cases, you may want to take control and perform manual self-validation within the POCO entity class. This can be achieved by implementing the IChoValidatable interface on the POCO object.

Listing 8.4.2 Manual validation on POCO entity

[ChoXmlRecordObject]
public partial class EmployeeRec : IChoValidatable
{
    [ChoXmlNodeRecordField(FieldName = "id", XPath = "/@Id")]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    [ChoFallbackValue(1)]
    public int Id { get; set; }
 
    [ChoXmlNodeRecordField(FieldName = "Name", XPath = "/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
 
    public bool TryValidate(object target, ICollection<ValidationResult> validationResults)
    {
        return true;
    }
 
    public bool TryValidateFor(object target, string memberName, ICollection<ValidationResult> validationResults)
    {
        return true;
    }
 
    public void Validate(object target)
    {
    }
 
    public void ValidateFor(object target, string memberName)
    {
    }
}

The sample above shows how to implement custom self-validation in a POCO object.

The IChoValidatable interface exposes the following methods:

  • TryValidate - Validate entire object, return true if all validation passed. Otherwise return false.
  • Validate - Validate entire object, throw exception if validation is not passed.
  • TryValidateFor - Validate specific property of the object, return true if all validation passed. Otherwise return false.
  • ValidateFor - Validate specific property of the object, throw exception if validation is not passed.

10. Callback Mechanism

ChoXmlReader offers industry-standard Xml parsing out of the box, which handles most parsing needs. If the parsing does not cover a particular need, you can use the callback mechanism offered by ChoXmlReader to handle such situations. To participate in the callback mechanism, you can use any of the following models:

  • Using event handlers exposed by ChoXmlReader via the IChoReader interface.
  • Implementing the IChoNotifyRecordRead / IChoNotifyFileRead / IChoNotifyRecordFieldRead interfaces on the POCO entity object.
  • Implementing the IChoNotifyRecordRead / IChoNotifyFileRead / IChoNotifyRecordFieldRead interfaces on the DataAnnotation MetadataType type object.
  • Implementing the IChoNotifyRecordConfigurable / IChoNotifyRecordFieldConfigurable configuration interfaces.

Note: Any exceptions raised out of these interface methods will be ignored.

IChoReader exposes the below events:

  • BeginLoad - Invoked at the beginning of the Xml file load
  • EndLoad - Invoked at the end of the Xml file load
  • BeforeRecordLoad - Raised before the Xml record load
  • AfterRecordLoad - Raised after Xml record load
  • RecordLoadError - Raised when Xml record load errors out
  • BeforeRecordFieldLoad - Raised before Xml field value load
  • AfterRecordFieldLoad - Raised after Xml field value load
  • RecordFieldLoadError - Raised when Xml field value errors out
  • SkipUntil - Raised before the Xml parsing kicks off to add custom logic to skip record lines.
  • DoWhile - Raised during Xml parsing where you can add custom logic to stop the parsing.

IChoNotifyRecordRead exposes the below methods:

  • BeforeRecordLoad - Raised before the Xml record load
  • AfterRecordLoad - Raised after Xml record load
  • RecordLoadError - Raised when Xml record load errors out

IChoNotifyFileRead exposes the below methods:

  • BeginLoad - Invoked at the beginning of the Xml file load
  • EndLoad - Invoked at the end of the Xml file load
  • SkipUntil - Raised before the Xml parsing kicks off to add custom logic to skip record lines.
  • DoWhile - Raised during Xml parsing where you can add custom logic to stop the parsing.

IChoNotifyRecordFieldRead exposes the below methods:

  • BeforeRecordFieldLoad - Raised before Xml field value load
  • AfterRecordFieldLoad - Raised after Xml field value load
  • RecordFieldLoadError - Raised when Xml field value errors out

IChoNotifyRecordConfigurable exposes the below methods:

  • RecordConfigure - Raised for Xml record configuration

IChoNotifyRecordFieldConfigurable exposes the below methods:

  • RecondFieldConfigure - Raised for each Xml record field configuration

10.1. Using ChoXmlReader events

This is the most direct and simplest way to subscribe to the callback events and handle odd situations in parsing Xml files. The downside is that the code is not reusable the way it is when you implement IChoNotifyRecordRead on the POCO record object.

Sample below shows how to use the BeforeRecordLoad callback method to skip nodes having Id < 100.

static void IgnoreNodeTest()
{
    using (var parser = new ChoXmlReader("emp.xml"))
    {

        parser.BeforeRecordLoad += (o, e) =>
        {
            if (e.Source != null)
            {
                // Attribute is a method; the explicit cast converts the attribute value to int.
                e.Skip = (int)((XElement)e.Source).Attribute("Id") < 100;
            }
        };
        foreach (var e in parser)
            Console.WriteLine(e.Dump());
    }
}

Likewise you can use other callback methods as well with ChoXmlReader.
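
For readers unfamiliar with this style of hook, the sketch below shows the general pattern behind a "before load" event with a Skip flag, using only standard .NET events and LINQ to XML. MiniReader and BeforeLoadArgs are hypothetical types written for this illustration; they only mirror the Source and Skip names used above and say nothing about ChoXmlReader's internals.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical event args: only the Source and Skip names mirror the sample above.
public class BeforeLoadArgs : EventArgs
{
    public XElement Source { get; set; }
    public bool Skip { get; set; }
}

// Hypothetical mini reader, not ChoXmlReader.
public class MiniReader
{
    private readonly XElement _root;

    public event EventHandler<BeforeLoadArgs> BeforeRecordLoad;

    public MiniReader(string xml)
    {
        _root = XElement.Parse(xml);
    }

    // Raises BeforeRecordLoad for each child node and honors the Skip flag.
    public IEnumerable<XElement> Read()
    {
        foreach (XElement node in _root.Elements())
        {
            var args = new BeforeLoadArgs { Source = node };
            var handler = BeforeRecordLoad;
            if (handler != null)
                handler(this, args);

            if (args.Skip)
                continue; // the subscriber asked to drop this node

            yield return node;
        }
    }
}
```

A subscriber that sets e.Skip when the Id attribute is below 100 would cause Read() to yield only the remaining nodes, which is the behavior the IgnoreNodeTest sample relies on.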

10.2. Implementing IChoNotifyRecordRead interface

The sample below shows how to implement the IChoNotifyRecordRead interface directly on the POCO class.

Listing 10.2.1 Direct POCO callback mechanism implementation

C#
[ChoXmlRecordObject]
public partial class EmployeeRec : IChoNotifyRecordRead
{
    [ChoXmlNodeRecordField(FieldName = "Id", XPath = "/@Id")]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    [ChoFallbackValue(1)]
    public int Id { get; set; }
    
    [ChoXmlNodeRecordField(FieldName = "Name", XPath = "/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
 
    public bool AfterRecordLoad(object target, int index, object source)
    {
        throw new NotImplementedException();
    }
 
    public bool BeforeRecordLoad(object target, int index, ref object source)
    {
        throw new NotImplementedException();
    }
 
    public bool RecordLoadError(object target, int index, object source, Exception ex)
    {
        throw new NotImplementedException();
    }
}

Sample below shows how to attach Metadata class to POCO class by using MetadataTypeAttribute on it.

Listing 10.2.2 MetaDataType based callback mechanism implementation

C#
[ChoXmlRecordObject]
public class EmployeeRecMeta : IChoNotifyRecordRead
{
    [ChoXmlNodeRecordField(FieldName = "Id", XPath = "/@Id")]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    [ChoFallbackValue(1)]
    public int Id { get; set; }

    [ChoXmlNodeRecordField(FieldName = "Name", XPath = "/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
 
    public bool AfterRecordLoad(object target, int index, object source)
    {
        throw new NotImplementedException();
    }
 
    public bool BeforeRecordLoad(object target, int index, ref object source)
    {
        throw new NotImplementedException();
    }
 
    public bool RecordLoadError(object target, int index, object source, Exception ex)
    {
        throw new NotImplementedException();
    }
} 

[MetadataType(typeof(EmployeeRecMeta))]
public partial class EmployeeRec
{
    public int Id { get; set; }
    public string Name { get; set; }
}

Sample below shows how to attach Metadata class for sealed or third party POCO class by using ChoMetadataRefTypeAttribute on it.

Listing 10.2.3 ChoMetaDataRefType based callback mechanism implementation

C#
[ChoMetadataRefType(typeof(EmployeeRec))]
[ChoXmlRecordObject]
public class EmployeeRecMeta : IChoNotifyRecordRead
{
    [ChoXmlNodeRecordField(FieldName = "Id", XPath = "/@Id")]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    [ChoFallbackValue(1)]
    public int Id { get; set; }

    [ChoXmlNodeRecordField(FieldName = "Name", XPath = "/Name")]
    [Required]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
 
    public bool AfterRecordLoad(object target, int index, object source)
    {
        throw new NotImplementedException();
    }
 
    public bool BeforeRecordLoad(object target, int index, ref object source)
    {
        throw new NotImplementedException();
    }
 
    public bool RecordLoadError(object target, int index, object source, Exception ex)
    {
        throw new NotImplementedException();
    }
} 

public partial class EmployeeRec
{
    public int Id { get; set; }
    public string Name { get; set; }
}

10.3. BeginLoad

This callback is invoked once at the beginning of the Xml file load. source is the Xml file stream object. Here you have the chance to inspect the stream. Return true to continue the Xml load, or false to stop the parsing.

Listing 10.3.1 BeginLoad Callback Sample

public bool BeginLoad(object source)
{
    StreamReader sr = source as StreamReader;
    return true;
}

10.4. EndLoad

This callback is invoked once at the end of the Xml file load. source is the Xml file stream object. Here you have the chance to inspect the stream and perform any post-load steps on it.

Listing 10.4.1 EndLoad Callback Sample

public void EndLoad(object source)
{
    StreamReader sr = source as StreamReader;
}

10.5. BeforeRecordLoad

This callback is invoked before each record in the Xml file is loaded. target is the instance of the POCO record object. index is the line index in the file. source is the Xml record node (an XNode). Here you have the chance to inspect the node and override it with a new one if you want to.

TIP: If you want to skip the XNode from loading, set the source to null.

TIP: If you want to take control of parsing and loading the record properties by yourself, set the source to String.Empty. 

Return true to continue the load process, otherwise return false to stop the process.

Listing 10.5.1 BeforeRecordLoad Callback Sample

public bool BeforeRecordLoad(object target, int index, ref object source)
{
    XNode node = source as XNode; 
    return true;
}

10.6. AfterRecordLoad

This callback is invoked after each record in the Xml file is loaded. target is the instance of the POCO record object. index is the line index in the file. source is the Xml record node (an XNode). Here you have the chance to perform any post-load operation on the record.

Return true to continue the load process, otherwise return false to stop the process.

Listing 10.6.1 AfterRecordLoad Callback Sample

public bool AfterRecordLoad(object target, int index, object source)
{
    XNode node = source as XNode;
    return true;
}

10.7. RecordLoadError

This callback is invoked if an error is encountered while loading a record. target is the instance of the POCO record object. index is the line index in the file. source is the Xml record node (an XNode). ex is the exception object. Here you have the chance to handle the exception. This method is invoked only when Configuration.ErrorMode is ReportAndContinue.

Return true to continue the load process, otherwise return false to stop the process.

Listing 10.7.1 RecordLoadError Callback Sample

public bool RecordLoadError(object target, int index, object source, Exception ex)
{
    XNode node = source as XNode;
    return true;
}

10.8. BeforeRecordFieldLoad

This callback is invoked before each Xml record column is loaded. target is the instance of the POCO record object. index is the line index in the file. propName is the Xml record property name. value is the Xml column value. Here you have the chance to inspect the Xml record property value and perform any custom validations.

Return true to continue the load process, otherwise return false to stop the process.

Listing 10.8.1 BeforeRecordFieldLoad Callback Sample

public bool BeforeRecordFieldLoad(object target, int index, string propName, ref object value)
{
    return true;
}

10.9. AfterRecordFieldLoad

This callback is invoked after each Xml record column is loaded. target is the instance of the POCO record object. index is the line index in the file. propName is the Xml record property name. value is the Xml column value. Any post-field operation can be performed here, such as computing other properties or running validations.

Return true to continue the load process, otherwise return false to stop the process.

Listing 10.9.1 AfterRecordFieldLoad Callback Sample

public bool AfterRecordFieldLoad(object target, int index, string propName, object value)
{
    return true;
}

10.10. RecordFieldLoadError

This callback is invoked when an error is encountered while loading an Xml record column value. target is the instance of the POCO record object. index is the line index in the file. propName is the Xml record property name. value is the Xml column value. ex is the exception object. Here you have the chance to handle the exception. This method is invoked only after ChoXmlReader has performed the following two steps in sequence:

  • ChoXmlReader looks for the FallbackValue of the Xml property. If present, it tries to assign that value to the property.
  • If the FallbackValue is not present and Configuration.ErrorMode is specified as ReportAndContinue, this callback is executed.

Return true to continue the load process, otherwise return false to stop the process.

Listing 10.10.1 RecordFieldLoadError Callback Sample

public bool RecordFieldLoadError(object target, int index, string propName, object value, Exception ex)
{
    return true;
}
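
The order of decisions described in this section and in section 8.2 can be summarized as a small function. This is a reading of the article's prose, not code from the library: the enum and class names below are hypothetical, and the treatment of ThrowAndStop follows the statement that a fallback value is applied only under IgnoreAndContinue or ReportAndContinue.

```csharp
// Hypothetical names; they mirror the ErrorMode values listed earlier.
public enum ErrorModeSketch { IgnoreAndContinue, ReportAndContinue, ThrowAndStop }

public enum FieldErrorOutcome { UseFallbackValue, InvokeCallback, Ignore, Throw }

public static class FieldErrorFlowSketch
{
    // Decides what happens when a field value fails to load.
    public static FieldErrorOutcome Decide(bool hasFallbackValue, ErrorModeSketch mode)
    {
        // Assumption from section 8.2: fallback values are not applied under ThrowAndStop.
        if (mode == ErrorModeSketch.ThrowAndStop)
            return FieldErrorOutcome.Throw;

        // Step 1: a fallback value, when present, is tried first.
        if (hasFallbackValue)
            return FieldErrorOutcome.UseFallbackValue;

        // Step 2: without a fallback, the callback runs only for ReportAndContinue.
        if (mode == ErrorModeSketch.ReportAndContinue)
            return FieldErrorOutcome.InvokeCallback;

        return FieldErrorOutcome.Ignore;
    }
}
```

Under this reading, RecordFieldLoadError is reached only for the combination of no fallback value and ReportAndContinue.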

10. Customization

ChoXmlReader automatically detects and loads the configured settings from the POCO entity. At runtime, you can customize and tweak these parameters before Xml parsing. ChoXmlReader exposes a Configuration property of type ChoXmlRecordConfiguration. Using this property, you can customize them.

Listing 10.1 Customizing ChoXmlReader at run-time

C#
class Program
{
    static void Main(string[] args)
    {
        using (var parser = new ChoXmlReader<EmployeeRec>("emp.xml"))
        {
            object row = null;
  
            parser.Configuration.ColumnCountStrict = true;
            while ((row = parser.Read()) != null)
                Console.WriteLine(row.ToString());
        }
    }
}

11. AsDataReader Helper Method

ChoXmlReader exposes the AsDataReader helper method to retrieve the Xml records as a .NET data reader object. Data readers are forward-only streams of data. This data reader can be used in a few places, such as bulk copying data to a database using SqlBulkCopy or loading a disconnected DataTable.

Listing 11.1 Reading as DataReader sample

static void AsDataReaderTest()
{
    using (var parser = new ChoXmlReader<EmployeeRec>("emp.xml"))
    {
        IDataReader dr = parser.AsDataReader();
        while (dr.Read())
        {
            Console.WriteLine("Id: {0}, Name: {1}", dr[0], dr[1]);
        }
    }
}
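
One of the uses mentioned above, loading a disconnected DataTable, needs only the standard System.Data API once you hold an IDataReader. The sketch below is a minimal illustration, not part of ChoETL: the DataReaderDemo class and its method names are hypothetical, and it builds its own reader from an in-memory table as a stand-in for the one returned by parser.AsDataReader(), so that it runs without an Xml file.

```csharp
using System;
using System.Data;

// Hypothetical helper class, used only for this illustration.
public static class DataReaderDemo
{
    // Stand-in source: in the article's scenario this reader would come from
    // parser.AsDataReader(). Here it is built in memory so the sketch runs alone.
    public static IDataReader CreateSampleReader()
    {
        var source = new DataTable("Employee");
        source.Columns.Add("Id", typeof(int));
        source.Columns.Add("Name", typeof(string));
        source.Rows.Add(1, "Tom");
        source.Rows.Add(2, "Mark");
        return source.CreateDataReader();
    }

    // Copies every row of a forward-only reader into a disconnected DataTable.
    // DataTable.Load infers the columns from the reader's schema.
    public static DataTable LoadToTable(IDataReader reader)
    {
        var table = new DataTable("Employee");
        table.Load(reader);
        return table;
    }
}
```

Calling DataReaderDemo.LoadToTable(DataReaderDemo.CreateSampleReader()) returns a table with the two sample rows. Passing the reader from AsDataReader instead should behave the same way, assuming it implements IDataReader as Listing 11.1 suggests.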

12. AsDataTable Helper Method

ChoXmlReader exposes the AsDataTable helper method to retrieve the Xml records as a .NET DataTable object. It can then be persisted to disk, displayed in a grid or other controls, or stored in memory like any other object.

Listing 12.1 Reading as DataTable sample

static void AsDataTableTest()
{
    using (var parser = new ChoXmlReader<EmployeeRec>("emp.xml"))
    {
        DataTable dt = parser.AsDataTable();
        foreach (DataRow dr in dt.Rows)
        {
            Console.WriteLine("Id: {0}, Name: {1}", dr[0], dr[1]);
        }
    }
}

13. Using Dynamic Object

So far, the article has explained how to use ChoXmlReader with a POCO object. ChoXmlReader also supports loading an Xml file without a POCO object. It leverages the .NET dynamic feature.

If you have an Xml file, you can parse and load it with minimal or zero configuration.

The sample below shows how to read an Xml stream without a POCO object:

Listing 13.1 Loading Xml file

C#
class Program
{
    static void Main(string[] args)
    {
        dynamic row;
        using (var parser = new ChoXmlReader("emp.xml"))
        {
            while ((row = parser.Read()) != null)
            {
                Console.WriteLine(row.Id);
            }
        }
    }
}

The above example automatically discovers the Xml elements/attributes and parses the file.

You can override the default behavior of discovering columns automatically by adding field configurations manually and passing them to ChoXmlReader for parsing the file.

The sample below shows how to do it:

Listing 13.3 Loading Xml file with configuration

C#
class Program
{
    static void Main(string[] args)
    {
        ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
        config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id", XPath: "/@Id"));
        config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name", XPath: "/Name"));

        dynamic row;
        using (var parser = new ChoXmlReader("Emp.xml", config))
        {
            while ((row = parser.Read()) != null)
            {
                Console.WriteLine(row.Name);
            }
        }
    }
}

To completely turn off the auto column discovery, you will have to set ChoXmlRecordConfiguration.AutoDiscoverColumns to false.

13.1. DefaultValue

It is the value used and set to the property when the Xml value is empty or whitespace (controlled via IgnoreFieldValueMode).

Any POCO entity property can be specified with default value using System.ComponentModel.DefaultValueAttribute.

For dynamic object members or to override the declarative POCO object member's default value specification, you can do so through configuration as shown below.

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name") { DefaultValue = "NoName" });

13.2. ChoFallbackValue

It is the value used and set to the property when the Xml value fails to be set. The fallback value is only set when ErrorMode is either IgnoreAndContinue or ReportAndContinue.

Any POCO entity property can be given a fallback value using ChoETL.ChoFallbackValueAttribute.

For dynamic object members, or to override the declarative POCO object member's fallback values, you can do so through configuration as shown below.

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name") { FallbackValue = "Tom" });

13.3. FieldType

In the typeless dynamic object model, the reader reads each individual field value and populates the dynamic object members with 'string' values. If you want to enforce the type and do extra type checking during load, you can do so by declaring the field type in the field configuration.

Listing 13.3.1 Defining FieldType

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id") { FieldType = typeof(int) });
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));

The above sample shows how to define the field type of the 'Id' field as 'int'. This instructs ChoXmlReader to parse and convert the value to an integer before assigning it. This extra type safety prevents incorrect values from being loaded into the object while parsing.

13.4. Type Converters

Most of the primitive types are automatically converted and set to the properties by ChoXmlReader. If the value of the Xml field can't automatically be converted into the type of the property, you can specify custom or built-in .NET converters to convert the value. These can be either IValueConverter or TypeConverter converters.

In the dynamic object model, you can specify these converters via configuration. The example below shows the approach taken to specify type converters for Xml columns.

Listing 13.4.1 Specifying TypeConverters

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();

ChoXmlNodeRecordFieldConfiguration idConfig = new ChoXmlNodeRecordFieldConfiguration("Id");
idConfig.AddConverter(new IntConverter());
config.XmlRecordFieldConfigurations.Add(idConfig);

config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name1"));

In above, we construct and attach the IntConverter to 'Id' field using AddConverter helper method in ChoXmlNodeRecordFieldConfiguration object.

Likewise, if you want to remove any converter from it, you can use RemoveConverter on ChoXmlNodeRecordFieldConfiguration object.
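
To make the TypeConverter option more concrete, here is a minimal sketch of a custom converter built on the standard System.ComponentModel.TypeConverter base class. The ThousandsIntConverter name is hypothetical and is not part of ChoETL; it is an assumption, based on the text above, that such a converter could be attached with AddConverter in the same way as IntConverter.

```csharp
using System;
using System.ComponentModel;
using System.Globalization;

// Hypothetical converter: turns text such as "1,234" into the int 1234.
public class ThousandsIntConverter : TypeConverter
{
    public override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType)
    {
        // Accept strings; defer to the base class for everything else.
        return sourceType == typeof(string) || base.CanConvertFrom(context, sourceType);
    }

    public override object ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, object value)
    {
        var text = value as string;
        if (text != null)
        {
            // Allow thousands separators and a leading sign while parsing.
            return int.Parse(
                text,
                NumberStyles.AllowThousands | NumberStyles.AllowLeadingSign,
                culture ?? CultureInfo.InvariantCulture);
        }
        return base.ConvertFrom(context, culture, value);
    }
}
```

Used on its own, new ThousandsIntConverter().ConvertFrom(null, CultureInfo.InvariantCulture, "1,234") yields the boxed integer 1234.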

13.5. Validations

ChoXmlReader leverages both System.ComponentModel.DataAnnotations and Validation Block validation attributes to specify validation rules for individual Xml fields. Refer to the MSDN site for a list of available DataAnnotations validation attributes.

Listing 13.5.1 Specifying Validations

ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.ThrowAndStopOnMissingField = false;

ChoXmlNodeRecordFieldConfiguration idConfig = new ChoXmlNodeRecordFieldConfiguration("Id");
idConfig.Validators = new ValidationAttribute[] { new RangeAttribute(0, 100) };
config.XmlRecordFieldConfigurations.Add(idConfig);

config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name1"));

In the example above, we used the Range validation attribute for the Id property. ChoXmlReader performs validation during load when Configuration.ObjectValidationMode is set to ChoObjectValidationMode.MemberLevel or ChoObjectValidationMode.ObjectLevel.

PS: Self validation is NOT supported in the dynamic object model.
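
For readers unfamiliar with DataAnnotations, the sketch below shows what a Range(0, 100) rule does when evaluated directly through the standard Validator class. It illustrates the attribute's behavior only; how ChoXmlReader invokes the validators internally is not shown in this article, and the RangeCheckDemo class is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical helper class, used only for this illustration.
public static class RangeCheckDemo
{
    // Returns true when the value satisfies Range(0, 100); otherwise adds
    // one ValidationResult describing the failure to 'results'.
    public static bool IsValidId(object id, ICollection<ValidationResult> results)
    {
        var context = new ValidationContext(new object(), null, null) { MemberName = "Id" };
        var rules = new ValidationAttribute[] { new RangeAttribute(0, 100) };
        return Validator.TryValidateValue(id, context, results, rules);
    }
}
```

With this helper, IsValidId(50, results) returns true, while IsValidId(150, results) returns false and records one validation result.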

14. Working with sealed POCO object

If you already have an existing sealed POCO object, or the object is in a 3rd party library, you can use it with ChoXmlReader.

Listing 14.1 Existing sealed POCO Object

C#
public sealed class ThirdPartyRec
{
    public int Id
    {
        get;
        set;
    }
    public string Name
    {
        get;
        set;
    }
}

Listing 14.2 Consuming Xml file

C#
class Program
{
    static void Main(string[] args)
    {
        using (var parser = new ChoXmlReader<ThirdPartyRec>("Emp.xml"))
        {
            object row = null;
 
            while ((row = parser.Read()) != null)
                Console.WriteLine(row.ToString());
        }
    }
}

In this case, ChoXmlReader reverse-discovers the Xml columns from the Xml file and loads the data into the POCO object. If the Xml file structure and the POCO object match, the load succeeds, populating all corresponding data into its properties. If a property is missing for any Xml column, ChoXmlReader silently ignores it and continues with the rest.

You can override this behavior by setting the ChoXmlRecordConfiguration.ThrowAndStopOnMissingField property to true. In this case, ChoXmlReader throws a ChoMissingRecordFieldException if a property is missing for an Xml column.

15. Exceptions

ChoXmlReader throws different types of exceptions in different situations.

  • ChoParserException - Raised when the Xml file is bad and the parser is not able to recover.
  • ChoRecordConfigurationException - Raised when any invalid configuration settings are specified.
  • ChoMissingRecordFieldException - Raised when a property is missing for an Xml column.

17. Using MetadataType Annotation

Cinchoo ETL works well with the data annotations MetadataType model. It is a way to attach a metadata class to a data model class. In this associated class, you provide additional metadata information that is not in the data model. Its role is to add attributes to a class without having to modify that class. You add the MetadataType attribute, which takes a single parameter (the type of the metadata class), to the data model class; the metadata class then holds all the attributes. This is useful when the POCO classes are auto-generated (by Entity Framework, MVC, etc.) by automated tools, because you can add new behavior without touching the generated file. It also promotes modularization by separating the concerns into multiple classes.

For more information about it, please search in MSDN.

Listing 17.1 MetadataType annotation usage sample

[MetadataType(typeof(EmployeeRecMeta))]
public class EmployeeRec
{
    public int Id { get; set; }
    public string Name { get; set; }
}

[ChoXmlRecordObject]
public class EmployeeRecMeta : IChoNotifyRecordRead, IChoValidatable
{
    [ChoXmlNodeRecordField(FieldName = "id", ErrorMode = ChoErrorMode.ReportAndContinue )]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    [ChoFallbackValue(1)]
    public int Id { get; set; }

    [ChoXmlNodeRecordField(FieldName = "Name")]
    [StringLength(1)]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
 
    public bool AfterRecordFieldLoad(object target, int index, string propName, object value)
    {
        throw new NotImplementedException();
    }
 
    public bool AfterRecordLoad(object target, int index, object source)
    {
        throw new NotImplementedException();
    }
 
    public bool BeforeRecordFieldLoad(object target, int index, string propName, ref object value)
    {
        throw new NotImplementedException();
    }
 
    public bool BeforeRecordLoad(object target, int index, ref object source)
    {
        throw new NotImplementedException();
    }
 
    public bool BeginLoad(object source)
    {
        throw new NotImplementedException();
    }
 
    public void EndLoad(object source)
    {
        throw new NotImplementedException();
    }
 
    public bool RecordFieldLoadError(object target, int index, string propName, object value, Exception ex)
    {
        throw new NotImplementedException();
    }
 
    public bool RecordLoadError(object target, int index, object source, Exception ex)
    {
        throw new NotImplementedException();
    }
 
    public bool TryValidate(object target, ICollection<ValidationResult> validationResults)
    {
        return true;
    }
 
    public bool TryValidateFor(object target, string memberName, ICollection<ValidationResult> validationResults)
    {
        return true;
    }
 
    public void Validate(object target)
    {
    }
 
    public void ValidateFor(object target, string memberName)
    {
    }
}

In the above, EmployeeRec is the data class. It contains only domain-specific properties and operations, which keeps it a very simple class to look at.

We separate the validation, callback mechanism, configuration, etc. into the metadata type class, EmployeeRecMeta.

18. Configuration Choices

If the POCO entity class is auto-generated, exposed via a library, or sealed, you cannot attach the Xml schema definition to it declaratively. In such cases, you can choose one of the options below to specify the Xml layout configuration:

  • Manual Configuration
  • Auto Map Configuration
  • Attaching MetadataType class

I'm going to show you how to configure the POCO entity class below using each approach.

Listing 18.1 Sealed POCO entity class

C#
public sealed class EmployeeRec
{
    public int Id { get; set; }
    public string Name { get; set; }
}

18.1 Manual Configuration

Define a brand new configuration object from scratch and add all the necessary Xml fields to the ChoXmlRecordConfiguration.XmlRecordFieldConfigurations collection property. This option gives you greater flexibility to control the configuration of Xml parsing. The downside is that it is easy to make mistakes, and the configuration is hard to manage if the Xml file layout is large.

Listing 18.1.1 Manual Configuration

C#
ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.ThrowAndStopOnMissingField = true;
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id"));
config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));

18.2 Auto Map Configuration

This is an alternative, much less error-prone approach that auto maps the Xml columns for the POCO entity class.

First, define a schema class for the EmployeeRec POCO entity class as below.

Listing 18.2.1 Auto Map class

public class EmployeeRecMap
{
    [ChoXmlNodeRecordField(FieldName = "Id")]
    public int Id { get; set; }
 
    [ChoXmlNodeRecordField(FieldName = "Name")]
    public string Name { get; set; }
}

Then you can use it to auto map the Xml columns by using the ChoXmlRecordConfiguration.MapRecordFields method.

Listing 18.2.2 Using Auto Map configuration

C#
ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
config.MapRecordFields<EmployeeRecMap>();

foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml", config)) 
    Console.WriteLine(e.ToString());

18.3 Attaching MetadataType class

This is another approach: attaching a MetadataType class to the POCO entity object. The previous approach takes care of auto mapping the Xml columns only. Other configuration properties like property converters, parser parameters, default/fallback values, etc. are not considered.

This model accounts for everything by defining a MetadataType class and specifying the Xml configuration parameters declaratively. This is useful when your POCO entity is sealed and not a partial class. It is also one of the more favorable and less error-prone approaches to configuring Xml parsing of a POCO entity.

Listing 18.3.1 Define MetadataType class

[ChoXmlRecordObject]
public class EmployeeRecMeta : IChoNotifyRecordRead, IChoValidatable
{
    [ChoXmlNodeRecordField(FieldName = "Id", ErrorMode = ChoErrorMode.ReportAndContinue )]
    [ChoTypeConverter(typeof(IntConverter))]
    [Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
    public int Id { get; set; }

    [ChoXmlNodeRecordField(FieldName = "Name")]
    [StringLength(1)]
    [DefaultValue("ZZZ")]
    [ChoFallbackValue("XXX")]
    public string Name { get; set; }
 
    public bool AfterRecordLoad(object target, int index, object source)
    {
        throw new NotImplementedException();
    }
 
    public bool BeforeRecordLoad(object target, int index, ref object source)
    {
        throw new NotImplementedException();
    }
 
    public bool RecordLoadError(object target, int index, object source, Exception ex)
    {
        throw new NotImplementedException();
    }
 
    public bool TryValidate(object target, ICollection<ValidationResult> validationResults)
    {
        return true;
    }
 
    public bool TryValidateFor(object target, string memberName, ICollection<ValidationResult> validationResults)
    {
        return true;
    }
 
    public void Validate(object target)
    {
    }
 
    public void ValidateFor(object target, string memberName)
    {
    }
}

Listing 18.3.2 Attaching MetadataType class

//Attach metadata 
ChoMetadataObjectCache.Default.Attach<EmployeeRec>(new EmployeeRecMeta());

foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml")) 
    Console.WriteLine(e.ToString());

19. LoadText Helper Method

This is a nifty little helper method to parse and load an Xml text string into objects.

Listing 19.1 Using LoadText method

string txt = @"
            <Employees>
                <Employee Id='1'>
                    <Name>Tom</Name>
                </Employee>
                <Employee Id='2'>
                    <Name>Mark</Name>
                </Employee>
            </Employees>
        ";
foreach (var e in ChoXmlReader.LoadText(txt))
   Console.WriteLine(e.ToStringEx());

20. Advanced Topics

20.1 Override Converters Format Specs

Cinchoo ETL automatically parses and converts each Xml column value to the corresponding Xml column's underlying data type seamlessly. Most of the basic .NET types are handled automatically without any setup needed.

This is achieved through two key settings in the ETL system:

  1. ChoXmlRecordConfiguration.Culture - Represents information about a specific culture including the names of the culture, the writing system, and the calendar used, as well as access to culture-specific objects that provide information for common operations, such as formatting dates and sorting strings. Default is 'en-US'.
  2. ChoTypeConverterFormatSpec - A global format specifier class that holds the formatting specs for all the intrinsic .NET types.

In this section, I'm going to talk about changing the default format specs for each .NET intrinsic data type according to parsing needs.

ChoTypeConverterFormatSpec is a singleton class; the instance is exposed via the 'Instance' static member. It is thread local, meaning that a separate instance copy is kept on each thread.

Each intrinsic type is given two format spec members, one for loading and another for writing the value, except for the Boolean, Enum and DateTime types. These types have only one member for both loading and writing operations.

Specifying an intrinsic data type's format specs through ChoTypeConverterFormatSpec has a system-wide impact. For example, setting ChoTypeConverterFormatSpec.IntNumberStyle = NumberStyles.AllowParentheses allows parentheses for all integer members of Xml objects. If you want to override this global setting for a specific Xml data member so that it handles its own parsing of the Xml value, specify a TypeConverter at the Xml field member level. Refer to section 13.4 for more information.

NumberStyles (optional) are used for loading values from the Xml stream, and format strings are used for writing values to the Xml stream.

In this article I'll briefly cover using NumberStyles for loading Xml data from a stream. These values are optional. They determine the styles permitted for each type during parsing of the Xml file. The system automatically figures out how to parse and load the values from the underlying Culture. In unusual situations, you may want to override and set the styles yourself in order to load the file successfully. Refer to MSDN for more about NumberStyles and its values.

Listing 20.1.1 ChoTypeConverterFormatSpec Members

public class ChoTypeConverterFormatSpec
{
    public static readonly ThreadLocal<ChoTypeConverterFormatSpec> Instance = new ThreadLocal<ChoTypeConverterFormatSpec>(() => new ChoTypeConverterFormatSpec());
 
    public string DateTimeFormat { get; set; }
    public ChoBooleanFormatSpec BooleanFormat { get; set; }
    public ChoEnumFormatSpec EnumFormat { get; set; }
 
    public NumberStyles? CurrencyNumberStyle { get; set; }
    public string CurrencyFormat { get; set; }
 
    public NumberStyles? BigIntegerNumberStyle { get; set; }
    public string BigIntegerFormat { get; set; }
 
    public NumberStyles? ByteNumberStyle { get; set; }
    public string ByteFormat { get; set; }
 
    public NumberStyles? SByteNumberStyle { get; set; }
    public string SByteFormat { get; set; }
 
    public NumberStyles? DecimalNumberStyle { get; set; }
    public string DecimalFormat { get; set; }
 
    public NumberStyles? DoubleNumberStyle { get; set; }
    public string DoubleFormat { get; set; }
 
    public NumberStyles? FloatNumberStyle { get; set; }
    public string FloatFormat { get; set; }
 
    public string IntFormat { get; set; }
    public NumberStyles? IntNumberStyle { get; set; }
 
    public string UIntFormat { get; set; }
    public NumberStyles? UIntNumberStyle { get; set; }
 
    public NumberStyles? LongNumberStyle { get; set; }
    public string LongFormat { get; set; }
 
    public NumberStyles? ULongNumberStyle { get; set; }
    public string ULongFormat { get; set; }
 
    public NumberStyles? ShortNumberStyle { get; set; }
    public string ShortFormat { get; set; }
 
    public NumberStyles? UShortNumberStyle { get; set; }
    public string UShortFormat { get; set; }
}

The sample below shows how to load an Xml data stream containing 'sv-SE' (Swedish) culture-specific data using ChoXmlReader. The input feed also comes with 'EmployeeNo' values containing parentheses. To make the load successful, we have to set ChoTypeConverterFormatSpec.IntNumberStyle to NumberStyles.AllowParentheses.

Listing 20.1.2 Using ChoTypeConverterFormatSpec in code

static void UsingFormatSpecs()
{
    ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
    // "sv-SE" is the .NET culture name for Swedish (Sweden)
    config.Culture = new System.Globalization.CultureInfo("sv-SE");
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id") { FieldType = typeof(int) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Salary") { FieldType = typeof(ChoCurrency) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("JoinedDate") { FieldType = typeof(DateTime) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("EmployeeNo") { FieldType = typeof(int) });
 
    ChoTypeConverterFormatSpec.Instance.IntNumberStyle = NumberStyles.AllowParentheses;
 
    using (var parser = new ChoXmlReader("Emp.xml", config))
    {
        object row = null;
 
        while ((row = parser.Read()) != null)
            Console.WriteLine(row.ToStringEx());
    }
}
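
To see what NumberStyles.AllowParentheses means for an integer value in isolation, the sketch below calls the standard int.Parse overload directly. It does not use ChoETL; the NumberStyleDemo class and ParseEmployeeNo method are hypothetical names for illustration.

```csharp
using System.Globalization;

// Hypothetical helper class, used only for this illustration.
public static class NumberStyleDemo
{
    public static int ParseEmployeeNo(string text)
    {
        // AllowParentheses accepts one pair of parentheses around the digits
        // and treats the enclosed number as negative.
        return int.Parse(text, NumberStyles.AllowParentheses, CultureInfo.InvariantCulture);
    }
}
```

Here ParseEmployeeNo("(123)") returns -123. With the default integer style, the same input would raise a FormatException, which is why the setting matters for feeds that contain parenthesized numbers.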

20.2 Currency Support

Cinchoo ETL provides ChoCurrency object to read and write currency values in Xml files. ChoCurrency is a wrapper class to hold the currency value in decimal type along with support of serializing them in text format during Xml load. 

Listing 20.2.1 Using Currency members in dynamic model

static void CurrencyDynamicTest()
{
    ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Salary") { FieldType = typeof(ChoCurrency) });
 
    using (var parser = new ChoXmlReader("Emp.xml", config))
    {
        object rec;
        while ((rec = parser.Read()) != null)
        {
            Console.WriteLine(rec.ToStringEx());
        }
    }
}

The sample above shows how to load currency values using the dynamic object model. By default, all the members of a dynamic object are treated as string type, unless specified explicitly via ChoXmlNodeRecordFieldConfiguration.FieldType. By specifying the field type as ChoCurrency for the 'Salary' Xml field, ChoXmlReader loads the values as currency objects.

PS: The format of the currency value is determined by ChoXmlReader through ChoRecordConfiguration.Culture and ChoTypeConverterFormatSpec.CurrencyNumberStyle.

The sample below shows how to use a ChoCurrency Xml field in a POCO entity class.

Listing 20.2.2 Using Currency members in POCO model

public class EmployeeRecWithCurrency
{
    public int Id { get; set; }
    public string Name { get; set; }
    public ChoCurrency Salary { get; set; }
}
 
static void CurrencyTest()
{
    using (var parser = new ChoXmlReader<EmployeeRecWithCurrency>("Emp.xml"))
    {
        object rec;
        while ((rec = parser.Read()) != null)
        {
            Console.WriteLine(rec.ToStringEx());
        }
    }
}

20.3 Enum Support

Cinchoo ETL implicitly handles parsing of enum column values from Xml files. If you want fine control over the parsing of these values, you can specify the format globally via ChoTypeConverterFormatSpec.EnumFormat. The default is ChoEnumFormatSpec.Value.

FYI, changing this value has a system-wide impact.

There are 3 possible values:

  1. ChoEnumFormatSpec.Value - The enum value is used for parsing.
  2. ChoEnumFormatSpec.Name - The enum key name is used for parsing.
  3. ChoEnumFormatSpec.Description - If each enum key is decorated with DescriptionAttribute, its value will be used for parsing.

Listing 20.3.1 Specifying Enum format specs during parsing

public enum EmployeeType
{
    [Description("Full Time Employee")]
    Permanent = 0,
    [Description("Temporary Employee")]
    Temporary = 1,
    [Description("Contract Employee")]
    Contract = 2
}

static void EnumTest()
{
    ChoTypeConverterFormatSpec.Instance.EnumFormat = ChoEnumFormatSpec.Description;
 
    ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id") { FieldType = typeof(int) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Salary") { FieldType = typeof(ChoCurrency) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("JoinedDate") { FieldType = typeof(DateTime) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("EmployeeType") { FieldType = typeof(EmployeeType) });
 
    ChoTypeConverterFormatSpec.Instance.IntNumberStyle = NumberStyles.AllowParentheses;
 
    using (var parser = new ChoXmlReader("Emp.xml", config))
    {
        object row = null;
 
        while ((row = parser.Read()) != null)
            Console.WriteLine(row.ToStringEx());
    }
}
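
The three enum formats correspond to three different text representations of the same member. The sketch below resolves each representation by hand with standard .NET reflection, to show what kind of input each format expects. It is an illustration only and not ChoETL's implementation; EmployeeKind and EnumFormatDemo are hypothetical names.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

// Hypothetical enum mirroring the shape of EmployeeType in Listing 20.3.1.
public enum EmployeeKind
{
    [Description("Full Time Employee")]
    Permanent = 0,
    [Description("Temporary Employee")]
    Temporary = 1,
    [Description("Contract Employee")]
    Contract = 2
}

public static class EnumFormatDemo
{
    // Value form: the text holds the numeric value, e.g. "1".
    public static EmployeeKind FromValue(string text)
    {
        return (EmployeeKind)Enum.ToObject(typeof(EmployeeKind), int.Parse(text));
    }

    // Name form: the text holds the member name, e.g. "Contract".
    public static EmployeeKind FromName(string text)
    {
        return (EmployeeKind)Enum.Parse(typeof(EmployeeKind), text, true);
    }

    // Description form: the text holds the DescriptionAttribute value.
    public static EmployeeKind FromDescription(string text)
    {
        foreach (FieldInfo field in typeof(EmployeeKind).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            var attr = (DescriptionAttribute)Attribute.GetCustomAttribute(field, typeof(DescriptionAttribute));
            if (attr != null && attr.Description == text)
                return (EmployeeKind)field.GetValue(null);
        }
        throw new ArgumentException("No enum member has description: " + text);
    }
}
```

So an Xml value of "1", "Temporary" or "Temporary Employee" can each identify the same member, depending on which format is in effect.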

20.4 Boolean Support

Cinchoo ETL implicitly handles parsing of boolean Xml column values from Xml files. If you want fine control over the parsing of these values, you can specify the format globally via ChoTypeConverterFormatSpec.BooleanFormat. The default value is ChoBooleanFormatSpec.ZeroOrOne.

FYI, changing this value has a system-wide impact.

There are 4 possible values:

  1. ChoBooleanFormatSpec.ZeroOrOne - '0' for false. '1' for true.
  2. ChoBooleanFormatSpec.YOrN - 'Y' for true, 'N' for false.
  3. ChoBooleanFormatSpec.TrueOrFalse - 'True' for true, 'False' for false.
  4. ChoBooleanFormatSpec.YesOrNo - 'Yes' for true, 'No' for false.

Listing 20.4.1 Specifying boolean format specs during parsing

static void BoolTest()
{
    ChoTypeConverterFormatSpec.Instance.BooleanFormat = ChoBooleanFormatSpec.ZeroOrOne;
 
    ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id") { FieldType = typeof(int) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Salary") { FieldType = typeof(ChoCurrency) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("JoinedDate") { FieldType = typeof(DateTime) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Active") { FieldType = typeof(bool) });
 
    ChoTypeConverterFormatSpec.Instance.IntNumberStyle = NumberStyles.AllowParentheses;
 
    using (var parser = new ChoXmlReader("Emp.xml", config))
    {
        object row = null;
 
        while ((row = parser.Read()) != null)
            Console.WriteLine(row.ToStringEx());
    }
}

20.5 DateTime Support

Cinchoo ETL implicitly handles parsing of datetime Xml column values from Xml files using the system culture or a custom culture. If you want fine control over the parsing of these values, you can specify the format globally via ChoTypeConverterFormatSpec.DateTimeFormat. The default value is 'd'.

FYI, changing this value has a system-wide impact.

You can use any valid standard or custom datetime .NET format specification to parse the datetime Xml values from the file.

Listing 20.5.1 Specifying datetime format specs during parsing

static void DateTimeTest()
{
    ChoTypeConverterFormatSpec.Instance.DateTimeFormat = "MMM dd, yyyy";
 
    ChoXmlRecordConfiguration config = new ChoXmlRecordConfiguration();
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Id") { FieldType = typeof(int) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Name"));
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Salary") { FieldType = typeof(ChoCurrency) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("JoinedDate") { FieldType = typeof(DateTime) });
    config.XmlRecordFieldConfigurations.Add(new ChoXmlNodeRecordFieldConfiguration("Active") { FieldType = typeof(bool) });
 
    ChoTypeConverterFormatSpec.Instance.IntNumberStyle = NumberStyles.AllowParentheses;
 
    using (var parser = new ChoXmlReader("Emp.xml", config))
    {
        object row = null;
 
        while ((row = parser.Read()) != null)
            Console.WriteLine(row.ToStringEx());
    }
}

The sample above shows how to parse custom datetime Xml values from an Xml file.

Note: As the datetime values contain the Xml separator, they are enclosed in double quotes so that they parse correctly.
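
As a quick check of what the "MMM dd, yyyy" format string accepts, the sketch below parses a value with the standard DateTime.ParseExact method. It is independent of ChoETL; DateFormatDemo and ParseJoinedDate are hypothetical names, and the invariant culture is used so that month abbreviations are the English ones.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, used only for this illustration.
public static class DateFormatDemo
{
    public static DateTime ParseJoinedDate(string text)
    {
        // "MMM" is the abbreviated month name, "dd" the two-digit day,
        // and "yyyy" the four-digit year.
        return DateTime.ParseExact(text, "MMM dd, yyyy", CultureInfo.InvariantCulture);
    }
}
```

For example, ParseJoinedDate("Jan 15, 2018") returns 15 January 2018, while a value such as "2018-01-15" would throw a FormatException under this format.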

20.6 CDATA Support

Cinchoo ETL implicitly handles parsing of CDATA Xml values from Xml files. The CDATA values can be loaded either as a string object or as a ChoCDATA object.

Listing 20.6.1 Loading CDATA as string object

public class EmployeeRec
{
    [ChoXmlNodeRecordField()]
    [Required]
    public int Id
    {
        get;
        set;
    }
    [ChoXmlNodeRecordField()]
    [DefaultValue("XXXX")]
    public string Name
    {
        get;
        set;
    }
    [ChoXmlNodeRecordField()]
    [DefaultValue("XXXX")]
    public string Message
    {
        get;
        set;
    }
 
    public override string ToString()
    {
        return "{0}. {1}.".FormatString(Id, Name);
    }
}

ChoETL offers the ChoCDATA class to hold the CDATA Xml value in native format. ChoXmlReader automatically handles the parsing of this value and loads it accordingly.

Listing 20.6.2 Loading CDATA as native CDATA object itself

public class EmployeeRec
{
    [ChoXmlNodeRecordField()]
    [Required]
    public int Id
    {
        get;
        set;
    }
    [ChoXmlNodeRecordField()]
    [DefaultValue("XXXX")]
    public string Name
    {
        get;
        set;
    }
    [ChoXmlNodeRecordField()]
    [DefaultValue("XXXX")]
    public ChoCDATA Message
    {
        get;
        set;
    }
 
    public override string ToString()
    {
        return "{0}. {1}.".FormatString(Id, Name);
    }
}
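
For reference, the sketch below shows what a CDATA section looks like in an Xml record and how its raw text reads back using the standard System.Xml.Linq API. The sample document and the CDataDemo class are hypothetical; how ChoXmlReader maps the Message element onto the classes above is assumed from the listings rather than demonstrated here.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class, used only for this illustration.
public static class CDataDemo
{
    // Hypothetical record: the Message element wraps markup-like text in CDATA
    // so that '<' and '&' need no escaping.
    public const string Sample =
        "<Employee Id='1'><Name>Tom</Name>" +
        "<Message><![CDATA[<b>Hello</b> & welcome]]></Message></Employee>";

    // Returns the text content of Message, with the CDATA wrapper removed.
    public static string ReadMessageText()
    {
        XElement message = XElement.Parse(Sample).Element("Message");
        return message.Value;
    }

    // Reports whether the Message content was stored as a CDATA node.
    public static bool MessageIsCData()
    {
        XElement message = XElement.Parse(Sample).Element("Message");
        return message.Nodes().OfType<XCData>().Any();
    }
}
```

Reading the value as a plain string loses the fact that it was CDATA; a dedicated wrapper type such as ChoCDATA keeps that distinction.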

21. Fluent API

ChoXmlReader exposes a few frequently used configuration parameters via fluent API methods. This makes programming the parsing of Xml files quicker.

21.1 WithXPath

This API method sets the XPath expression that selects the nodes to load using ChoXmlReader.

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml").WithXPath("Employees/Employee"))
    Console.WriteLine(e.ToString());
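
To illustrate which nodes the expression "Employees/Employee" selects, the sketch below evaluates the same XPath with the standard System.Xml.XPath extensions over an in-memory document. It does not call ChoETL; XPathDemo and SelectNames are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper class, used only for this illustration.
public static class XPathDemo
{
    // Returns the Name of every Employee element found under the Employees root.
    public static List<string> SelectNames(string xml)
    {
        return XDocument.Parse(xml)
            .XPathSelectElements("Employees/Employee")
            .Select(e => (string)e.Element("Name"))
            .ToList();
    }
}
```

Given the Employees document from Listing 19.1, SelectNames returns the names "Tom" and "Mark", one per selected Employee node.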

21.2 WithXmlNamespaceManager

This API method sets the Xml namespace manager, which provides scope management for the namespaces. It stores prefixes and namespaces as strings. It is used with Xml nodes whose XPath references namespace-qualified element and attribute names.

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml").WithXmlNamespaceManager(ns))
    Console.WriteLine(e.ToString());
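
The ns variable in the call above is not defined in the snippet. A minimal sketch of how such an object could be built with the standard System.Xml.XmlNamespaceManager class follows; the NamespaceManagerDemo class is hypothetical, and the "dd" prefix and URI are borrowed from this article's namespace example purely for illustration.

```csharp
using System;
using System.Xml;

public static class NamespaceManagerDemo
{
    // Builds a manager that maps the "dd" prefix to a namespace URI.
    public static XmlNamespaceManager Build()
    {
        var ns = new XmlNamespaceManager(new NameTable());
        ns.AddNamespace("dd", "http://www.cinchoo.com/dd");
        return ns;
    }

    public static void Main()
    {
        XmlNamespaceManager ns = Build();
        // Resolves the prefix back to the URI it was registered with.
        Console.WriteLine(ns.LookupNamespace("dd"));
    }
}
```

An object built this way is the kind of argument the namespace manager method expects. A prefix used in an XPath expression has to be registered in the manager before the expression can be resolved.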

21.3 WithXmlNamespace

This API method adds an Xml namespace to the Xml name table. Use it when the XPath references namespace-qualified element and attribute names.

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml").WithXmlNamespace("dd", "http://www.cinchoo.com/dd"))
    Console.WriteLine(e.ToString());

21.4 WithFields

This API method specifies the list of Xml nodes (either attributes or elements) to be considered for parsing and loading. Other fields in the Xml nodes will be discarded.

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml").WithFields("Id", "Name"))
    Console.WriteLine(e.ToString());

21.5 WithField

This API method adds an Xml node together with its XPath, data type and other parameters. It is helpful in the dynamic object model, where each Xml node is specified individually with the appropriate data type.

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml").WithField("Id", "/Id", typeof(int)))
    Console.WriteLine(e.ToString());

21.6 ColumnCountStrict

This API method sets ChoXmlReader to check the column count before reading the Xml file.

C#
foreach (var e in new ChoXmlReader<EmployeeRec>("Emp.xml").ColumnCountStrict())
    Console.WriteLine(e.ToString());

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Cinchoo
United States
