XML Serialization – Tips & Tricks

2 Apr 2010 | CPOL | 5 min read
This article shows solutions to some of the common problems related to working with XML Serialization.

Let’s say we have an XSD representing a library with a list of books and employees in it.

XML
<?xml version="1.0" encoding="utf-8"?>

<xsd:schema id="Lib"
           targetNamespace="http://schemas.ali.com/lib/"
           elementFormDefault="qualified"
           xmlns="http://schemas.ali.com/lib/"
           xmlns:mstns="http://schemas.ali.com/lib/"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           version="1.0"
           attributeFormDefault="unqualified">

  <xsd:element name="Library" type="LibraryType" />

  <xsd:complexType name="LibraryType">
    <xsd:all>
      <xsd:element name="Books" type="BooksType" minOccurs="0" maxOccurs="1" />
      <xsd:element name="Employees" type="EmployeesType" minOccurs="0" maxOccurs="1" />
    </xsd:all>
  </xsd:complexType>

  <xsd:complexType name="BooksType">
    <xsd:sequence minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="Book" type="BookType" />
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="EmployeesType">
    <xsd:sequence minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="Employee" type="EmployeeType" />
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="BookType">
    <xsd:attribute name="Title" type="xsd:string" use="required" />
    <xsd:attribute name="Author" type="xsd:string" use="optional" />
  </xsd:complexType>

  <xsd:complexType name="EmployeeType">
    <xsd:attribute name="Name" type="xsd:string" use="required" />
  </xsd:complexType>

</xsd:schema>

Here is a sample XML using this XSD:

XML
<?xml version="1.0" encoding="utf-8" ?>

<Library xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Ali"/>
    <Book Title="Book 2" Author="Sara"/>
  </Books>
  <Employees>
    <Employee Name="Ali"/>
    <Employee Name="Sara"/>
  </Employees>
</Library>

Tip 1 – Generating Code from XSD

We’d like to have an object representation of this XML. Thus, we’ll use the XML Schema Definition tool to generate .NET C# code from the XSD, as follows:

  • Start Visual Studio Command Prompt
  • Run this command: xsd "path to XSD file" /language:CS /classes /outputdir:"path to output directory"

As a result, a .cs file is generated in the output directory. Take a look at the file GeneratedLibrary.cs.

Tip 2 – Using List<T> instead of array

The <Books> and <Employees> elements are generated as arrays. I’d like to change those arrays to List<T> objects so that adding items doesn’t mean worrying about the size of the array and expanding it by hand. Take a look at the modified class LibraryWithLists.cs. This is still not good enough, though: to create a library with one book, I’d have to write the following code:

C#
LibraryType library = new LibraryType();
library.Books = new BooksType();
library.Books.Book = new List<BookType>();
BookType newBook = new BookType();
newBook.Title = "Book 1";
newBook.Author = "Author 1";
library.Books.Book.Add(newBook);

But that doesn’t seem neat enough. I want to write something like:

C#
LibraryType library = new LibraryType();
library.Books = new List<BookType>();
BookType newBook = new BookType();
newBook.Title = "Book 1";
newBook.Author = "Author 1";
library.Books.Add(newBook);

Thus, I’ll need to get rid of the class BooksType and change the field declaration from "private BooksType booksField;" to "private List<BookType> booksField;". Making only that change is not enough, however. We also need to tell the serializer that the property is now an array whose children are array items (XmlArrayItem) and not a single XmlElement. Take a look at the resulting code in Library.cs.
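Since Library.cs is not reproduced here, the following is only a sketch of what the reshaped classes might look like. The attribute choices and the LibrarySketch helper are assumptions for illustration, not the contents of the original file. Naming the array item explicitly ("Book") keeps the element name from the XSD. If the name is left out, the serializer falls back to the type's name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical reshaped classes; the original Library.cs may differ.
[XmlRoot("Library", Namespace = "http://schemas.ali.com/lib/")]
public class LibraryType
{
    // <Books> wraps a sequence of <Book> elements.
    [XmlArray("Books")]
    [XmlArrayItem("Book")]
    public List<BookType> Books { get; set; }

    // <Employees> wraps a sequence of <Employee> elements.
    [XmlArray("Employees")]
    [XmlArrayItem("Employee")]
    public List<EmployeeType> Employees { get; set; }
}

public class BookType
{
    [XmlAttribute] public string Title { get; set; }
    [XmlAttribute] public string Author { get; set; }
}

public class EmployeeType
{
    [XmlAttribute] public string Name { get; set; }
}

// Hypothetical helper used only to try the classes out.
public static class LibrarySketch
{
    // Serializes a one-book library so the element names can be inspected.
    public static string SerializeSample()
    {
        LibraryType library = new LibraryType();
        library.Books = new List<BookType>();
        library.Books.Add(new BookType { Title = "Book 1", Author = "Author 1" });

        XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
        using (StringWriter sw = new StringWriter())
        {
            serializer.Serialize(sw, library);
            return sw.ToString();
        }
    }
}
```

Calling LibrarySketch.SerializeSample() returns the XML as a string. Because Employees is left null in this sample, no <Employees> element is written.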

Tip 3 – Serializing Object to XML

Now, we should be able to write code and generate XML from the object created. For example, writing the code below:

C#
//  Create a library
LibraryType library = new LibraryType();

//  Create Books tag
library.Books = new List<BookType>();

//  Add 5 books to the library
for (int i = 1; i <= 5; i++)
{
    BookType book = new BookType();
    book.Title = string.Format("Book {0}", i);
    book.Author = string.Format("Author {0}", i);
    library.Books.Add(book);
}

//  Create employees tag
library.Employees = new List<EmployeeType>();

//  Add 3 employees to the library
for (int i = 1; i <= 3; i++)
{
    EmployeeType employee = new EmployeeType();
    employee.Name = string.Format("Employee {0}", i);
    library.Employees.Add(employee);
}

//  Now that the object is created, serialize it and print out resulting Xml
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
StringWriter sw = new StringWriter();
serializer.Serialize(sw, library);
Console.WriteLine("Object serialized to Xml:\n\n{0}", sw.ToString());

would result in the following XML:

XML
<?xml version="1.0" encoding="utf-16"?>
<Library xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schemas.ali.com/lib/">
  <Books>
    <BookType Title="Book 1" Author="Author 1" />
    <BookType Title="Book 2" Author="Author 2" />
    <BookType Title="Book 3" Author="Author 3" />
    <BookType Title="Book 4" Author="Author 4" />
    <BookType Title="Book 5" Author="Author 5" />
  </Books>
  <Employees>
    <EmployeeType Name="Employee 1" />
    <EmployeeType Name="Employee 2" />
    <EmployeeType Name="Employee 3" />
  </Employees>
</Library>

Tip 4 – Serializing without Namespace

If you look at the serialized XML above, you’ll notice the extra xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance", xmlns:xsd="http://www.w3.org/2001/XMLSchema" and xmlns="http://schemas.ali.com/lib/". These namespace declarations are added by default. In my case, I only care about xmlns="http://schemas.ali.com/lib/", which is the target namespace of the XSD for my XML file. To get rid of the other two declarations and keep the one referring to the XSD, we’ll need to pass our own XmlSerializerNamespaces object to the Serialize() method.

C#
//  Create our own xml serializer namespace
//  Avoiding default xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
//  and xmlns:xsd="http://www.w3.org/2001/XMLSchema"
XmlSerializerNamespaces ns = new XmlSerializerNamespaces(); 

//  Add lib namespace with empty prefix
ns.Add("", "http://schemas.ali.com/lib/"); 

//  Now serialize by passing the XmlSerializerNamespaces object
//  as a parameter to the Serialize() method
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
StringWriter sw = new StringWriter();
serializer.Serialize(sw, library, ns);
Console.WriteLine("Object serialized to Xml:\n\n{0}", sw.ToString());

If you want to get rid of the namespaces altogether, you can simply write ns.Add("", "") instead of ns.Add("", "http://schemas.ali.com/lib/").

Tip 5 – Changing Encoding

If you look at the generated XML above, you’ll notice that the XML declaration sets the encoding to utf-16. That is because a StringWriter always reports UTF-16 as its encoding. To get UTF-8 instead, we need to serialize to a stream through a writer created with the encoding we want. To do this, replace the code below...

C#
StringWriter sw = new StringWriter();
serializer.Serialize(sw, library, ns);

... with:

C#
//  Serialize the object to Xml with UTF8 encoding
MemoryStream ms = new MemoryStream();
XmlTextWriter xmlTextWriter = new XmlTextWriter(ms, Encoding.UTF8);
xmlTextWriter.Formatting = Formatting.Indented;
serializer.Serialize(xmlTextWriter, library, ns);
ms = (MemoryStream)xmlTextWriter.BaseStream;
string xml = Encoding.UTF8.GetString(ms.ToArray());

To make this more generic, we can create a static method that can do serialization with any encoding.

C#
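/// <summary>
/// Serializes the object to Xml based on encoding and namespaces.
/// </summary>
/// <param name="serializer">XmlSerializer object 
/// (passing as param to avoid creating one every time)</param>
/// <param name="encoding">The encoding of the serialized Xml</param>
/// <param name="ns">The namespaces to be used by the serializer</param>
/// <param name="objectToSerialize">The object we want to serialize to Xml</param>
/// <returns>The serialized Xml as a string</returns>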
public static string Serialize(XmlSerializer serializer,
                           Encoding encoding,
                           XmlSerializerNamespaces ns,
                           object objectToSerialize)
{
    MemoryStream ms = new MemoryStream();
    XmlTextWriter xmlTextWriter = new XmlTextWriter(ms, encoding);
    xmlTextWriter.Formatting = Formatting.Indented;
    serializer.Serialize(xmlTextWriter, objectToSerialize, ns);
    ms = (MemoryStream)xmlTextWriter.BaseStream;
    return encoding.GetString(ms.ToArray());
}

Now we can write something like the below:

C#
string xml = Serialize(serializer, Encoding.UTF8, ns, library);
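If you would rather keep writing to a string, one alternative is a small StringWriter subclass that reports UTF-8 as its encoding, so the declaration reads utf-8. The sketch below is not part of the original source. Utf8StringWriter, Note and Utf8Sketch are hypothetical names introduced for illustration. Keep in mind that the .NET string itself is still UTF-16 in memory. Only the declared encoding changes, which matters once you write the string out as UTF-8 bytes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

// Stand-in type for the example; any serializable type would do.
public class Note
{
    [XmlAttribute] public string Text { get; set; }
}

public static class Utf8Sketch
{
    // Serializes a Note through the UTF-8 reporting writer.
    public static string SerializeNote()
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Note));
        using (Utf8StringWriter sw = new Utf8StringWriter())
        {
            serializer.Serialize(sw, new Note { Text = "hello" });
            return sw.ToString();
        }
    }
}
```

The string returned by Utf8Sketch.SerializeNote() starts with an XML declaration whose encoding is utf-8 rather than utf-16.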

Tip 6 – Removing XML Declaration

Let’s say you want to completely remove the XML declaration <?xml version="1.0" encoding="utf-8"?> from your serialized XML. You can do so neatly by using an XmlWriterSettings object and setting its OmitXmlDeclaration property to true. Here is how the above Serialize method would change to support this:

C#
/// <summary>
/// Serializes the object to XML based on encoding and namespaces.
/// </summary>
/// <param name="serializer">XmlSerializer object 
/// (passing as param to avoid creating one every time)</param>
/// <param name="encoding">The encoding of the serialized Xml</param>
/// <param name="ns">The namespaces to be used by the serializer</param>
/// <param name="omitDeclaration">Whether to omit Xml declaration or not</param>
/// <param name="objectToSerialize">The object we want to serialize to Xml</param>
/// <returns>The serialized Xml as a string</returns>
public static string Serialize(XmlSerializer serializer,
                               Encoding encoding,
                               XmlSerializerNamespaces ns,
                               bool omitDeclaration,
                               object objectToSerialize)
{
    MemoryStream ms = new MemoryStream();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.OmitXmlDeclaration = omitDeclaration;
    settings.Encoding = encoding;
    XmlWriter writer = XmlWriter.Create(ms, settings);
    serializer.Serialize(writer, objectToSerialize, ns);
    return encoding.GetString(ms.ToArray());
}

Thus, we can do something like the below to omit the XML declaration:

C#
string xml = Serialize(serializer, Encoding.Default, ns, true, library);

Tip 7 – Deserializing XML to Object

This functionality is really cool: it loads XML into an object I can understand and easily interact with. For example, you’d write the following code to read the contents of the XML file "Sample1.xml" and convert it to a LibraryType object.

C#
//  Read the first XML file
TextReader tr = new StreamReader("Sample1.xml");

//  Deserialize the XML file into a LibraryType object
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
LibraryType lib1 = (LibraryType)serializer.Deserialize(tr);

If you look at the object lib1 using the debugger, you’ll see that it’s properly loaded:

[Figure: the LibraryType object as shown in the debugger]

We can add a new book to this object and serialize it back to XML using the following code:

C#
if (lib1.Books == null)
{
    lib1.Books = new List<BookType>();
}

BookType newBook = new BookType();
newBook.Title = "Book 3";
lib1.Books.Add(newBook);

//  Serialize back the library type object and output Xml
StringWriter sw = new StringWriter();
serializer.Serialize(sw, lib1);
Console.WriteLine("{0}:\n\n{1}", "Sample1.xml", sw.ToString());

The resulting XML would look something like:

XML
<?xml version="1.0" encoding="utf-16"?>
<Library xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Ali" />
    <Book Title="Book 2" Author="Sara" />
    <Book Title="Book 3" />
  </Books>
  <Employees>
    <Employee Name="Ali" />
    <Employee Name="Sara" />
  </Employees>
</Library>

Tip 8 – Resolving Empty Lists Issue

Now, let’s try the same deserialization code above, but on XML file “Sample2.xml” which has no <Employees> tag. When we deserialize XML into an object then serialize back into XML, we get the following:

XML
<?xml version="1.0" encoding="utf-16"?>
<Library xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Ali" />
    <Book Title="Book 2" Author="Sara" />
  </Books>
  <Employees />
</Library>

Notice the extra <Employees/>, which we really didn’t intend to have in our XML. The likely reason is that XmlSerializer initializes all List<T> members of the object on deserialization (I verified that by watching the object in the debugger after deserialization), so we get the empty <Employees/> tag when we serialize back. There are two workarounds for this issue. The first approach is definitely better, but you might find the other approach helpful based on your needs.

Approach 1

Don’t bother with the empty lists. Just clean up their corresponding empty XML tags on serialization. The method below takes a string representation of the XML and removes all empty tags.

C#
/// <summary>
/// Deletes empty Xml tags (self-closing, with no attributes) from the passed Xml
/// </summary>
/// <param name="xml">The Xml string to clean up</param>
/// <returns>The Xml string without the empty tags</returns>
public static string CleanEmptyTags(String xml)
{
    Regex regex = new Regex(@"(\s)*<(\w)*(\s)*/>");
    return regex.Replace(xml, string.Empty);
}
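To see what this pattern does and does not remove, here is a small sketch that runs the same regular expression over a hand-written sample. The EmptyTagDemo class and the sample string are assumptions for illustration. Note that the pattern removes every self-closing element that has no attributes, including ones you may have left empty on purpose. It also does not match element names that contain a prefix such as lib:Employees, because \w does not cover the colon.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; CleanEmptyTags uses the same pattern as the article's method.
public static class EmptyTagDemo
{
    public static string CleanEmptyTags(string xml)
    {
        // Leading whitespace, '<', a name, optional whitespace, then '/>'
        Regex regex = new Regex(@"(\s)*<(\w)*(\s)*/>");
        return regex.Replace(xml, string.Empty);
    }

    // Builds a small sample with one empty tag and one tag that has attributes.
    public static string Sample()
    {
        return "<Library>\n" +
               "  <Books>\n" +
               "    <Book Title=\"Book 1\" />\n" +
               "  </Books>\n" +
               "  <Employees />\n" +
               "</Library>";
    }
}
```

Calling EmptyTagDemo.CleanEmptyTags(EmptyTagDemo.Sample()) drops the <Employees /> line together with its leading whitespace and leaves the <Book> element alone, since that element carries an attribute.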

With the method above in mind, our Serialize method would change as follows:

C#
public static string Serialize(XmlSerializer serializer,
                               Encoding encoding,
                               XmlSerializerNamespaces ns,
                               bool omitDeclaration,
                               object objectToSerialize)
{
    MemoryStream ms = new MemoryStream();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.OmitXmlDeclaration = omitDeclaration;
    settings.Encoding = encoding;
    XmlWriter writer = XmlWriter.Create(ms, settings);
    serializer.Serialize(writer, objectToSerialize, ns);
    string xml = encoding.GetString(ms.ToArray());
    xml = CleanEmptyTags(xml);
    return xml;
}

Approach 2

Call the deserialize as usual then set any empty instantiated lists (Count == 0) to null. Here is the static method and its helper method that does the job.

C#
/// <summary>
/// Deserializes the passed Xml then sets any instantiated but empty lists to null.
/// </summary>
/// <param name="serializer">XmlSerializer for the type being deserialized</param>
/// <param name="tr">Reader over the Xml to deserialize</param>
/// <param name="objectNamespace">.NET namespace of the types to inspect</param>
/// <returns>The deserialized object</returns>
public static object Deserialize(XmlSerializer serializer,
                                 TextReader tr, string objectNamespace)
{
    //  Deserialize Xml into object
    object objectToReturn = serializer.Deserialize(tr);

    //  Clean up empty lists
    CleanUpEmptyLists(objectToReturn, objectNamespace);

    return objectToReturn;
}

/// <summary>
/// Sets any empty lists in the passed object to null. 
/// If the passed object itself is a list,
/// the method returns true if it's empty and false otherwise.
/// </summary>
/// <param name="o">The object to inspect</param>
/// <param name="objectNamespace">.NET namespace of the types to inspect</param>
/// <returns>True if the passed object is an empty list; otherwise false</returns>
public static bool CleanUpEmptyLists(object o, string objectNamespace)
{
    //  Skip if the object is already null
    if (o == null)
    {
        return false;
    }

    //  Get the runtime type of the object
    Type type = o.GetType();

    //  If this is an empty list, return true so the caller sets it to null
    if (o is IList)
    {
        IList list = (IList)o;

        if (list.Count == 0)
        {
            return true;
        }
        else
        {
            foreach (object obj in list)
            {
                CleanUpEmptyLists(obj, objectNamespace);
            }
        }

        return false;
    }
    //  Ignore any objects that aren't in our namespace for perf reasons
    //  and to avoid getting errors on trying to get into every little detail
    else if (type.Namespace != objectNamespace)
    {
        return false;
    }

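    //  Loop over all properties and handle them
    foreach (PropertyInfo property in type.GetProperties())
    {
        //  Skip indexers and properties we can't both read and write
        if (!property.CanRead || !property.CanWrite ||
            property.GetIndexParameters().Length > 0)
        {
            continue;
        }

        //  Get the property value and clean up any empty lists it contains
        object propertyValue = property.GetValue(o, null);
        if (CleanUpEmptyLists(propertyValue, objectNamespace))
        {
            property.SetValue(o, null, null);
        }
    }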

    return false;
}

Using the above static method, you can now deserialize as follows:

C#
//  Deserialize the Xml file into a LibraryType object
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
LibraryType lib =
    (LibraryType)Deserialize(serializer, tr, typeof(LibraryType).Namespace);

If you know of a better solution, please let me know.
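One alternative that is sometimes used for this problem is the ShouldSerialize convention: XmlSerializer looks for a public method named ShouldSerialize followed by the member name and skips the member when that method returns false. The sketch below is not from the original source. The trimmed-down LibraryType and the ShouldSerializeSketch helper are assumptions for illustration, and the approach requires adding one such method per list to your classes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Trimmed-down, hypothetical version of LibraryType with a single list.
[XmlRoot("Library")]
public class LibraryType
{
    [XmlArray("Employees")]
    [XmlArrayItem("Employee")]
    public List<EmployeeType> Employees { get; set; }

    // Checked by XmlSerializer before writing the Employees member.
    public bool ShouldSerializeEmployees()
    {
        return Employees != null && Employees.Count > 0;
    }
}

public class EmployeeType
{
    [XmlAttribute] public string Name { get; set; }
}

// Hypothetical helper used only to try the class out.
public static class ShouldSerializeSketch
{
    public static string Serialize(LibraryType library)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
        using (StringWriter sw = new StringWriter())
        {
            serializer.Serialize(sw, library);
            return sw.ToString();
        }
    }
}
```

With this in place, a LibraryType whose Employees list is empty serializes without an <Employees /> element, while a list with at least one item is written as usual.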

Source

Download full source code on my GitHub page. You can use it to try out the above tips one by one.


This article was originally posted at http://mycodelog.com/2009/12/29/xmlserializer

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
United States
https://open-gl.com
