Introduction
This article demonstrates how to use the MSXML APIs from C++ by creating a simple XML file.
Background
I wanted to write an XML file for my project using MSXML in C++. After Googling, I found helpful links spread across various articles, which gave me a starting point. I decided to write an article that helps a beginner use MSXML and covers the basics of XML, as there is not much available on this topic for C++.
XML Basics
XML is a widely adopted data format standardized by the World Wide Web Consortium (W3C). It gives easy access to data in a structured way. More about XML can be found on Wikipedia.
MSXML APIs
The Microsoft XML APIs are independent of the development environment. Many examples of using MSXML from C# and Visual Basic can be found on the web, including on MSDN, but little is available for C++ developers.
I have created a demo project which will create an XML file having the following structure:
<?xml version="1.0" encoding="UTF-8"?>
<Parent Depth="0">
    <Child1 Depth="1">This is a child of Parent</Child1>
    <Child2 Depth="1">
        <Child3 Depth="2">
            <Child4 Depth="3">This is a child of Child3</Child4>
        </Child3>
    </Child2>
</Parent>
In MSXML, we create a root (parent) element and then insert the child elements. A node can be of several types, such as an element node or an attribute node; the available node types are documented on MSDN. You can create a node by passing the appropriate type flag, such as NODE_ELEMENT, to the createNode() function. Another way is to use the createElement() function.
Enough background; let us get to the real job. I have created a dialog-based application using Microsoft Visual Studio 2005.
Let us look at the code and understand what is happening.
Using the Code
The first thing you need to do is add these two lines in the project stdafx.h file:
#import "MSXML4.dll" rename_namespace(_T("MSXML"))
#include <msxml2.h>
I am using msxml4.dll. As of now, msxml6.dll is available. The same code was tested on MSXML6 as well.
The rename_namespace(_T("MSXML")) attribute renames the namespace to MSXML; otherwise, by default, the namespace will be MSXML2. Note that I have not used the raw_interfaces_only attribute because I want the compiler to generate the C++ smart pointer wrapper interfaces.
Now you need to call the AfxOleInit() function. The best place is the InitInstance() method of the application class.
if (!AfxOleInit())
{
    AfxMessageBox(_T("Failed to initialize OLE library"));
    return FALSE;
}
This initializes the OLE libraries, which takes care of the calls to ::CoInitialize() and ::CoUninitialize() that are necessary before any COM object can be used.
Now have a look at the code inside OnBnClickedCreatexml(). Most of the code is self-explanatory, but I will provide details wherever necessary.
void CXMLDemoDlg::OnBnClickedCreatexml()
{
    // Create the DOM document that will hold the XML being built.
    MSXML::IXMLDOMDocument2Ptr pXMLDoc;
    HRESULT hr = pXMLDoc.CreateInstance(__uuidof(DOMDocument40));
    if (FAILED(hr))
    {
        AfxMessageBox(_T("Failed to create the XML class instance"));
        return;
    }
    // Load the root element.
    if (pXMLDoc->loadXML(_T("<Parent></Parent>")) == VARIANT_FALSE)
    {
        ShowError(pXMLDoc);
        return;
    }
    MSXML::IXMLDOMElementPtr pXMLRootElem = pXMLDoc->GetdocumentElement();
    pXMLRootElem->setAttribute(_T("Depth"), _variant_t(_T("0")));
    // Insert the XML declaration just before the root element.
    MSXML::IXMLDOMProcessingInstructionPtr pXMLProcessingNode =
        pXMLDoc->createProcessingInstruction("xml", " version='1.0' encoding='UTF-8'");
    _variant_t vtObject;
    vtObject.vt = VT_DISPATCH;
    vtObject.pdispVal = pXMLRootElem;
    vtObject.pdispVal->AddRef();
    pXMLDoc->insertBefore(pXMLProcessingNode, vtObject);
    // Child1 and Child2 are children of the root element.
    MSXML::IXMLDOMElementPtr pXMLChild1 = pXMLDoc->createElement(_T("Child1"));
    pXMLChild1->setAttribute(_T("Depth"), _T("1"));
    pXMLChild1->Puttext(_T("This is a child of Parent"));
    pXMLChild1 = pXMLRootElem->appendChild(pXMLChild1);
    MSXML::IXMLDOMElementPtr pXMLChild2 = pXMLDoc->createElement(_T("Child2"));
    pXMLChild2->setAttribute(_T("Depth"), _T("1"));
    pXMLChild2 = pXMLRootElem->appendChild(pXMLChild2);
    // Child3 is a child of Child2, and Child4 is a child of Child3.
    MSXML::IXMLDOMElementPtr pXMLChild3 = pXMLDoc->createElement(_T("Child3"));
    pXMLChild3->setAttribute(_T("Depth"), _T("2"));
    pXMLChild3 = pXMLChild2->appendChild(pXMLChild3);
    MSXML::IXMLDOMElementPtr pXMLChild4 = pXMLDoc->createElement(_T("Child4"));
    pXMLChild4->setAttribute(_T("Depth"), _T("3"));
    pXMLChild4->Puttext(_T("This is a child of Child3"));
    pXMLChild4 = pXMLChild3->appendChild(pXMLChild4);
    // Load the style sheet used to indent the output.
    MSXML::IXMLDOMDocument2Ptr loadXML;
    hr = loadXML.CreateInstance(__uuidof(DOMDocument40));
    if (FAILED(hr))
    {
        // The pointer is null here, so it cannot be passed to ShowError().
        AfxMessageBox(_T("Failed to create the XML class instance"));
        return;
    }
    if (loadXML->load(_variant_t(_T("StyleSheet.xsl"))) == VARIANT_FALSE)
    {
        ShowError(loadXML);
        return;
    }
    // Create the document that receives the transformed (indented) XML.
    MSXML::IXMLDOMDocument2Ptr pXMLFormattedDoc;
    hr = pXMLFormattedDoc.CreateInstance(__uuidof(DOMDocument40));
    CComPtr<IDispatch> pDispatch;
    hr = pXMLFormattedDoc->QueryInterface(IID_IDispatch, (void**)&pDispatch);
    if (FAILED(hr))
    {
        return;
    }
    _variant_t vtOutObject;
    vtOutObject.vt = VT_DISPATCH;
    vtOutObject.pdispVal = pDispatch;
    vtOutObject.pdispVal->AddRef();
    hr = pXMLDoc->transformNodeToObject(loadXML, vtOutObject);
    // Reset the encoding in the XML declaration to UTF-8.
    MSXML::IXMLDOMNodePtr pXMLFirstChild = pXMLFormattedDoc->GetfirstChild();
    MSXML::IXMLDOMNamedNodeMapPtr pXMLAttributeMap = pXMLFirstChild->Getattributes();
    MSXML::IXMLDOMNodePtr pXMLEncodNode = pXMLAttributeMap->getNamedItem(_T("encoding"));
    pXMLEncodNode->PutnodeValue(_T("UTF-8"));
    UpdateData();
    if (sLocation.IsEmpty())
        sLocation = _T("Javed.xml");
    // Passing the string through _variant_t avoids leaking the BSTR
    // that AllocSysString() would allocate.
    hr = pXMLFormattedDoc->save(_variant_t((LPCTSTR)sLocation));
    if (FAILED(hr))
    {
        ShowError(pXMLFormattedDoc);
        return;
    }
    sLocation += _T(" created");
    AfxMessageBox(sLocation);
}
There are various steps to create a complete XML document.
MSXML::IXMLDOMDocument2Ptr pXMLDoc;
HRESULT hr = pXMLDoc.CreateInstance(__uuidof(DOMDocument40));
if (FAILED(hr))
{
    AfxMessageBox(_T("Failed to create the XML class instance"));
    return;
}
The above code instantiates the MSXML object. Note that we have not called CoInitialize(NULL) here to initialize the COM libraries, because that is taken care of by the AfxOleInit() function, which we have already called.
if (pXMLDoc->loadXML(_T("<Parent></Parent>")) == VARIANT_FALSE)
{
    ShowError(pXMLDoc);
    return;
}
The point worth mentioning here is that you have to create a starting node. The root element (the parent node in this case) is the best place to start, and the code above does exactly that by loading it with loadXML().
MSXML::IXMLDOMElementPtr pXMLRootElem = pXMLDoc->GetdocumentElement();
pXMLRootElem->setAttribute(_T("Depth"), _variant_t(_T("0")));
MSXML::IXMLDOMProcessingInstructionPtr pXMLProcessingNode =
    pXMLDoc->createProcessingInstruction("xml", " version='1.0' encoding='UTF-8'");
_variant_t vtObject;
vtObject.vt = VT_DISPATCH;
vtObject.pdispVal = pXMLRootElem;
vtObject.pdispVal->AddRef();
pXMLDoc->insertBefore(pXMLProcessingNode, vtObject);
Now we want to insert <?xml version="1.0" encoding="UTF-8"?> at the very start of the XML file, which is what the code above does. IXMLDOMProcessingInstruction is the interface for processing instructions, which describe how the XML file should be processed (encoding details, version number, and so on). Here, we have created the processing instruction node and inserted it just before the parent element.
MSXML::IXMLDOMElementPtr pXMLChild1 = pXMLDoc->createElement(_T("Child1"));
pXMLChild1->setAttribute(_T("Depth"), _T("1"));
pXMLChild1->Puttext(_T("This is a child of Parent"));
pXMLChild1 = pXMLRootElem->appendChild(pXMLChild1);
MSXML::IXMLDOMElementPtr pXMLChild2 = pXMLDoc->createElement(_T("Child2"));
pXMLChild2->setAttribute(_T("Depth"), _T("1"));
pXMLChild2 = pXMLRootElem->appendChild(pXMLChild2);
MSXML::IXMLDOMElementPtr pXMLChild3 = pXMLDoc->createElement(_T("Child3"));
pXMLChild3->setAttribute(_T("Depth"), _T("2"));
pXMLChild3 = pXMLChild2->appendChild(pXMLChild3);
MSXML::IXMLDOMElementPtr pXMLChild4 = pXMLDoc->createElement(_T("Child4"));
pXMLChild4->setAttribute(_T("Depth"), _T("3"));
pXMLChild4->Puttext(_T("This is a child of Child3"));
pXMLChild4 = pXMLChild3->appendChild(pXMLChild4);
Now it is time to create the complete XML, including all the child elements. Be careful to append each child to the correct parent, as shown above.
At this point, your XML content is done. You may want to save it to a file, so just call save().
pXMLDoc->save(_variant_t((LPCTSTR)sLocation));
But wait: if you save this and then open it in Notepad or any other plain text editor, it will look like this:
<?xml version="1.0" encoding="UTF-8"?>
<Parent Depth="0"><Child1 Depth="1">This is a child of Parent</Child1>
<Child2 Depth="1"><Child3 Depth="2">
<Child4 Depth="3">This is a child of Child3</Child4></Child3></Child2></Parent>
Line breaks were added to the above snippet to prevent scrolling.
As you can see, it is not indented as it should be: all the elements are on a single line. Obviously, you will not like this, so let us format it properly. For this, you need to transform the document with a template style sheet. A style sheet is itself an XML document, but it contains instructions that operate on your XML to produce the desired output. I am using a style sheet named StyleSheet.xsl, which is kept in the project directory.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
The code:
if (loadXML->load(_variant_t(_T("StyleSheet.xsl"))) == VARIANT_FALSE)
{
    ShowError(loadXML);
    return;
}
MSXML::IXMLDOMDocument2Ptr pXMLFormattedDoc;
hr = pXMLFormattedDoc.CreateInstance(__uuidof(DOMDocument40));
CComPtr<IDispatch> pDispatch;
hr = pXMLFormattedDoc->QueryInterface(IID_IDispatch, (void**)&pDispatch);
if (FAILED(hr))
{
    return;
}
_variant_t vtOutObject;
vtOutObject.vt = VT_DISPATCH;
vtOutObject.pdispVal = pDispatch;
vtOutObject.pdispVal->AddRef();
hr = pXMLDoc->transformNodeToObject(loadXML, vtOutObject);
Here we are loading the style sheet with the load() function and then transforming the original XML with it to get a new, formatted XML document object.
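To make the effect of the transformation concrete without involving COM, here is a minimal, library-free sketch. It is neither MSXML nor an XSLT processor: the Node struct and the openTag(), writeCompact(), writeIndented() and buildDemoTree() functions are hypothetical names invented for this illustration. The sketch builds the same Parent/Child tree and serializes it once on a single line and once with two spaces of indentation per nesting level, which is roughly the kind of layout that indent="yes" requests. Escaping of special characters is deliberately left out.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory element: a name, attributes, optional text, children.
struct Node {
    std::string name;
    std::vector<std::pair<std::string, std::string>> attrs;
    std::string text;
    std::vector<Node> children;
};

// Builds the opening tag, e.g. <Child1 Depth="1">.
std::string openTag(const Node& n) {
    std::string s = "<" + n.name;
    for (const auto& a : n.attrs)
        s += " " + a.first + "=\"" + a.second + "\"";
    return s + ">";
}

// Serializes the whole subtree on one line, like an unformatted save.
std::string writeCompact(const Node& n) {
    std::string s = openTag(n) + n.text;
    for (const auto& c : n.children)
        s += writeCompact(c);
    return s + "</" + n.name + ">";
}

// Serializes the subtree with two spaces of indentation per nesting level.
// Elements without children stay on a single line.
std::string writeIndented(const Node& n, int depth = 0) {
    const std::string pad(depth * 2, ' ');
    if (n.children.empty())
        return pad + openTag(n) + n.text + "</" + n.name + ">\n";
    std::string s = pad + openTag(n) + n.text + "\n";
    for (const auto& c : n.children)
        s += writeIndented(c, depth + 1);
    return s + pad + "</" + n.name + ">\n";
}

// Builds the same tree as the demo document, innermost element first.
Node buildDemoTree() {
    Node child4{"Child4", {{"Depth", "3"}}, "This is a child of Child3", {}};
    Node child3{"Child3", {{"Depth", "2"}}, "", {child4}};
    Node child2{"Child2", {{"Depth", "1"}}, "", {child3}};
    Node child1{"Child1", {{"Depth", "1"}}, "This is a child of Parent", {}};
    return Node{"Parent", {{"Depth", "0"}}, "", {child1, child2}};
}
```

Calling writeCompact(buildDemoTree()) gives the single-line form, while writeIndented(buildDemoTree()) gives a nested layout similar to what the style sheet produces. The exact whitespace that MSXML emits may differ from this sketch.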
MSXML::IXMLDOMNodePtr pXMLFirstChild = pXMLFormattedDoc->GetfirstChild();
MSXML::IXMLDOMNamedNodeMapPtr pXMLAttributeMap = pXMLFirstChild->Getattributes();
MSXML::IXMLDOMNodePtr pXMLEncodNode = pXMLAttributeMap->getNamedItem(_T("encoding"));
pXMLEncodNode->PutnodeValue(_T("UTF-8"));
Although I specified UTF-8 encoding when creating the XML, the formatted document produced by the transformation declares UTF-16. So I have changed the encoding back to UTF-8. It is just a matter of replacing one attribute value. This also shows how to manipulate a node's attribute.
hr = pXMLFormattedDoc->save(_variant_t((LPCTSTR)sLocation));
Finally, just save the XML to a file, and you are done.
Points of Interest
One related point: if you create a large XML file, say 10 KB in size, using UTF-16 when it contains only ASCII characters, it is a waste of resources. The same data encoded as UTF-8 will be approximately 5 KB. This is because UTF-8 uses only one byte for each ASCII character, whereas UTF-16 uses two bytes. More about UTF is available here.
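The size difference can be checked with a small sketch that does not depend on any library. The function names below are hypothetical, and the code only counts how many bytes each code point needs (ignoring any byte order mark); it does not perform the actual encoding.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to store one Unicode code point in UTF-8.
std::size_t utf8Bytes(char32_t cp) {
    if (cp < 0x80) return 1;     // ASCII
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Bytes needed to store one Unicode code point in UTF-16.
std::size_t utf16Bytes(char32_t cp) {
    return cp < 0x10000 ? 2 : 4; // a surrogate pair is needed above U+FFFF
}

// Total UTF-8 size of a string of code points, in bytes.
std::size_t utf8Size(const std::u32string& text) {
    std::size_t total = 0;
    for (char32_t cp : text)
        total += utf8Bytes(cp);
    return total;
}

// Total UTF-16 size of a string of code points, in bytes.
std::size_t utf16Size(const std::u32string& text) {
    std::size_t total = 0;
    for (char32_t cp : text)
        total += utf16Bytes(cp);
    return total;
}
```

For ASCII-only text, utf16Size() is always exactly twice utf8Size(), which matches the 10 KB versus 5 KB estimate. For text dominated by characters in the range U+0800 to U+FFFF, the relation reverses, because those characters take three bytes in UTF-8 and two in UTF-16.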
History
- 27 October 2009, First created.
Javed is a software developer (Lead). He has been working on desktop software using C++/C# since 2005.