Hi,

Can you please help me compare two XML files using VBA and write the differences to a third XML file?

Thanks...
Comments
Sergey Alexandrovich Kryukov 7-Oct-11 15:16pm    
What's the ultimate goal of this activity? As it stands, the question does not have an exact meaning. Knowing the particular goal would allow for a more useful solution, but what "comparison" means is not defined.
--SA
Member 8291513 9-Oct-11 0:08am    
Thank you. Yes, I am looking for exactly that kind of output from the two XML files, but using VBA. I didn't find a solution, though :(

1 solution

What the "difference" means depends heavily on how you define it, in particular on the XML schema the files use and on how it maps onto the schema of the "difference" file.

First of all, you need to find the difference not between the files, but between the logical structures of the XML. The same logical structure can be represented as text in many slightly different ways. This is one of the key ideas of XML: to abstract the logical structure away from the physical entities. In particular, you could parse both files into DOM and compare the DOMs, not the files. But how do you express the result of the comparison?
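
To illustrate comparing parsed structures rather than text, here is a minimal sketch. It uses Python's standard `xml.etree.ElementTree` as a stand-in, since the question asks for VBA; the same idea applies with any DOM parser available there. The helper `same_tree` and the rule of ignoring surrounding whitespace are assumptions made for this example, not part of any standard notion of XML equality.

```python
import xml.etree.ElementTree as ET

# Two different texts: one uses a CDATA section, the other escaped characters.
FILE_1 = "<top><a><b><![CDATA[<p>Same thing</p>]]></b></a></top>"
FILE_2 = "<top><a><b>&lt;p&gt;Same thing&lt;/p&gt;</b></a></top>"


def same_tree(x, y):
    """Hypothetical helper: compare tag, attributes, trimmed text and children.

    Children are compared in document order; text that follows a child
    element (its "tail") is ignored in this sketch.
    """
    if x.tag != y.tag or x.attrib != y.attrib:
        return False
    if (x.text or "").strip() != (y.text or "").strip():
        return False
    if len(x) != len(y):
        return False
    return all(same_tree(cx, cy) for cx, cy in zip(x, y))


root_1 = ET.fromstring(FILE_1)
root_2 = ET.fromstring(FILE_2)

texts_differ = FILE_1 != FILE_2          # the raw texts are not equal
trees_match = same_tree(root_1, root_2)  # the parsed structures are
print(texts_differ, trees_match)
```

Whether whitespace, attribute order or element order should count as a difference is exactly the kind of decision the comparison has to define up front.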

Now, about mapping the differences onto some schema. I'll give you a simple example:

File #1:
XML
<?xml version="1.0" encoding="UTF-8"?>
<top>
  <a>
    <b><![CDATA[<p>Same thing</p>]]></b>
    <b>1</b>
    <b>2</b>
  </a>
</top>


File #2:
XML
<?xml version="1.0" encoding="UTF-8"?>
<top>
  <a>
    <b>&lt;p&gt;Same thing&lt;/p&gt;</b>
    <b>2</b>
    <b>3</b>
  </a>
</top>


Some data is in both files, some is in the first file only, and some is in the second file only.
Let's think about how this can be represented. For example, like this:
XML
<?xml version="1.0" encoding="UTF-8"?>
<top>
   <a>
      <both_files>
         <b>&lt;p&gt;Same thing&lt;/p&gt;</b>
         <b>2</b>
      </both_files>
      <first_file_only>
         <b>1</b>
      </first_file_only>
      <second_file_only>
         <b>3</b>
      </second_file_only>
   </a>
</top>
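
One way such a file could be produced is sketched below in Python (again a stand-in for VBA). It assumes the made-up layout used in this example (`both_files`, `first_file_only`, `second_file_only`), assumes that only the text of the `b` elements under `top/a` matters, and treats repeated values as a multiset. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

FILE_1 = """<top><a>
  <b><![CDATA[<p>Same thing</p>]]></b><b>1</b><b>2</b>
</a></top>"""
FILE_2 = """<top><a>
  <b>&lt;p&gt;Same thing&lt;/p&gt;</b><b>2</b><b>3</b>
</a></top>"""


def b_values(xml_text):
    """Text of every <b> under top/a, in document order."""
    return [b.text or "" for b in ET.fromstring(xml_text).findall("a/b")]


def add_group(parent, name, values):
    """Append <name> to parent, holding one <b> per value."""
    group = ET.SubElement(parent, name)
    for value in values:
        ET.SubElement(group, "b").text = value


def diff_document(xml_1, xml_2):
    first, second = b_values(xml_1), b_values(xml_2)

    # Match each value of the first file against an unused copy in the second.
    remaining = Counter(second)
    both, first_only = [], []
    for value in first:
        if remaining[value] > 0:
            remaining[value] -= 1
            both.append(value)
        else:
            first_only.append(value)

    # Values of the second file that were not matched above.
    matched = Counter(both)
    second_only = []
    for value in second:
        if matched[value] > 0:
            matched[value] -= 1
        else:
            second_only.append(value)

    top = ET.Element("top")
    a = ET.SubElement(top, "a")
    add_group(a, "both_files", both)
    add_group(a, "first_file_only", first_only)
    add_group(a, "second_file_only", second_only)
    return top


result = diff_document(FILE_1, FILE_2)
print(ET.tostring(result, encoding="unicode"))
```

The serializer escapes the markup characters in the text when writing the result, so the shared value is stored as text rather than as nested elements.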


First, note that the first b element is identical in both files, despite the different ways it is written (a CDATA section in one file, escaped characters in the other).
Now, what could the schema of the "difference" file be? It is not automatically defined by the schema of the files under comparison. My sample uses an arbitrary syntax I made up, one of many possible ways. And I did not even touch such a difficult problem as the ordering of elements. How do you express a difference in ordering? Well, that's feasible, too, but…
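
To show why ordering needs its own decision, here is a small Python sketch (a stand-in for VBA). The two inputs are made up for this illustration: they hold the same `b` values in a different order, so the answer depends on whether the children are treated as a sequence or as an unordered collection.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical inputs: same values, different order.
ORDER_1 = "<top><a><b>1</b><b>2</b></a></top>"
ORDER_2 = "<top><a><b>2</b><b>1</b></a></top>"


def texts(xml_text):
    """Text of every <b> under top/a, in document order."""
    return [b.text for b in ET.fromstring(xml_text).findall("a/b")]


seq_1, seq_2 = texts(ORDER_1), texts(ORDER_2)

equal_as_sequence = seq_1 == seq_2                    # order matters
equal_as_multiset = Counter(seq_1) == Counter(seq_2)  # order ignored
print(equal_as_sequence, equal_as_multiset)
```

Neither answer is the right one in general; which applies depends on what the element order means in the schema being compared.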

The problem does not have one general solution. Did I make it clear?

[EDIT]

Some links, in case you want to understand better what I mean when I question the "mapping":
http://en.wikipedia.org/wiki/Bijection,
http://en.wikipedia.org/wiki/Data_mapping.

[END EDIT]

So the question as it stands does not have an exact meaning.

--SA
 
Comments
André Kraak 7-Oct-11 15:20pm    
My 5. Good explanation of the problems with "difference".
Sergey Alexandrovich Kryukov 7-Oct-11 15:35pm    
Thank you, André.
--SA
Simon Bang Terkildsen 7-Oct-11 16:36pm    
+5!
Sergey Alexandrovich Kryukov 7-Oct-11 16:37pm    
Thank you, Simon.
--SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


