
Merge/Synchronize XML Files

Author
3 Nov 2006 11:45 AM
Meelis Lilbok
Hi, is there any fast solution for synchronizing two XML files?

Let's say I have two XML files, 1.xml and 2.xml.

1.xml contains

<test>
    <t id="1">Hello</t>
    <t id="2">World</t>
    <t id="3">Good bye!</td>
</test>

2.xml contains
<test>
    <t id="1">Hello</t>
    <t id="2">World</t>
</test>

After synchronizing, 2.xml must look like this
<test>
    <t id="1">Hello</t>
    <t id="2">World</t>
    <t id="3">Good bye!</td>
</test>

At the moment I use
    For Each
    Next
and this is too slow if the file contains about 1000 <t> nodes.

Regards,
Meelis

Author
3 Nov 2006 11:50 AM
Meelis Lilbok
Oops.
That should be </t>, not </td>; it was a typo ;)

Meelis


Author
3 Nov 2006 12:22 PM
Rick
Why don't you try writing an XSLT transformation that combines all the
nodes in <test> and then eliminates the duplicates? I think you could either
combine the two xml documents prior to the transform, or perhaps by
importing them from within the XSLT.

You can then do an xmlDoc.DocumentElement.Transform...

Rick
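
As a rough illustration of the XSLT idea, here is a minimal sketch rather than a tested solution. It assumes .NET 2.0 or later (XslCompiledTransform), that 1.xml and 2.xml sit in the working directory, and that a node counts as a duplicate when its id attribute already exists in 2.xml. The parameter name sourceUri, the module name and the output file 2.merged.xml are made up for the example.

```vbnet
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module XsltMergeSketch
    ' Copies every <t> of the input file, then appends the <t> elements of the
    ' document at $sourceUri whose id is not already present in the input.
    Private Const Stylesheet As String =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" &
        "  <xsl:param name='sourceUri'/>" &
        "  <xsl:output method='xml' indent='yes'/>" &
        "  <xsl:template match='/test'>" &
        "    <xsl:variable name='known' select='t/@id'/>" &
        "    <test>" &
        "      <xsl:copy-of select='t'/>" &
        "      <xsl:copy-of select='document($sourceUri)/test/t[not(@id = $known)]'/>" &
        "    </test>" &
        "  </xsl:template>" &
        "</xsl:stylesheet>"

    Sub Main()
        ' document() is given an absolute file URI so no base URI is needed.
        Dim sourceUri As String = New Uri(Path.GetFullPath("1.xml")).AbsoluteUri

        Dim xslt As New XslCompiledTransform()
        Using styleReader As XmlReader = XmlReader.Create(New StringReader(Stylesheet))
            ' The document() function is disabled by default; enable it explicitly.
            xslt.Load(styleReader, New XsltSettings(True, False), New XmlUrlResolver())
        End Using

        Dim args As New XsltArgumentList()
        args.AddParam("sourceUri", "", sourceUri)

        ' Write to a separate file so 2.xml is not read and overwritten at once.
        Using source As XmlReader = XmlReader.Create("2.xml"),
              result As XmlWriter = XmlWriter.Create("2.merged.xml", xslt.OutputSettings)
            xslt.Transform(source, args, result, New XmlUrlResolver())
        End Using
    End Sub
End Module
```

The stylesheet keeps whatever text 2.xml already has for an id and only pulls in ids it lacks, so the two files never need to be concatenated first.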

Author
3 Nov 2006 12:34 PM
Meelis Lilbok
Sorry, I'm not very familiar with XSLT.
Where should I begin? Any samples?

Meelis


Author
3 Nov 2006 1:06 PM
Anthony Jones
I suspect that a loop using .nextNode on the input, the target, or both
per iteration will suit your needs. Although XSL may still outperform a
script-based language at this, even 1000 nodes shouldn't take an excessive
amount of time.

Your example doesn't show why you don't simply replace 2.xml with 1.xml.
More detail showing the wider set of cases is needed to arrive at an
appropriate solution.

If id="1" were missing from 1.xml, should it be deleted from 2.xml?
If id="2" in 2.xml contained the word 'kosmos', should it contain 'World'
after the merge because it was replaced by id="2" from 1.xml?
In the real world, is t a complex element? If so, do you intend to merge the
contents of the t elements with the same id from each XML file, or simply
replace the t in 2.xml with the one in 1.xml?
Author
3 Nov 2006 1:25 PM
Meelis Lilbok
> If id="1" were missing from 1.XML should it be deleted from 2.xml?
> If id="2" in 2.xml contained the word 'kosmos' should it contain 'world'
> after the merge because it was replaced by id="2" from 1.xml?
> In the real world is t a complex element if so do you intend to merge the
> contents of ts of the same id from each xml file or simply replace the t
> in
> 2.xml with the one in 1.xml?

Yes, I can't simply replace it, because in one file the node with id="2" can
have "Hello" while in the second file id="2" may have "Hallo".

I'll try to explain a little bit more :)

File 1.xml is a "template" file containing strings/texts in the native
language (Estonian).
With my application, users can translate the strings into their own language.
When a user launches the translator application:

1) the template is loaded from the server
2) the application checks whether the template file contains new ids (nodes)
and adds those nodes to the user file.


[template.xml]
<test>
    <t id="1">Tere</t>
    <t id="2">Maailm</t>
    <t id="3">Head aega!</t>
</test>

[user.xml]
<test>
    <t id="1">Hallo</t>
    <t id="2">World</t>
</test>


After synchronizing user.xml must look like this
<test>
    <t id="1">Hello</t>
    <t id="2">World</t>
    <t id="3">Head aega!</t>
</test>
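
The two steps above can be sketched in VB.NET along these lines. This is only an illustration under stated assumptions: the module and function names are invented, the documents are loaded from inline strings instead of the server, and a HashSet (which needs .NET 3.5 or later) is used so each id is looked up once instead of being searched for in a nested For Each.

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Module SyncSketch
    ' Appends to userDoc every <t> from templateDoc whose id is not yet present.
    ' Returns the number of nodes that were added.
    Function AddMissingNodes(templateDoc As XmlDocument, userDoc As XmlDocument) As Integer
        ' Index the ids already in the user file so each lookup is cheap.
        Dim knownIds As New HashSet(Of String)()
        For Each node As XmlElement In userDoc.SelectNodes("/test/t")
            knownIds.Add(node.GetAttribute("id"))
        Next

        Dim added As Integer = 0
        For Each node As XmlElement In templateDoc.SelectNodes("/test/t")
            ' HashSet.Add returns True only when the id was not known before.
            If knownIds.Add(node.GetAttribute("id")) Then
                ' ImportNode is needed because the node belongs to another document.
                userDoc.DocumentElement.AppendChild(userDoc.ImportNode(node, True))
                added += 1
            End If
        Next
        Return added
    End Function

    Sub Main()
        Dim templateDoc As New XmlDocument()
        templateDoc.LoadXml("<test><t id=""1"">Tere</t><t id=""2"">Maailm</t><t id=""3"">Head aega!</t></test>")
        Dim userDoc As New XmlDocument()
        userDoc.LoadXml("<test><t id=""1"">Hallo</t><t id=""2"">World</t></test>")

        Console.WriteLine(AddMissingNodes(templateDoc, userDoc))
        Console.WriteLine(userDoc.OuterXml)
    End Sub
End Module
```

With the sample data only id 3 is missing from the user file, so a single node is imported and the existing translations are left untouched.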




Meelis
Author
3 Nov 2006 2:50 PM
Anthony Jones
So if I've understood it correctly, all you really need is to add the new
nodes that have appeared at the end of 1.xml to the end of 2.xml? That sounds
a little simplistic, so I probably haven't understood your requirement, but
if that is it, then:


Option Explicit

Dim xml1 : Set xml1 = LoadDOM("g:\temp\xml1.xml")
Dim xml2 : Set xml2 = LoadDOM("g:\temp\xml2.xml")
Dim oLast, oMatch, oNode

' Take the last node of xml2 and find the node with the same id in xml1.
Set oLast = xml2.documentElement.lastChild
Set oMatch = xml1.selectSingleNode("//t[@id='" & oLast.getAttribute("id") & "']")

' Copy every node that follows it in xml1 to the end of xml2.
For Each oNode In oMatch.selectNodes("following-sibling::t")
    xml2.documentElement.appendChild oNode.cloneNode(True)
Next

xml2.save "g:\temp\xml2.xml"

' Loads a file synchronously into an MSXML 3.0 DOM with XPath selection enabled.
Function LoadDOM(sFile)
    Set LoadDOM = CreateObject("MSXML2.DOMDocument.3.0")
    LoadDOM.async = False
    LoadDOM.setProperty "SelectionLanguage", "XPath"
    LoadDOM.load sFile
End Function


This loops over only the new nodes at the end of xml1 that are not already in
xml2.


Author
3 Nov 2006 3:20 PM
Andrew Morton
Meelis Lilbok wrote:
> Yes i cant simly replace beacuse in one file node with id="2" can have
> "Hello"
> in second file id="2" may have "Hallo"
>
> I try to explain little bit more :=
>
> [template.xml]
> <test>
>    <t id="1">Tere</t>
>    <t id="2">Maailm</t>
>    <t id="3">Head aega!</t>
> </test>
>
> [user.xml]
> <test>
>    <t id="1">Hallo</t>
>    <t id="2">World</t>
> </test>
>
>
> After synchronizing user.xml must look like this
> <test>
>    <t id="1">Hello</t>

I assume you meant to type Hallo...

>    <t id="2">World</t>
>    <t id="3">Head aega!</t>
> </test>
>

If you get rid of all the XML noise, you will be left with name-value pairs
(see DictionaryEntry in the help).

"1"    "Hallo"
"2"    "World"


If you then put the template DictionaryEntry items into a Hashtable (q.v.)
followed by the values extracted from the user.xml file *but taking note of
this from the Hashtable.Add method help*:

"The Item property can also be used to add new elements by setting the value
of a key that does not exist in the Hashtable. For example:
myCollection["myNonexistentKey"] = myValue. However, if the specified key
already exists in the Hashtable, setting the Item property overwrites the
old value. In contrast, the Add method does not modify existing elements."

then you will have a hashtable containing the merged data.
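
A minimal sketch of that merge order, using the sample strings from this thread; the module name is invented and the values are typed in by hand instead of being extracted from the XML files:

```vbnet
Imports System.Collections

Module HashtableMergeSketch
    Sub Main()
        Dim merged As New Hashtable()

        ' 1) Template entries go in first (the Estonian originals).
        merged.Add("1", "Tere")
        merged.Add("2", "Maailm")
        merged.Add("3", "Head aega!")

        ' 2) User translations are assigned through the Item property,
        '    which overwrites the template text for ids that already exist.
        merged("1") = "Hallo"
        merged("2") = "World"

        ' Hashtable makes no promise about enumeration order, so read by key.
        For Each id As String In New String() {"1", "2", "3"}
            Console.WriteLine("{0} = {1}", id, merged(id))
        Next
    End Sub
End Module
```

This prints 1 = Hallo, 2 = World and 3 = Head aega!, one per line: ids the user has translated keep the translation, and ids that exist only in the template keep the template text.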


You can then take the data from the hashtable, add back in all the XML cr^W
stuff as you Append it to a StringBuilder, then write back to disk. The
whole operation should take about as long as it takes to double-click a
mouse button.

' Not checked, but this is how you'd rebuild the XML.
' Needs Imports System.Collections and Imports System.Text.
Dim sb As New StringBuilder("<test>" & vbLf)
For Each thing As DictionaryEntry In yourHashtable
    ' Values containing <, > or & would have to be escaped before appending.
    sb.AppendFormat("  <t id=""{0}"">{1}</t>" & vbLf, thing.Key, thing.Value)
Next
sb.Append("</test>")
' Now write the file.
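
If the texts may contain characters such as & or <, or the nodes should come out in id order (a Hashtable does not keep one), an XmlWriter over a sorted collection is one possible alternative to the StringBuilder. This is a sketch only: it assumes the ids are integers, and the module name, the output file name user.merged.xml and the "Fish & chips" value are made up for illustration.

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Module WriteMergedSketch
    ' Writes the merged id/text pairs as <test><t id="...">...</t></test>.
    Sub WriteMerged(fileName As String, merged As SortedDictionary(Of Integer, String))
        Dim settings As New XmlWriterSettings()
        settings.Indent = True
        Using writer As XmlWriter = XmlWriter.Create(fileName, settings)
            writer.WriteStartElement("test")
            ' SortedDictionary enumerates its entries in ascending key order.
            For Each entry As KeyValuePair(Of Integer, String) In merged
                writer.WriteStartElement("t")
                writer.WriteAttributeString("id", entry.Key.ToString())
                writer.WriteString(entry.Value) ' escapes <, > and & as needed
                writer.WriteEndElement()
            Next
            writer.WriteEndElement()
        End Using
    End Sub

    Sub Main()
        Dim merged As New SortedDictionary(Of Integer, String)()
        merged(3) = "Head aega!"
        merged(1) = "Hallo"
        merged(2) = "Fish & chips" ' hypothetical text that needs escaping
        WriteMerged("user.merged.xml", merged)
    End Sub
End Module
```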

Any use?

Andrew