|
From: Sun, V. <vs...@am...> - 2007-07-01 05:26:14
|
I am wondering what the impact of frequent updates/inserts is for VTD-XML:

* Does XMLModifier directly manipulate the underlying byte[]? How does it handle inserting a new data fragment? Does it create a new memory block to combine both of them, or just track the new block with the token? Is there a performance benchmark comparing update operations with other implementations?
* How will these newly inserted fragments impact overall navigation performance?
* How does the update operation impact existing bookmarks?
* Can we associate multiple VTDNav instances with the VTDGen? Regardless, how will the update operation impact an existing VTDNav (the currentIndex, for example)?

Thanks.

--------------------------
Cheers,
Vanessa
|
From: Jimmy Z. <cra...@co...> - 2007-07-01 18:14:51
|
> Does XMLModifier directly manipulate the underlying byte[]?

XMLModifier does *not* directly overwrite the underlying bytes. It applies the changes and writes *new* XML data to an OutputStream.

> How does it handle inserting a new data fragment?

To insert a new data fragment, you just navigate the cursor to the right spot in the XML document and drop it in.

> Create a new memory block to combine both of them, or just track the new block with the token?

Yes, that is right, except that you don't explicitly create a new memory block; instead you create an OutputStream (e.g. ByteArrayOutputStream).

> Is there a performance benchmark on update operation comparison with the other implementation?

http://vtd-xml.sourceforge.net/benchmark1.html covers update + XPath + indexing, so update is part of it.

> Can we associate multiple VTDNav with the VTDGen? Regardless, how will the update operation impact the existing VTDNav (the currentIndex, for example)?

You can use VTDGen to create multiple VTDNav instances; think of VTDGen as a DOM parser and of VTDNav as a DOM tree. This has no effect on updates, as an update is associated with a VTDNav, not with the VTDGen.

> How will these newly inserted fragments impact the overall navigation performance?

No effect, since inserted fragments are not really inserted: XMLModifier only records the insertion, and the new XML is created only when you ask XMLModifier to output it.

> How does the update operation impact the existing bookmarks?

No effect, as the update operation's effect isn't immediate.

----- Original Message -----
From: Sun, Vanessa
To: vtd...@li...
Sent: Saturday, June 30, 2007 10:26 PM
Subject: [Vtd-xml-users] update-operation impact, if any?

_______________________________________________
Vtd-xml-users mailing list
Vtd...@li...
https://lists.sourceforge.net/lists/listinfo/vtd-xml-users
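The deferred behaviour described in this reply (edits are only recorded, the original bytes stay untouched, and new XML appears only on output) can be pictured with a toy model. The sketch below is not VTD-XML code and does not use its API; the class `DeferredInsertSketch` and its methods are hypothetical names chosen for illustration, and byte offsets stand in for the cursor position.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Toy model of deferred edits. This is NOT VTD-XML's implementation;
// every name in this class is hypothetical.
class DeferredInsertSketch {
    private final byte[] original; // never mutated
    private final TreeMap<Integer, byte[]> pending = new TreeMap<>(); // offset -> fragment

    DeferredInsertSketch(byte[] original) {
        this.original = original;
    }

    // Record a fragment to emit just before the byte at `offset`.
    // Nothing is copied or shifted yet. In this toy, a second insert
    // at the same offset replaces the first one.
    void insertAt(int offset, String fragment) {
        if (offset < 0 || offset > original.length) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        pending.put(offset, fragment.getBytes(StandardCharsets.UTF_8));
    }

    // Stream the original bytes in order, splicing in each recorded fragment.
    void output(OutputStream out) throws IOException {
        int pos = 0;
        for (Map.Entry<Integer, byte[]> e : pending.entrySet()) {
            out.write(original, pos, e.getKey() - pos);
            out.write(e.getValue());
            pos = e.getKey();
        }
        out.write(original, pos, original.length - pos);
    }

    // Convenience wrapper: one insert, rendered to a String.
    static String render(String xml, int offset, String fragment) {
        DeferredInsertSketch doc =
                new DeferredInsertSketch(xml.getBytes(StandardCharsets.UTF_8));
        doc.insertAt(offset, fragment);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            doc.output(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Offset 7 is the position just before "</a>".
        System.out.println(render("<a><b/></a>", 7, "<c/>"));
    }
}
```

In this model, anything that indexes into the original buffer stays valid after `insertAt`, which mirrors why the reply says navigation and bookmarks are unaffected until output is requested.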
|
From: Sun, V. <vs...@am...> - 2007-07-01 23:44:45
|
I see.

Our use case is an XML processing pipeline: each stage modifies the document and then sends it down to the next stage for further processing. If I understand your comments correctly, this means each stage will have to produce a new byte[] and VTDNav in order for the next stage to see and navigate the whole document.

We are currently using a DOM tree; we just keep adding new data to the tree and pass the tree through the pipeline. Sure, it is slow (that's why we are looking for an alternative), but intuitively it should still be more efficient for partial updates than recreating the whole document at each stage. Am I missing something?

--------------------------
Cheers,
Vanessa
|
From: Jimmy Z. <cra...@co...> - 2007-07-02 01:11:12
|
It is not much different from DOM, albeit VTD-XML is more efficient. When you send a DOM downstream, you still need to serialize it; for VTD-XML, you just call XMLModifier's output(). The following stage will still need to parse the incoming XML, so you have to

1. parse
2. add changes
3. generate output

whether using DOM or VTD-XML. Where is the difference?
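The three steps listed above can be sketched for the DOM case using only the JDK's built-in XML APIs. This is a minimal illustration under stated assumptions, not code from the thread: the class `DomStageSketch`, the method `runStage`, and the element names in the usage note are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// One pipeline stage that receives serialized XML and emits serialized XML.
// Hypothetical names; a sketch of the parse / change / output cycle only.
class DomStageSketch {
    static byte[] runStage(byte[] xmlIn, String newElementName) {
        try {
            // 1. parse the incoming bytes into a tree
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlIn));

            // 2. add changes: append an empty child to the root element
            doc.getDocumentElement().appendChild(doc.createElement(newElementName));

            // 3. generate output: serialize the whole tree again
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("stage failed", e);
        }
    }
}
```

Chaining two stages, for example `runStage(runStage(input, "audit"), "shipping")`, repeats the full parse and serialize cycle at each hop, which is the cost being compared in this message.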
|
From: Tatu S. <cow...@ya...> - 2007-07-02 16:45:34
|
--- Jimmy Zhang <cra...@co...> wrote:
> It is not much different from DOM, albeit VTD-XML's
> more efficient...
> when you send DOM downstream, still need to
> serialize...
> for VTD-XML, you just call XMLModifier's output()...
Not necessarily; more often it's all within a single
process, and the tree model just gets passed along. If
so, no additional serialization or parsing is needed.

I think it is fair to say that different models are
better suited to different use cases, and that heavy
modifications are among the hardest use cases for
extractive parsers to implement efficiently (as
opposed to read-only access, which suits them best).

Regarding the original problem: it is of course possible
to move from basic DOM to other Java tree
alternatives. Also, most XML processing overhead might
not come directly from the DOM model itself: it is good
to profile to see whether it has to do with other
processing (XSLT if it's being used, XPath, naive
business logic that does too much traversal).
-+ Tatu +-
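The single-process case described above, where the tree is handed from stage to stage without serialization, might look like the following. It is a sketch using the JDK's DOM API; `InProcessPipelineSketch`, `appendStage`, `run`, and `newDoc` are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// In-process pipeline: every stage mutates and returns the same Document.
// Hypothetical names; no parsing or serialization happens between stages.
class InProcessPipelineSketch {
    // A stage that appends one empty element to the root.
    static UnaryOperator<Document> appendStage(String name) {
        return doc -> {
            doc.getDocumentElement().appendChild(doc.createElement(name));
            return doc;
        };
    }

    // Run the stages in order, passing the tree by reference.
    static Document run(Document doc, List<UnaryOperator<Document>> stages) {
        for (UnaryOperator<Document> stage : stages) {
            doc = stage.apply(doc);
        }
        return doc;
    }

    // Build an empty document with a single root element.
    static Document newDoc(String rootName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            doc.appendChild(doc.createElement(rootName));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each stage here pays only for its own change; the trade-off is that all stages must live in one JVM and share one mutable tree.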
|
|
From: Jimmy Z. <cra...@co...> - 2007-07-02 17:34:51
|
OK, I think the assumption here is that everything happens within a single process. But pipelining is usually defined more broadly, as passing XML data across process boundaries (even across the network, e.g. SOA components) between multiple programs, like Unix IPC. In that case, it is usually not possible to pass a DOM tree around without re-serializing.