From: Rodrigo C. <rn...@gm...> - 2007-02-14 01:40:42
|
Hello there!

About 10 months ago I started a topic on the discussion forum concerning the need for true random access and location storing in VTD-XML. Currently we only have a push/pop interface.

At the time I had no compelling reason to advise such a change, from a pure stack-oriented approach to a more flexible one. I changed the API, but my changes were not merged into the project, due to the lack of compelling reasons and a somewhat bad design.

Now, after using VTD-XML for a few months to work with huge and complex files, I have a reason: position caching.

Let me give an example, taken from a real problem I faced:

    <document>
      [...]
      [...]
      <nes>
        <ne>
          <name>XPTO</name>
          [...]complex structure describing NE[...]
          <level1>
            <level2>
              [....]
              <nice_indexing_attribute>
            </level2>
          </level1>
        </ne>
        [...]
        a bunch of NEs...
        [...]
        [...]
        [...]
        [...]
      </nes>
      [...]
      [...]
      [...]
      <tpaths>
        <tpath>
          [...]complex structure describing tpath[...]
          <level1>
            <a few more levels>
              <level4>
                [....]
                <nice_indexing_attribute>
                <pointer to nice NE attribute>
                [....]
              </level4>
          </level1>
          <level1>
            <a few more levels>
              <level4>
                [....]
                <pointer to nice NE attribute>
                [....]
              </level4>
          </level1>
        </tpath>
        [...]
        more paths...
        [...]
      </tpaths>
      [...]
    </document>

In order to navigate the file efficiently and produce interactive results, I was forced to maintain position caches for both <ne> and <tpath>, indexed by those nice, deeply nested attributes.

For example, a task that took 36 seconds using unaided navigation can now be done in 1 or 2 seconds.

I had previously changed the API to allow multiple stacks and context export but, as previously mentioned, keeping a context unrelated to a VTDNav object does not make much sense. Perhaps a better operation would be something like:

    NavContext VTDNav.getCtx();            // returns a context
    boolean VTDNav.setPos(NavContext ctx); // sets internal navigation registers from context
    VTDNav NavContext.getNav();            // gets the VTDNav object this context belongs to

The context would internally point at a VTDNav, so that they could check each other when needed. An exception could be thrown if an unrelated context is used in setPos, or simply "false" could be returned.

Additionally, contexts should support some interfaces so that they can be kept in hash tables efficiently, for example... but that's not normally a problem.

I'm currently using this kind of approach for caching and true random access with my previous interface that exported multiple stacks, but that's cumbersome, heavy and error-prone. A lighter interface like the one I'm proposing now, and better implemented, would be much better, and cleaner also.

Any comments?

--
Rodrigo Cunha |
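The three proposed operations can be sketched in a few lines of Java. This is only an illustration under stated assumptions: ToyNav is a hypothetical stand-in for VTDNav whose entire position is a small int array, and NavContext is the proposed (not existing) bookmark type. It shows a context that remembers its owner, and setPos returning false for a context taken from a different navigator.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for VTDNav (hypothetical): its whole position is an int array. */
final class ToyNav {
    private int[] registers = {0, -1, -1};   // e.g. depth plus per-level indexes

    /** Moves the cursor; a real navigator would walk the document instead. */
    void moveTo(int depth, int l1, int l2) {
        registers = new int[] {depth, l1, l2};
    }

    int[] position() {
        return registers.clone();
    }

    /** Proposed getCtx(): snapshot the current position into a context. */
    NavContext getCtx() {
        return new NavContext(this, registers);
    }

    /** Proposed setPos(): restore a position; false if the context is foreign. */
    boolean setPos(NavContext ctx) {
        if (ctx.getNav() != this) {
            return false;
        }
        registers = ctx.copyRegisters();
        return true;
    }
}

/** Immutable bookmark that remembers which navigator produced it. */
final class NavContext {
    private final ToyNav owner;
    private final int[] saved;

    NavContext(ToyNav owner, int[] registers) {
        this.owner = owner;
        this.saved = registers.clone();
    }

    /** Proposed getNav(): the navigator this context belongs to. */
    ToyNav getNav() {
        return owner;
    }

    int[] copyRegisters() {
        return saved.clone();
    }
}

class PositionCacheDemo {
    public static void main(String[] args) {
        ToyNav nav = new ToyNav();
        Map<String, NavContext> cache = new HashMap<>();

        nav.moveTo(2, 5, 1);
        cache.put("XPTO", nav.getCtx());   // bookmark keyed by an inner attribute value

        nav.moveTo(0, -1, -1);             // wander off somewhere else
        boolean ok = nav.setPos(cache.get("XPTO"));
        System.out.println(ok + " " + Arrays.toString(nav.position()));   // true [2, 5, 1]
    }
}
```

Returning false rather than throwing is just one of the two options mentioned in the proposal; an exception would work equally well in this sketch.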
From: Jimmy Z. <cra...@co...> - 2007-02-14 03:02:49
|
yes, this feature is due 2.0 coming in a few days :)

----- Original Message -----
From: "Rodrigo Cunha" <rn...@gm...>
To: <vtd...@li...>
Sent: Tuesday, February 13, 2007 5:40 PM
Subject: [Vtd-xml-users] Random Access Proposal (take 2)

> _______________________________________________
> Vtd-xml-users mailing list
> Vtd...@li...
> https://lists.sourceforge.net/lists/listinfo/vtd-xml-users |
From: Rodrigo C. <rn...@gm...> - 2007-02-14 11:20:00
|
Oh...! That's great news!

Could we have a look at the API before the release?

I might have some suggestions, who knows... :-)

Jimmy Zhang wrote:
> yes, this feature is due 2.0 coming in a few days :) |
From: Jimmy Z. <cra...@co...> - 2007-02-15 09:00:28
|
Yes, will try, but then again, there will always be a 2.1 :) |
From: Rodrigo C. <rn...@gm...> - 2007-02-15 11:19:14
|
Well, just some ideas concerning what I think should be the nature of a "context":

- As light as possible to generate, manipulate and access (so just use a simple context with minimum clutter).
- Comparable.
- Efficiently hashable (a good and fast dispersion function).
- Possible to associate with a VTDNav (so it contains a pointer to the VTDNav).
- Usable in another VTDNav (that's a tricky one, and unsafe, but it makes sense if you have several identical VTDNavs and an RMI-based system, so it should be possible, perhaps with dire warnings in the documentation).

Jimmy Zhang wrote:
> Yes, will try, but then again, there will always be a 2.1 :) |
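The "comparable" and "hashable" requirements in the list above could look roughly like this in Java. ContextKey is a hypothetical name, and treating a context as a frozen int array is an assumption made for illustration, not a description of VTD-XML internals. The ordering shown is plain lexicographic order over the stored values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical value-type context: a frozen copy of the navigation registers. */
final class ContextKey implements Comparable<ContextKey> {
    private final int[] regs;
    private final int hash;        // cached, since the content never changes

    ContextKey(int[] registers) {
        this.regs = registers.clone();
        this.hash = Arrays.hashCode(this.regs);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ContextKey
                && Arrays.equals(regs, ((ContextKey) other).regs);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    /** Lexicographic order over the registers; a shorter prefix sorts first. */
    @Override
    public int compareTo(ContextKey other) {
        int n = Math.min(regs.length, other.regs.length);
        for (int i = 0; i < n; i++) {
            if (regs[i] != other.regs[i]) {
                return Integer.compare(regs[i], other.regs[i]);
            }
        }
        return Integer.compare(regs.length, other.regs.length);
    }
}

class ContextKeyDemo {
    public static void main(String[] args) {
        Set<ContextKey> seen = new HashSet<>();
        seen.add(new ContextKey(new int[] {2, 5, 1}));
        // An equal position found later is recognised as already present.
        System.out.println(seen.contains(new ContextKey(new int[] {2, 5, 1})));

        TreeSet<ContextKey> ordered = new TreeSet<>(seen);
        ordered.add(new ContextKey(new int[] {1, 9}));
        System.out.println(ordered.first().equals(new ContextKey(new int[] {1, 9})));
    }
}
```

Caching the hash at construction keeps lookups cheap when thousands of contexts sit in one hash table, which is the usage the list has in mind.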
From: Jimmy Z. <cra...@co...> - 2007-02-19 08:27:56
|
The SourceForge shell service is down; the documentation for 2.0 is at http://www.ximpleware.com/doc/ |
From: Rodrigo C. <rn...@gm...> - 2007-02-20 20:10:58
|
Ok, so I see a new NodeRecorder.

I haven't looked at the internals of NodeRecorder yet, but I presume it's lightweight, so I can instantiate a few thousand without major trouble and keep them in my internal structures, right?

I think you should introduce two new methods into NodeRecorder:

    VTDNav NodeRecorder.getNav();
    int NodeRecorder.getPositionsCount();

Thanks,

Rodrigo |
From: Jimmy Z. <cra...@co...> - 2007-02-21 04:16:10
|
A few thousand?? Why would you do that? A few thousand of anything would slow things down...

What are you trying to accomplish? I think you may only need to instantiate one... |
From: Rodrigo C. <rn...@gm...> - 2007-02-21 10:44:57
|
Because there are a few thousand <ne> and several thousand <tpath>.

In my version, 1.6 with my patches to allow context export, that cache takes a few seconds to build, but the speedup afterwards is huge, on the order of 30x, due to the relative complexity of the original indexing.

I'll try to convert the code to VTD-XML 2.0 soon and see if the performance holds (or improves). |
From: Jimmy Z. <cra...@co...> - 2007-02-21 19:36:51
|
I will get back to you on this subject after finishing the benchmark report update... |
From: Jimmy Z. <cra...@co...> - 2007-02-25 03:40:12
|
The latest benchmark report (on version 2.0) is now live at http://vtd-xml.sf.net/benchmark1.html

The corresponding benchmark code has also been uploaded to SourceForge at http://sourceforge.net/project/showfiles.php?group_id=110612 |
From: Rodrigo C. <rn...@gm...> - 2007-02-27 18:43:40
|
Hi Jimmy!

I've tested the new random access API... it doesn't quite work the way I expected, for example:

    AutoPilot ap = new AutoPilot(vn);
    ap.bind(vn);
    ap.selectElement("ServiceID");
    NodeRecorder myContext = new NodeRecorder(vn);
    while (ap.iterate()) {
        myContext.record();
        // do something messy
        myContext.resetPointer();
    }

resetPointer also affects the context of ap, which prevents it from cycling through the values correctly.

Why don't you just export something similar to what is kept in the stack anyway? A context is so simple, you just need an efficient byte array...

When you ask for context repositioning you just have to overwrite the values in "vn", no need for anything more. A context is (should be) the equivalent of a stack position, nothing more...

I think I sent you my altered ximpleware_1.6; do you want the code so you can take another look?

This is a simple context:

    /**
     * This class is used to store a single context of the VTDNav class.
     */
    public class SimpleContext {
        private int[] buf;
        private int bufsize = 0;

        public SimpleContext(int[] in) {
            // Copying at creation time allocates a buffer of adequate
            // size, so that no further reallocation is needed later.
            if (in != null) {
                buf = in.clone();
                bufsize = in.length;
            } else {
                buf = new int[0];
                bufsize = 0;
            }
        }

        public void set(int[] in) {
            if (buf.length < in.length) {
                buf = in.clone();
            } else {
                System.arraycopy(in, 0, buf, 0, in.length);
            }
            bufsize = in.length; // must be updated in both branches
        }

        public boolean get(int[] out) {
            // Reassigning "out" would not be visible to the caller, so
            // the caller has to pass an array that is large enough.
            if (bufsize == 0 || out.length < bufsize) {
                return false;
            }
            System.arraycopy(buf, 0, out, 0, bufsize);
            return true;
        }
    } |
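One Java detail matters for any get(int[] out) style method like the one in SimpleContext: assigning a new array to the parameter only rebinds the local variable, so the caller never sees it. The small demo below (OutParamDemo and its method names are made up for illustration) contrasts rebinding with copying into the caller's array.

```java
import java.util.Arrays;

class OutParamDemo {
    /** Rebinding the parameter: the caller's array is left untouched. */
    static void fillByReassigning(int[] out) {
        out = new int[] {7, 8, 9};
    }

    /** Copying into the parameter: the caller sees the new values. */
    static void fillByCopying(int[] out) {
        int[] src = {7, 8, 9};
        System.arraycopy(src, 0, out, 0, Math.min(src.length, out.length));
    }

    public static void main(String[] args) {
        int[] a = new int[3];
        fillByReassigning(a);
        System.out.println(Arrays.toString(a));   // [0, 0, 0]

        int[] b = new int[3];
        fillByCopying(b);
        System.out.println(Arrays.toString(b));   // [7, 8, 9]
    }
}
```

For a context class this means the getter either copies into a caller-supplied array that is large enough, or returns a fresh copy as its result.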
From: Jimmy Z. <cra...@co...> - 2007-02-27 19:45:40
|
There is an example among the code examples that shows you how to use this class correctly... resetPointer is only called *after* you finish recording and *before* you start reading...

Look at the example and let me know. I thought about the possibility of creating something as part of VTDNav, and voted against it because
(1) multiple instances of NodeRecorder can be instantiated
(2) it could get pretty heavy if overused

The suggestion that you wrote seems to assume that there are only a few copies of a context; that may not be general-purpose enough for other people's needs. |
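The two-phase usage described above (record everything first, reset once, then read) can be illustrated with a toy model. ToyRecorder below is NOT the real NodeRecorder: it is a hypothetical stand-in that stores plain integer node indexes, and its method names merely mirror the ones mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy recorder (hypothetical, not the VTD-XML class): stores node indexes. */
final class ToyRecorder {
    private final List<Integer> recorded = new ArrayList<>();
    private int readPos = 0;

    /** Phase 1: remember one position. */
    void record(int nodeIndex) {
        recorded.add(nodeIndex);
    }

    /** Between the phases: rewind the read pointer to the first record. */
    void resetPointer() {
        readPos = 0;
    }

    /** Phase 2: next recorded position, or -1 when exhausted. */
    int iterate() {
        return readPos < recorded.size() ? recorded.get(readPos++) : -1;
    }
}

class RecordThenReplayDemo {
    public static void main(String[] args) {
        int[] matches = {12, 40, 97};   // pretend these came from an element scan
        ToyRecorder recorder = new ToyRecorder();

        for (int index : matches) {     // phase 1: only record inside the scan loop
            recorder.record(index);
        }

        recorder.resetPointer();        // once, after recording and before reading

        int index;
        while ((index = recorder.iterate()) != -1) {   // phase 2: replay
            System.out.println("revisit node " + index);
        }
    }
}
```

The point of the sketch is the ordering: nothing is read back inside the recording loop, so the scan that produces the positions is never disturbed.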
From: Rodrigo C. <rn...@gm...> - 2007-02-28 12:45:25
|
I understand NodeRecorder was not intended to be kept in large numbers, but I think that should be exactly the idea of a random access API: a lightweight way of keeping a bunch of bookmarks in the data structure the programmer wants, not in the structure we want.

Your API is nice for somewhat serial processing, not for true random access using pre-built hash tables, for example, or trees, or whatever. I could build a wrapper around NodeRecorder implementing a simpler API, but that would be really clumsy.

My API, while incomplete, is much simpler, and more flexible also... it's also rather light. I would like to learn about other opinions on the subject, since we are probably both too used to our own way of doing things to be impartial. |
From: Mark S. <ma...@Sc...> - 2007-02-28 16:11:56
|
Rodrigo Cunha wrote:
> I understand NodeRecorder was not intended to be kept in large numbers,
> but I think that should exactly be the idea of a random access API: a
> lightweight way of keeping a bunch of bookmarks in the data structure the
> programmer wants, not in the structure we want, or something...
>
> Your API is nice for somewhat serial processing, not for true random
> access, using pre-built hash tables, for example, or trees, or whatever.
> I could build a wrapper around NodeRecorder implementing a simpler API,
> but that would be really clumsy.
>
> My API, while incomplete, is much simpler, and more flexible also... it's
> also rather light. I would like to learn about other opinions on the
> subject, since we are probably both too used to our way of doing things
> to be impartial.

It would be most helpful to me if I could store arbitrary element indexes and start an XPath query from one of them. I would cache these indexes in a Map with key: some unique ID, value: some sort of vtd-xml node index.

For most of the applications I use XML for, this would be the only way to get acceptable performance. Ultimately, without this I would not be able to consider using vtd-xml for these apps and I would be forced to use an XML-to-object mapping tool. I've been using and helping maintain/fix a number of XML-to-object mapping tools over the years. It's been an interesting area of study for me. Please free me from the insufferable weight of those chains :-)

Cheers.

--
http://www.ScheduleWorld.com/
Free Google Calendar synchronization with Outlook, Evolution, cell phones, BlackBerry, PalmOS, Exchange, Mozilla, Thunderbird, Pocket PC/Windows Mobile. Also sync tasks, notes and contacts! WebDAV, vfreebusy, RSS, LDAP, iCalendar, iTIP, iMIP support.
|
From: Rodrigo C. <rn...@gm...> - 2007-02-28 18:40:44
|
Mark, my implementation has a bug... it doesn't really work yet, but just let me know if the API is OK for you.

We can iron out the bugs later, with Jimmy's help perhaps :-)

You can create a SimpleContext, which holds a position, with:

SimpleContext ctx = new SimpleContext(null);
// starts as an empty context

or

SimpleContext ctx = new SimpleContext(size);
// starts with the given node capacity (an int), so reallocation can be minimized during reuse

and you can use the object to hold and retrieve positions with:

navigator.setCtxFromNav(ctx);
// holds the current position in this context object

and

navigator.setNavFromCtx(ctx);
// sets the navigator pointers from this context object

The object itself has small storage requirements, only a small integer array, so you can create thousands of them and still live happily.

They are also reusable, to minimize garbage collection in case your algorithm wants to reuse them.

Does this sound OK to you, Mark?

Mark Swanson wrote:
> Rodrigo Cunha wrote:
>> I understand NodeRecorder was not intended to be kept in large numbers,
>> but I think that should exactly be the idea of a random access API: a
>> lightweight way of keeping a bunch of bookmarks in the data structure the
>> programmer wants, not in the structure we want, or something...
>>
>> Your API is nice for somewhat serial processing, not for true random
>> access, using pre-built hash tables, for example, or trees, or whatever.
>> I could build a wrapper around NodeRecorder implementing a simpler API,
>> but that would be really clumsy.
>>
>> My API, while incomplete, is much simpler, and more flexible also... it's
>> also rather light. I would like to learn about other opinions on the
>> subject, since we are probably both too used to our way of doing things
>> to be impartial.
>
> It would be most helpful to me if I could store arbitrary element
> indexes and start an XPath query from one of them. I would
> cache these indexes in a Map with key: some unique ID, value: some sort
> of vtd-xml node index.
>
> For most of the applications I use XML for, this would be the only way
> to get acceptable performance. Ultimately, without this I would not be
> able to consider using vtd-xml for these apps and I would be forced to
> use an XML-to-object mapping tool. I've been using and helping
> maintain/fix a number of XML-to-object mapping tools over the years. It's
> been an interesting area of study for me. Please free me from the
> insufferable weight of those chains :-)
>
> Cheers.
|
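A minimal sketch of how the two proposed calls would be used together. Only the method names setCtxFromNav and setNavFromCtx come from the message above; ToyNav is a hypothetical stand-in for VTDNav that tracks nothing but an int array as its "position", and ToyContext is a reduced bookmark class written for this illustration, not the SimpleContext from the thread.

```java
/** Reduced bookmark holder (hypothetical): a private copy of the navigator's position. */
class ToyContext {
    private int[] buf = new int[0];

    void set(int[] in) { buf = in.clone(); }      // copy, so later moves don't alter the bookmark
    int[] get() { return buf.clone(); }
    boolean isEmpty() { return buf.length == 0; }
}

/** Hypothetical stand-in for VTDNav: the "position" is just an int array. */
class ToyNav {
    private int[] position = {0};

    void moveTo(int... newPosition) { position = newPosition.clone(); }
    int[] currentPosition() { return position.clone(); }

    /** Store the current position in ctx. */
    void setCtxFromNav(ToyContext ctx) { ctx.set(position); }

    /** Jump back to the position stored in ctx; returns false if ctx holds nothing. */
    boolean setNavFromCtx(ToyContext ctx) {
        if (ctx.isEmpty()) {
            return false;
        }
        position = ctx.get();
        return true;
    }
}
```

Because the bookmark is an ordinary object holding its own copy of the state, any number of them can sit in a list, map, or tree chosen by the caller, which is the point being argued for in this thread.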
From: Tatu S. <cow...@ya...> - 2007-02-28 22:59:30
|
--- Mark Swanson <ma...@Sc...> wrote:

...
> It would be most helpful to me if I could index arbitrary element
> indexes and start and XPath query from one of these indexes. I would
> cache these indexes in a Map with key: some unique ID, value: some sort
> of vtd-xml node index.

Given that VTD-XML indices are, well, ints, would this be anything more than a kind of Map<String,int>? (or, a stack thereof).
That seems like a simple thing to build even outside of VTD-XML itself?

Just curious,

-+ Tatu +-
|
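A rough sketch of the map Tatu describes, built entirely outside the parser. Java generics cannot take a primitive `int`, so the values are boxed as `Integer`. The class name NodeIndexCache and its methods are hypothetical; the ints stand for whatever node index the navigator exposes, and whether one int is enough to restore a navigation position is exactly what the rest of this thread debates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical cache: unique ID -> node index, kept outside the parser. */
class NodeIndexCache {
    private final Map<String, Integer> indexById = new HashMap<>();

    /** Remember the index of the node identified by id (overwrites any older entry). */
    void put(String id, int nodeIndex) {
        indexById.put(id, nodeIndex);
    }

    /** Look up a cached index; empty if the id was never recorded. */
    OptionalInt find(String id) {
        Integer index = indexById.get(id);
        return index == null ? OptionalInt.empty() : OptionalInt.of(index);
    }

    /** Number of distinct ids currently cached. */
    int size() {
        return indexById.size();
    }
}
```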
From: Jimmy Z. <cra...@co...> - 2007-02-28 23:06:10
|
sounds good to me...

----- Original Message -----
From: "Tatu Saloranta" <cow...@ya...>
To: <vtd...@li...>
Sent: Wednesday, February 28, 2007 2:59 PM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> --- Mark Swanson <ma...@Sc...> wrote:
>
> ...
>> It would be most helpful to me if I could index arbitrary element
>> indexes and start and XPath query from one of these indexes. I would
>> cache these indexes in a Map with key: some unique ID, value: some sort
>> of vtd-xml node index.
>
> Given that VTD-XML indices are, well, ints, would this
> be anything more than a kind of Map<String,int>? (or,
> a stack thereof).
> That seems like a simple thing to build even outside
> of VTD-XML itself?
>
> Just curious,
>
> -+ Tatu +-
|
From: Rodrigo C. <rn...@gm...> - 2007-03-01 02:05:51
|
For a SimpleContext, or something similar:

The comparison operator could just compare the arrays containing the internal state of the navigator.

The hash code would be the XOR of all elements in the array. That implements Java's hashCode() and equals() in a way that is compatible with the rules the Java libraries assume.

Jimmy, I think my "SimpleContext", or whatever you want to call it, could solve all the problems of random access if properly implemented and perhaps given a few extra functions.

I think we just want something with the functionality of push() and pop() that can be extracted and kept outside, memorizing the equivalent of a single stack position, perhaps with a bit more functionality.

Jimmy Zhang wrote:
> sounds good to me...
>
> ----- Original Message -----
> From: "Tatu Saloranta" <cow...@ya...>
> To: <vtd...@li...>
> Sent: Wednesday, February 28, 2007 2:59 PM
> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>
>> --- Mark Swanson <ma...@Sc...> wrote:
>>
>> ...
>>> It would be most helpful to me if I could index arbitrary element
>>> indexes and start and XPath query from one of these indexes. I would
>>> cache these indexes in a Map with key: some unique ID, value: some sort
>>> of vtd-xml node index.
>>
>> Given that VTD-XML indices are, well, ints, would this
>> be anything more than a kind of Map<String,int>? (or,
>> a stack thereof).
>> That seems like a simple thing to build even outside
>> of VTD-XML itself?
>>
>> Just curious,
>>
>> -+ Tatu +-
|
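A small sketch of the equals()/hashCode() pair described above, so that saved contexts can serve as keys in hash-based collections. HashableContext is a hypothetical class written for this illustration; its int array stands in for the saved navigator state, whose real layout is not shown in the thread.

```java
import java.util.Arrays;

/** Hypothetical hashable bookmark; the int array stands for the saved navigator state. */
final class HashableContext {
    private final int[] state;

    HashableContext(int[] state) {
        this.state = state.clone();   // defensive copy keeps the key immutable
    }

    /** Two contexts are equal when their state arrays match element by element. */
    @Override
    public boolean equals(Object other) {
        return other instanceof HashableContext
                && Arrays.equals(state, ((HashableContext) other).state);
    }

    /** XOR of all elements, as proposed: equal arrays always give equal hash codes. */
    @Override
    public int hashCode() {
        int h = 0;
        for (int v : state) {
            h ^= v;
        }
        return h;
    }
}
```

The XOR satisfies the hashCode() contract, but it ignores element order, so {1, 2} and {2, 1} collide. `Arrays.hashCode(state)` is order-sensitive and may spread keys better in large tables.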
From: Mark S. <ma...@Sc...> - 2007-03-01 02:57:17
|
Rodrigo Cunha wrote:
> For a SimpleContext, or something similar:
>
> The comparison operator could just compare the arrays containing the
> internal state of Navigator.
>
> The hashcode would be the XOR of all elements in the array, that
> implements java hashcode and equals in a compatible way with the assumed
> rules in java libraries.
>
> Jimmy, I think my "SimpleContext" or whatever you want to call it, if
> properly implemented and perhaps with a few extra functions could solve
> all the problems of random access.
>
> I think we just want something with the functionality of push() and
> pop() but that can be extracted and kept outside, memorizing the
> equivalent of a single stack position, and perhaps with a bit more
> functionality.

It will be interesting to see how much / little info we can get away with saving and still be able to start an XPath expression from an arbitrary point.

Important: it may make the implementation easier if we make the arbitrary starting point the new document root - just for xpath evaluation purposes. I think this is perfect, actually. I don't want anything except for the node and its children - that's why I'm explicitly pointing there in the first place.

F.E.
(forgive the illegal simplified syntax..)

aaa
  bbb
  ccc    <- Index saved, new root for xpath eval.
    eee
    fff
  /ccc
  dd
  ...

I'd want to say something like this:

vtdNav.toElement(cccIndex); // reset cursor to ccc
ap.selectXPath("/ccc/*")

Cheers.
|
From: Jimmy Z. <cra...@co...> - 2007-03-01 04:28:18
|
I believe you can use a relative location path for that:
instead of using /ccc/*
you only need to say ccc
(a relative location path is basically a location path
that doesn't start from the root)

----- Original Message -----
From: "Mark Swanson" <ma...@Sc...>
To: "Rodrigo Cunha" <rn...@gm...>
Cc: <vtd...@li...>
Sent: Wednesday, February 28, 2007 6:56 PM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> Rodrigo Cunha wrote:
>> For a SimpleContext, or something similar:
>>
>> The comparison operator could just compare the arrays containing the
>> internal state of Navigator.
>>
>> The hashcode would be the XOR of all elements in the array, that
>> implements java hashcode and equals in a compatible way with the assumed
>> rules in java libraries.
>>
>> Jimmy, I think my "SimpleContext" or whatever you want to call it, if
>> properly implemented and perhaps with a few extra functions could solve
>> all the problems of random access.
>>
>> I think we just want something with the functionality of push() and
>> pop() but that can be extracted and kept outside, memorizing the
>> equivalent of a single stack position, and perhaps with a bit more
>> functionality.
>
> It will be interesting to see how much / little info we can get away
> with saving and still be able to start an XPath expression from an
> arbitrary point.
>
> Important: it may make the implementation easier if we make the
> arbitrary starting point the new document root - just for xpath
> evaluation purposes. I think this is perfect, actually. I don't want
> anything except for the node and its children - that's why I'm
> explicitly pointing there in the first place.
>
> F.E.
> (forgive the illegal simplified syntax..)
>
> aaa
>   bbb
>   ccc    <- Index saved, new root for xpath eval.
>     eee
>     fff
>   /ccc
>   dd
>   ...
>
> I'd want to say something like this:
>
> vtdNav.toElement(cccIndex); // reset cursor to ccc
> ap.selectXPath("/ccc/*")
>
> Cheers.
|
From: Mark S. <ma...@Sc...> - 2007-03-01 07:05:07
|
Jimmy Zhang wrote:
> I believe you can relative location path for that
> instead of using /ccc/*
> you only need to say ccc
> (relative location path is basically location paths
> that don't start from root)

I was speaking from the point of view where 'ccc' was the new document root, so /ccc/* would be what I want.

Note: ccc wouldn't have to be the document root, it just seems nicer to me atm to deal with mini documents aggregated into a larger document. The xpath for querying snippets of xml would be simpler - in some cases much simpler, as you don't even have to know the crud required to get to the indexed element.

Cheers.

> ----- Original Message -----
> From: "Mark Swanson" <ma...@Sc...>
> To: "Rodrigo Cunha" <rn...@gm...>
> Cc: <vtd...@li...>
> Sent: Wednesday, February 28, 2007 6:56 PM
> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>
>> Rodrigo Cunha wrote:
>>> For a SimpleContext, or something similar:
>>>
>>> The comparison operator could just compare the arrays containing the
>>> internal state of Navigator.
>>>
>>> The hashcode would be the XOR of all elements in the array, that
>>> implements java hashcode and equals in a compatible way with the assumed
>>> rules in java libraries.
>>>
>>> Jimmy, I think my "SimpleContext" or whatever you want to call it, if
>>> properly implemented and perhaps with a few extra functions could solve
>>> all the problems of random access.
>>>
>>> I think we just want something with the functionality of push() and
>>> pop() but that can be extracted and kept outside, memorizing the
>>> equivalent of a single stack position, and perhaps with a bit more
>>> functionality.
>>
>> It will be interesting to see how much / little info we can get away
>> with saving and still be able to start an XPath expression from an
>> arbitrary point.
>>
>> Important: it may make the implementation easier if we make the
>> arbitrary starting point the new document root - just for xpath
>> evaluation purposes. I think this is perfect, actually. I don't want
>> anything except for the node and its children - that's why I'm
>> explicitly pointing there in the first place.
>>
>> F.E.
>> (forgive the illegal simplified syntax..)
>>
>> aaa
>>   bbb
>>   ccc    <- Index saved, new root for xpath eval.
>>     eee
>>     fff
>>   /ccc
>>   dd
>>   ...
>>
>> I'd want to say something like this:
>>
>> vtdNav.toElement(cccIndex); // reset cursor to ccc
>> ap.selectXPath("/ccc/*")
|
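To illustrate the difference between the absolute and relative location paths discussed in the last two messages, here is a small sketch. It deliberately uses the JDK's built-in javax.xml.xpath on a DOM tree rather than VTD-XML, so it says nothing about how AutoPilot behaves; the class RelativePathDemo and the sample document are made up for the illustration. Once a context node is in hand, a relative path such as `*` is resolved against that node, with no need to spell out the route from the document root.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativePathDemo {
    static final String XML = "<aaa><bbb/><ccc><eee/><fff/></ccc><dd/></aaa>";

    /** Names of the child elements of <ccc>, found through a relative path. */
    static String childrenOfCcc() {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Absolute path: used once to reach the element we want to "bookmark".
            Node ccc = (Node) xpath.evaluate("/aaa/ccc", doc, XPathConstants.NODE);

            // Relative path: "*" is resolved against ccc, not the document root.
            NodeList kids = (NodeList) xpath.evaluate("*", ccc, XPathConstants.NODESET);

            StringBuilder names = new StringBuilder();
            for (int i = 0; i < kids.getLength(); i++) {
                if (i > 0) {
                    names.append(',');
                }
                names.append(kids.item(i).getNodeName());
            }
            return names.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath demo failed", e);
        }
    }
}
```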
From: Rodrigo C. <rn...@gm...> - 2007-03-02 20:57:53
Attachments:
SimpleContext.java
VTDNav.java
|
Mark (et al.), you can try my random access API now if you want. The code is working and has been tested while processing very large and complex XML.

As I previously said, the API can be used in a very simple way:

SimpleContext ctx = new SimpleContext();
navigator.setCtxFromNav(ctx); // hold context
// do whatever you want
navigator.setNavFromCtx(ctx); // return to the context

The code of SimpleContext is very easy to understand. The code for the functions in VTDNav is also very simple, basically a copy and paste of the code used for the stack memory.

Both hashCode() and equals() are implemented in an efficient way (untested...) so that contexts can be kept in structures that require them. The context class can also be instantiated in large numbers, since it is so lightweight.

You just need to recompile ximpleware-2.0 using my 2 files, included here as attachments.

New ideas, bug reports, etc., are welcome :-)
|
From: Jimmy Z. <cra...@co...> - 2007-03-01 04:30:33
|
I am writing an article that explains the XPath part of VTD-XML... I will get you a sneak preview, since I am using Google's doc services, which allow me to publish the content...

----- Original Message -----
From: "Mark Swanson" <ma...@Sc...>
To: "Rodrigo Cunha" <rn...@gm...>
Cc: <vtd...@li...>
Sent: Wednesday, February 28, 2007 7:51 PM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> Rodrigo Cunha wrote:
>> Nah... given what i understand about the internal structure making it
>> the new document root is not trivial, i think.
>>
>> But you could get away with:
>>
>> - A bookmark/position saving system, aka the (in)famous SimpleContext
>> class?...
>>
>> - A function evaluating:
>>
>> boolean SimpleContext.isSunOf(SimpleContext ctx); //this should be
>> simple and efficient to implement
>
> You would know better than me. I have a high level requirement to be
> able to jump to a point and do XPath on it. It doesn't have to be the
> new document root - I just thought that might be easier. If your
> SimpleContext class will help me get there then I'm all for it.
>
> Cheers.
|
From: Rodrigo C. <rn...@gm...> - 2007-03-01 03:20:56
|
Nah... from what I understand of the internal structure, making it the new document root is not trivial, I think.

But you could get away with:

- A bookmark/position saving system, aka the (in)famous SimpleContext class?...

- A function evaluating:

boolean SimpleContext.isSunOf(SimpleContext ctx); // this should be simple and efficient to implement

Regards,
Rodrigo

Mark Swanson wrote:
> Rodrigo Cunha wrote:
>> For a SimpleContext, or something similar:
>>
>> The comparison operator could just compare the arrays containing the
>> internal state of Navigator.
>>
>> The hashcode would be the XOR of all elements in the array, that
>> implements java hashcode and equals in a compatible way with the
>> assumed rules in java libraries.
>>
>> Jimmy, I think my "SimpleContext" or whatever you want to call it, if
>> properly implemented and perhaps with a few extra functions could
>> solve all the problems of random access.
>>
>> I think we just want something with the functionality of push() and
>> pop() but that can be extracted and kept outside, memorizing the
>> equivalent of a single stack position, and perhaps with a bit more
>> functionality.
>
> It will be interesting to see how much / little info we can get away
> with saving and still be able to start an XPath expression from an
> arbitrary point.
>
> Important: it may make the implementation easier if we make the
> arbitrary starting point the new document root - just for xpath
> evaluation purposes. I think this is perfect, actually. I don't want
> anything except for the node and its children - that's why I'm
> explicitly pointing there in the first place.
>
> F.E.
> (forgive the illegal simplified syntax..)
>
> aaa
>   bbb
>   ccc    <- Index saved, new root for xpath eval.
>     eee
>     fff
>   /ccc
>   dd
>   ...
>
> I'd want to say something like this:
>
> vtdNav.toElement(cccIndex); // reset cursor to ccc
> ap.selectXPath("/ccc/*")
>
> Cheers.
|
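One way the proposed ancestry test might be sketched. This rests on an assumption that is not confirmed anywhere in the thread: that a saved context stores one node index per depth level, root first. Under that assumption, a node lies below another exactly when the other's path is a proper prefix of its own. PathContext and isDescendantOf are hypothetical names chosen for this illustration.

```java
/** Hypothetical bookmark storing one node index per depth level, root first. */
final class PathContext {
    private final int[] path;

    PathContext(int... path) {
        this.path = path.clone();
    }

    /** True if this node lies strictly below ancestor, i.e. ancestor's path is a proper prefix. */
    boolean isDescendantOf(PathContext ancestor) {
        if (ancestor.path.length >= path.length) {
            return false;               // same depth or deeper: cannot be an ancestor
        }
        for (int i = 0; i < ancestor.path.length; i++) {
            if (ancestor.path[i] != path[i]) {
                return false;           // paths diverge above the ancestor's level
            }
        }
        return true;
    }
}
```

The check costs at most one pass over the shorter array. For a strict parent/child test, one would additionally require the two path lengths to differ by exactly one.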