From: Mark S. <ma...@Sc...> - 2007-02-28 16:11:56
|
Rodrigo Cunha wrote:
> I understand NodeRecorder was not intended to be kept in large numbers,
> but I think that is exactly what a random access API should be: a
> lightweight way of keeping a bunch of bookmarks in the data structure the
> programmer wants, not in the structure we want, or something...
>
> Your API is nice for somewhat serial processing, not for true random
> access using pre-built hash tables, for example, or trees, or whatever.
> I could build a wrapper around NodeRecorder implementing a simpler API,
> but that would be really clumsy.
>
> My API, while incomplete, is much simpler, and more flexible too... it's
> also rather light. I would like to hear other opinions on the
> subject, since we are probably both too used to our own way of doing things
> to be impartial.

It would be most helpful to me if I could store arbitrary element indexes and start an XPath query from one of them. I would cache these indexes in a Map with key: some unique ID, value: some sort of vtd-xml node index.

For most of the applications I use XML for, this would be the only way to get acceptable performance. Ultimately, without this I would not be able to consider using vtd-xml for these apps and I would be forced to use an XML-to-object mapping tool.

I've been using and helping maintain/fix a number of XML-to-object mapping tools over the years. It's been an interesting area of study for me. Please free me from the insufferable weight of those chains :-)

Cheers.

--
http://www.ScheduleWorld.com/
Free Google Calendar synchronization with Outlook, Evolution, cell phones, BlackBerry, PalmOS, Exchange, Mozilla, Thunderbird, Pocket PC/Windows Mobile. Also sync tasks, notes and contacts! WebDAV, vfreebusy, RSS, LDAP, iCalendar, iTIP, iMIP support.
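The caching idea in this message (a Map from a unique ID to a saved node position) can be sketched in plain Java. This is only an illustration under stated assumptions: `BookmarkCache` is a hypothetical helper, not part of VTD-XML, and the four-int snapshot is a made-up stand-in for whatever position data the library would export.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not a VTD-XML class): maps an application ID to a saved position snapshot. */
final class BookmarkCache {
    private final Map<String, int[]> bookmarks = new HashMap<>();

    /** Stores a private copy, so later changes to the caller's array cannot corrupt the cache. */
    void put(String id, int[] snapshot) {
        bookmarks.put(id, snapshot.clone());
    }

    /** Returns a fresh copy of the snapshot, or null if the ID was never bookmarked. */
    int[] get(String id) {
        int[] saved = bookmarks.get(id);
        return saved == null ? null : saved.clone();
    }

    int size() {
        return bookmarks.size();
    }

    public static void main(String[] args) {
        BookmarkCache cache = new BookmarkCache();
        int[] cursor = {2, 0, 17, 42};   // made-up snapshot layout, for illustration only
        cache.put("svc-1001", cursor);
        cursor[3] = 99;                  // the live cursor moves on; the bookmark is unaffected
        System.out.println(Arrays.toString(cache.get("svc-1001")));
    }
}
```

Copying on both `put` and `get` costs one small array allocation per call; it is a design choice that keeps cached bookmarks immune to accidental mutation.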
From: Rodrigo C. <rn...@gm...> - 2007-02-28 12:45:25
|
I understand NodeRecorder was not intended to be kept in large numbers, but I think that is exactly what a random access API should be: a lightweight way of keeping a bunch of bookmarks in the data structure the programmer wants, not in the structure we want, or something...

Your API is nice for somewhat serial processing, not for true random access using pre-built hash tables, for example, or trees, or whatever. I could build a wrapper around NodeRecorder implementing a simpler API, but that would be really clumsy.

My API, while incomplete, is much simpler, and more flexible too... it's also rather light. I would like to hear other opinions on the subject, since we are probably both too used to our own way of doing things to be impartial.

Jimmy Zhang wrote:
> There is an example in the example code that shows you how to use this
> class correctly...
> resetPointer is only called *after* you finish recording and *before*
> you start reading...
> Look at the example and let me know. I thought about the possibility of
> creating something as part of VTDNav, and decided against it because
> (1) multiple instances of NodeRecorder can be instantiated
> (2) it could get pretty heavy if overused
>
> The suggestion that you wrote seems to assume that there are only a few
> copies of context; that may not be general-purpose enough for other people's
> needs
> ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...>
> To: <vtd...@li...>
> Sent: Tuesday, February 27, 2007 10:42 AM
> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>
> <snip>
>
> _______________________________________________
> Vtd-xml-users mailing list
> Vtd...@li...
> https://lists.sourceforge.net/lists/listinfo/vtd-xml-users
From: Jimmy Z. <cra...@co...> - 2007-02-27 19:45:40
|
There is an example in the example code that shows you how to use this class correctly... resetPointer is only called *after* you finish recording and *before* you start reading...

Look at the example and let me know. I thought about the possibility of creating something as part of VTDNav, and decided against it because
(1) multiple instances of NodeRecorder can be instantiated
(2) it could get pretty heavy if overused

The suggestion that you wrote seems to assume that there are only a few copies of context; that may not be general-purpose enough for other people's needs.

----- Original Message -----
From: "Rodrigo Cunha" <rn...@gm...>
To: <vtd...@li...>
Sent: Tuesday, February 27, 2007 10:42 AM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> Hi Jimmy!
>
> I've tested the new random access API... it doesn't quite work the way I
> expected, for example:
>
>     AutoPilot ap = new AutoPilot(vn);
>     ap.bind(vn);
>     ap.selectElement("ServiceID");
>     NodeRecorder myContext = new NodeRecorder(vn);
>     while (ap.iterate()) {
>         myContext.record();
>         // do something messy
>         myContext.resetPointer();
>     }
>
> resetPointer also affects ap context precluding it from correctly
> cycling the values.
>
> <snip>
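The two-phase protocol described in this reply (record everything first, call resetPointer once, then read) can be illustrated with a toy model. This is a sketch only: `ToyRecorder` is a hypothetical stand-in that stores plain integers, and it does not reproduce how the real NodeRecorder interacts with a VTDNav cursor.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a position recorder; an illustration only, not VTD-XML's NodeRecorder. */
final class ToyRecorder {
    private final List<Integer> positions = new ArrayList<>();
    private int readPointer = 0;

    /** Phase 1: remember a position. */
    void record(int position) {
        positions.add(position);
    }

    /** Between the phases: rewind so that reading starts at the first recorded position. */
    void resetPointer() {
        readPointer = 0;
    }

    /** Phase 2: return the next recorded position, or -1 when none are left. */
    int iterate() {
        return readPointer < positions.size() ? positions.get(readPointer++) : -1;
    }

    public static void main(String[] args) {
        ToyRecorder rec = new ToyRecorder();
        for (int pos : new int[] {4, 9, 15}) {
            rec.record(pos);            // phase 1: record while scanning
        }
        rec.resetPointer();             // once, after recording is finished
        int pos;
        while ((pos = rec.iterate()) != -1) {
            System.out.println("visit " + pos);   // phase 2: replay the recorded positions
        }
    }
}
```

In this model, calling `resetPointer()` inside the recording loop would simply rewind the read pointer each time, which is why the reset belongs between the two phases rather than inside either of them.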
From: Rodrigo C. <rn...@gm...> - 2007-02-27 19:40:01
|
Hi!

Here are the ximpleware_2.0 files I've changed to make my very simple random navigation API available.

A very simple and useless example of usage:

    // Or pass an array of the size you like, but null works just fine
    SimpleContext myContext = new SimpleContext(null);
    while (ap.iterate()) {
        vn.setCtxFromNav(myContext);
        // do something messy
        vn.setNavFromCtx(myContext);
    }

Of course a nicer example might include keeping large numbers of SimpleContexts in hash tables, trees, etc.

Jimmy Zhang wrote:
> The latest benchmark reports (on Version 2.0) is now live
> at
> http://vtd-xml.sf.net/benchmark1.html
>
> The corresponding benchmark code also was uploaded
> to the sourceforge at
>
> http://sourceforge.net/project/showfiles.php?group_id=110612
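Keeping contexts in hash tables or trees, as suggested at the end of this message, requires value-based equality, hashing and ordering. A minimal sketch, assuming a context is just an int array: `ContextKey` is a hypothetical wrapper, not a class from VTD-XML or from the patch discussed here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable wrapper that makes an int[] context usable as a map key. */
final class ContextKey implements Comparable<ContextKey> {
    private final int[] values;

    ContextKey(int[] values) {
        this.values = values.clone();   // private copy keeps the key immutable
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ContextKey && Arrays.equals(values, ((ContextKey) o).values);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(values); // content-based, consistent with equals
    }

    /** Lexicographic order; on a shared prefix the shorter context sorts first. */
    @Override
    public int compareTo(ContextKey other) {
        int n = Math.min(values.length, other.values.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(values[i], other.values[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(values.length, other.values.length);
    }

    public static void main(String[] args) {
        Map<ContextKey, String> labels = new HashMap<>();
        labels.put(new ContextKey(new int[] {2, 5, 3}), "first ServiceID");
        // A different array object with the same content finds the same entry.
        System.out.println(labels.get(new ContextKey(new int[] {2, 5, 3})));
    }
}
```

Because the class also implements `Comparable`, the same keys would work in a `TreeMap` without a separate comparator.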
From: Rodrigo C. <rn...@gm...> - 2007-02-27 18:43:40
|
Hi Jimmy!

I've tested the new random access API... it doesn't quite work the way I expected, for example:

    AutoPilot ap = new AutoPilot(vn);
    ap.bind(vn);
    ap.selectElement("ServiceID");
    NodeRecorder myContext = new NodeRecorder(vn);
    while (ap.iterate()) {
        myContext.record();
        // do something messy
        myContext.resetPointer();
    }

resetPointer also affects the ap context, preventing it from cycling through the values correctly.

Why don't you just export something similar to what is kept in the stack anyway? A context is so simple, you just need an efficient byte array...

When you ask for context repositioning you just have to overwrite the values in "vn", no need for anything more. A context is (should be) the equivalent of a stack position, nothing more...

I think I sent you my altered ximpleware_1.6, do you want to look at the code again?

This is a simple context:

    /**
     * This class is used to store a single context of VTDNav class.
     */
    public class SimpleContext {
        private int[] buf;
        private int bufsize = 0;

        public SimpleContext(int[] in) {
            // This allows both allocation during creation and allocating
            // an adequate buffer size so that no further reallocation is
            // needed in the future.
            if (in != null) {
                buf = in.clone();
                bufsize = in.length;
            } else {
                buf = new int[0];
                bufsize = 0;
            }
        }

        public void set(int[] in) {
            if (buf.length < in.length) {
                // Buffer too small: replace it with a copy of the input.
                buf = in.clone();
            } else {
                System.arraycopy(in, 0, buf, 0, in.length);
            }
            // Record the stored length on both paths.
            bufsize = in.length;
        }

        public boolean get(int[] out) {
            // Assigning a new array to the parameter would not reach the
            // caller, so the caller must pass an array with room for
            // bufsize values.
            if (bufsize == 0 || out.length < bufsize) {
                return false;
            }
            System.arraycopy(buf, 0, out, 0, bufsize);
            return true;
        }
    }

Jimmy Zhang wrote:
> The latest benchmark reports (on Version 2.0) is now live
> at
> http://vtd-xml.sf.net/benchmark1.html
>
> The corresponding benchmark code also was uploaded
> to the sourceforge at
>
> http://sourceforge.net/project/showfiles.php?group_id=110612
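The claim in this message, that repositioning only needs to overwrite the navigator's own values, can be sketched with a toy navigator. Everything here is an assumption for illustration: `ToyNav` and its four-int layout are hypothetical and do not reflect VTDNav's actual internal state.

```java
import java.util.Arrays;

/** Toy navigator whose entire position is one int array; hypothetical, not the VTDNav class. */
final class ToyNav {
    // Made-up layout: context[0] is the depth, the rest hold one index per level (-1 = unused).
    private final int[] context = {0, -1, -1, -1};

    /** Moves the cursor; at most three level indexes fit in this toy layout. */
    void moveTo(int depth, int... indexes) {
        if (indexes.length > context.length - 1) {
            throw new IllegalArgumentException("too many levels for this toy");
        }
        context[0] = depth;
        Arrays.fill(context, 1, context.length, -1);
        System.arraycopy(indexes, 0, context, 1, indexes.length);
    }

    /** Export: the caller receives an independent copy of the position. */
    int[] saveContext() {
        return context.clone();
    }

    /** Import: overwrite the live position with a previously saved one. */
    void restoreContext(int[] saved) {
        if (saved.length != context.length) {
            throw new IllegalArgumentException("wrong context size");
        }
        System.arraycopy(saved, 0, context, 0, context.length);
    }

    int[] current() {
        return context.clone();
    }

    public static void main(String[] args) {
        ToyNav nav = new ToyNav();
        nav.moveTo(2, 5, 3);
        int[] bookmark = nav.saveContext();
        nav.moveTo(1, 8);                 // "do something messy"
        nav.restoreContext(bookmark);     // jump straight back
        System.out.println(Arrays.toString(nav.current()));
    }
}
```

Restoring is a single array copy, which is the property that makes keeping thousands of such bookmarks plausible in this model.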
From: Jimmy Z. <cra...@co...> - 2007-02-25 03:40:12
|
The latest benchmark report (on Version 2.0) is now live at
http://vtd-xml.sf.net/benchmark1.html

The corresponding benchmark code has also been uploaded to SourceForge at
http://sourceforge.net/project/showfiles.php?group_id=110612
From: Jimmy Z. <cra...@co...> - 2007-02-21 19:36:51
|
I will get back to you on this subject after finishing the benchmark report update...

----- Original Message -----
From: "Rodrigo Cunha" <rn...@gm...>
Cc: <vtd...@li...>
Sent: Wednesday, February 21, 2007 2:44 AM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> Because there are a few thousand <ne> and several thousand <tpath>.
>
> In my version, 1.6 with my patches to allow Context export, that cache
> takes a few seconds to build, but the speedup afterwards is huge, on the
> order of 30x, due to the relative complexity of the original indexing.
>
> I'll try to convert the code into VTD-2.0 soon and see if the
> performance holds (or improves).
>
> Jimmy Zhang wrote:
>> A few thousand?? Why would you do that?
>> A few thousand of anything would slow things down...
>>
>> What are you trying to accomplish? I think you may only
>> need to instantiate one...
>> ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...>
>> Cc: <vtd...@li...>
>> Sent: Tuesday, February 20, 2007 12:10 PM
>> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>>
>>> Ok, so I see a new NodeRecorder.
>>>
>>> I haven't looked at the internals of NodeRecorder yet, but I presume it's
>>> lightweight, so I can instantiate a few thousand without major trouble
>>> and keep them in my internal structures, right?
>>>
>>> I think you should introduce two new methods into NodeRecorder:
>>>
>>> VTDNav NodeRecorder.getNav();
>>>
>>> int NodeRecorder.getPositionsCount();
>>>
>>> Thanks,
>>>
>>> Rodrigo
>>>
>>> Jimmy Zhang wrote:
>>>> the source forge shell service is down, the document for 2.0 is at
>>>> http://www.ximpleware.com/doc/
>>>> ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...>
>>>> To: <vtd...@li...>
>>>> Sent: Thursday, February 15, 2007 3:19 AM
>>>> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>>>>
>>>>> Well, just some ideas concerning what I think should be the nature
>>>>> of a "context":
>>>>>
>>>>> - As light as possible to generate, manipulate and access (so just
>>>>> use a simple context with minimum clutter).
>>>>> - Comparable.
>>>>> - Hashable efficiently (good and fast dispersion function).
>>>>> - Possible to associate with VTDNav (so contains a pointer to VTDNav).
>>>>> - Usable in another VTDNav (that's a tricky one, and unsafe, but makes
>>>>> sense if you have various equal VTDNavs and a RMI-based system, so it
>>>>> should be possible despite perhaps including dire warnings in the
>>>>> documentation).
>>>>>
>>>>> Jimmy Zhang wrote:
>>>>>
>>>>> Yes, will try, but then again, there will always be a 2.1 :)
From: Mark S. <ma...@Sc...> - 2007-02-21 14:51:11
|
Jimmy Zhang wrote:
> That is a nice feature that can already be accomplished using what is in
> VTD-XML 2.0...
> Moreover, since the representation of VTD-XML's node is inherently
> persistent...
> it can be included as part of the VTD+XML index, but it is a very
> advanced feature
> ..... and you guys are way ahead now that I am feeling behind already ...

LOL - don't sweat it Jimmy. vtd-xml is inspirational. Looking forward to an article on this whenever you get the time.

Cheers.

--
http://www.ScheduleWorld.com/
From: Rodrigo C. <rn...@gm...> - 2007-02-21 10:44:57
|
Because there are a few thousand <ne> and several thousand <tpath>.

In my version, 1.6 with my patches to allow Context export, that cache takes a few seconds to build, but the speedup afterwards is huge, on the order of 30x, due to the relative complexity of the original indexing.

I'll try to convert the code into VTD-2.0 soon and see if the performance holds (or improves).

Jimmy Zhang wrote:
> A few thousand?? Why would you do that?
> A few thousand of anything would slow things down...
>
> What are you trying to accomplish? I think you may only
> need to instantiate one...
> ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...>
> Cc: <vtd...@li...>
> Sent: Tuesday, February 20, 2007 12:10 PM
> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>
> <snip>
From: Jimmy Z. <cra...@co...> - 2007-02-21 08:04:35
|
That is a nice feature that can already be accomplished using what is in VTD-XML 2.0... Moreover, since the representation of VTD-XML's node is inherently persistent... it can be included as part of the VTD+XML index, but it is a very advanced feature ..... and you guys are way ahead now that I am feeling behind already ...

----- Original Message -----
From: "Mark Swanson" <ma...@Sc...>
To: "Rodrigo Cunha" <rn...@gm...>
Cc: <vtd...@li...>
Sent: Tuesday, February 20, 2007 9:58 PM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> <snip>
>> In order to navigate the file efficiently and produce interactive
>> results I was forced to maintain position caches for both <ne> and
>> <tpath> indexed by those nice very-inner attributes.
>>
>> For example, a task that took 36 seconds using unhelped navigation can
>> now be done in 1 or 2 seconds.
>
> Interesting.
>
>> Any comments?
>
> I like it. I have a project coming up that is going to require ripping
> through the same 1mb-5mb xml documents a lot. All of the required
> queries will be able to start from a unique position
> (nice_indexing_attribute) and will not need to traverse more than a
> dozen elements on average.
>
> If I could cache the starting positions in a HashMap and force the
> VTD-XML xpath query to start at a specific position that would save the
> xpath query engine from having to walk through a lot of unnecessary data.
>
> Just thinking out loud. I'm not sure if I understood if this is coming
> in 2.0 or not.
>
> Cheers.
>
> --
> http://www.ScheduleWorld.com/
From: Jimmy Z. <cra...@co...> - 2007-02-21 07:59:16
|
Mark, there is no easy answer to Rodrigo's question. I think a dedicated article would help explain some of my thoughts on this ...

Jimmy

----- Original Message -----
From: "Mark Swanson" <ma...@Sc...>
To: "Jimmy Zhang" <cra...@co...>
Cc: <vtd...@li...>
Sent: Tuesday, February 20, 2007 9:48 PM
Subject: Re: [Vtd-xml-users] vtd-xml 2.0 is here!

> Jimmy Zhang wrote:
>> Mark, the index feature has a short summary at
>> http://vtd-xml.sourceforge.net/persistence.html
>>
>> It basically allows users to parse XML once and use
>> it for as often as they would like ... It also can be
>> viewed as a native XML index, or a binary XML
>> format back-compatible with XML itself...
>
> Ok, that's easy enough to understand. Thanks.
>
> I was confused with the position/index caching in the previous thread.
>
> Cheers.
>
> --
> http://www.ScheduleWorld.com/
From: Mark S. <ma...@Sc...> - 2007-02-21 05:58:25
|
<snip>
> In order to navigate the file efficiently and produce interactive
> results I was forced to maintain position caches for both <ne> and
> <tpath> indexed by those nice very-inner attributes.
>
> For example, a task that took 36 seconds using unhelped navigation can
> now be done in 1 or 2 seconds.

Interesting.

> Any comments?

I like it. I have a project coming up that is going to require ripping through the same 1 MB to 5 MB XML documents a lot. All of the required queries will be able to start from a unique position (nice_indexing_attribute) and will not need to traverse more than a dozen elements on average.

If I could cache the starting positions in a HashMap and force the VTD-XML XPath query to start at a specific position, that would save the XPath query engine from having to walk through a lot of unnecessary data.

Just thinking out loud. I'm not sure whether this is coming in 2.0 or not.

Cheers.

--
http://www.ScheduleWorld.com/
From: Mark S. <ma...@Sc...> - 2007-02-21 05:48:31
|
Jimmy Zhang wrote:
> Mark, the index feature has a short summary at
> http://vtd-xml.sourceforge.net/persistence.html
>
> It basically allows users to parse XML once and use
> it for as often as they would like ... It also can be
> viewed as a native XML index, or a binary XML
> format back-compatible with XML itself...

Ok, that's easy enough to understand. Thanks.

I was confused by the position/index caching in the previous thread.

Cheers.

--
http://www.ScheduleWorld.com/
From: Jimmy Z. <cra...@co...> - 2007-02-21 04:20:00
|
The C, C# and Java versions, both light and full, are now released. The benchmark code will also be released in a few days, and new benchmark reports will come out at that time.

If you downloaded 2.0 yesterday, I advise you to download it again, as the example directory now contains code showing how to use NodeRecorder. For those using the C version, there is a parseFile method that will make coding a lot easier.

Articles will also be coming out soon on the XPath design and uses, as well as the new indexing feature.

----- Original Message -----
From: "Rodrigo Cunha" <rn...@gm...>
Cc: <vtd...@li...>
Sent: Tuesday, February 20, 2007 12:10 PM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> Ok, so I see a new NodeRecorder.
>
> I haven't looked at the internals of NodeRecorder yet, but I presume it's
> lightweight, so I can instantiate a few thousand without major trouble
> and keep them in my internal structures, right?
>
> I think you should introduce two new methods into NodeRecorder:
>
> VTDNav NodeRecorder.getNav();
>
> int NodeRecorder.getPositionsCount();
>
> Thanks,
>
> Rodrigo
>
> <snip>
From: Jimmy Z. <cra...@co...> - 2007-02-21 04:16:10
|
A few thousand?? Why would you do that? A few thousand of anything would slow things down... What are you trying to accomplish? I think you may only need to instantiate one... ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...> Cc: <vtd...@li...> Sent: Tuesday, February 20, 2007 12:10 PM Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2) > Ok, so I see a new NodeRecorder. > > I haven't looked at the internals of NodeRecorder yet, but I presume it's > lightweight, so I can instantiate a few thousand without major trouble > and keep them in my internal structures, right? > > I think you should introduce two new methods into NodeRecorder: > > VTDNav NodeRecorder.getNav(); > > int NodeRecorder.getPositionsCount(); > > Thanks, > > Rodrigo |
From: Rodrigo C. <rn...@gm...> - 2007-02-20 20:10:58
|
Ok, so I see a new NodeRecorder. I haven't looked at the internals of NodeRecorder yet, but I presume it's lightweight, so I can instantiate a few thousand without major trouble and keep them in my internal structures, right? I think you should introduce two new methods into NodeRecorder: VTDNav NodeRecorder.getNav(); int NodeRecorder.getPositionsCount(); Thanks, Rodrigo Jimmy Zhang wrote: > the source forge shell service is down, the document for 2.0 is at > http://www.ximpleware.com/doc/ |
From: Jimmy Z. <cra...@co...> - 2007-02-20 08:28:52
|
Mark, the index feature has a short summary at http://vtd-xml.sourceforge.net/persistence.html It basically allows users to parse XML once and use it as often as they would like... It can also be viewed as a native XML index, or a binary XML format backward compatible with XML itself... There will be updates on the VTD-XML site that explain this in more detail... currently SourceForge is down... Jimmy ----- Original Message ----- From: "Mark Swanson" <ma...@Sc...> To: <vtd...@li...> Sent: Monday, February 19, 2007 6:47 PM Subject: Re: [Vtd-xml-users] vtd-xml 2.0 is here! > Jimmy Zhang wrote: >> VTD-XML 2.0 is here! >> >> Among the features introduced in this version is the indexing feature >> that >> is simple, general-purpose and easy to use... >> >> We are still working to update the benchmark to include indexing >> performances ... > > I see the indexing example: > > yet I'm unsure of what the consequences/benefits are of using this. Is > there a paragraph somewhere that explains the index feature in more > detail? Maybe it could be added to the 2.0 javadocs? > > I read the thread on position caching. > > Cheers. |
From: Mark S. <ma...@Sc...> - 2007-02-20 02:47:55
|
Jimmy Zhang wrote: > VTD-XML 2.0 is here! > > Among the features introduced in this version is the indexing feature that > is simple, general-purpose and easy to use... > > We are still working to update the benchmark to include indexing > performances ... I see the indexing example: yet I'm unsure of what the consequences/benefits are of using this. Is there a paragraph somewhere that explains the index feature in more detail? Maybe it could be added to the 2.0 javadocs? I read the thread on position caching. Cheers. -- http://www.ScheduleWorld.com/ Free Google Calendar synchronization with Outlook, Evolution, cell phones, BlackBerry, PalmOS, Exchange, Mozilla, Thunderbird, Pocket PC/Windows Mobile. Also sync tasks, notes and contacts! WebDAV, vfreebusy, RSS, LDAP, iCalendar, iTIP, iMIP support. |
From: Jimmy Z. <cra...@co...> - 2007-02-20 01:22:07
|
VTD-XML 2.0 is here! Among the features introduced in this version is the indexing feature that is simple, general-purpose and easy to use... We are still working to update the benchmark to include indexing performances ... http://sourceforge.net/project/showfiles.php?group_id=110612 |
From: Jimmy Z. <cra...@co...> - 2007-02-19 08:27:56
|
The SourceForge shell service is down; the documentation for 2.0 is at http://www.ximpleware.com/doc/ ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...> To: <vtd...@li...> Sent: Thursday, February 15, 2007 3:19 AM Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2) > Well, just some ideas concerning what I think should be the nature of a > "context": > > - As light as possible to generate, manipulate and access (so just use a > simple context with minimum clutter). > - Comparable. > - Hashable efficiently (good and fast dispersion function). > - Possible to associate with VTDNav (so contains a pointer to VTDNav). > - Usable in another VTDNav (that's a tricky one, and unsafe, but makes > sense if you have several identical VTDNavs and an RMI-based system, so it > should be possible despite perhaps including dire warnings in the > documentation). > > Jimmy Zhang wrote: > > Yes, will try, but then again, there will always be a 2.1 :) |
From: Rodrigo C. <rn...@gm...> - 2007-02-15 11:19:14
|
Well, just some ideas concerning what I think should be the nature of a "context":

- As light as possible to generate, manipulate and access (so just use a simple context with minimum clutter).
- Comparable.
- Hashable efficiently (good and fast dispersion function).
- Possible to associate with VTDNav (so contains a pointer to VTDNav).
- Usable in another VTDNav (that's a tricky one, and unsafe, but makes sense if you have several identical VTDNavs and an RMI-based system, so it should be possible despite perhaps including dire warnings in the documentation).

Jimmy Zhang wrote: > Yes, will try, but then again, there will always be a 2.1 :) |
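The properties listed above (light, comparable, hashable, tied to a navigator, optionally reusable with another navigator) could look roughly like the following in Java. This is only a sketch under assumptions: NodeContext, its depth and tokenIndex fields, and rebind are hypothetical names invented for illustration, and the navigator is held as a plain Object rather than a real VTDNav. Nothing here is part of the VTD-XML API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "context" value type; NOT a VTD-XML class.
// It stores two ints plus a reference to the navigator that produced it.
final class NodeContext implements Comparable<NodeContext> {
    private final Object nav;      // stand-in for a VTDNav reference
    private final int depth;       // assumed: nesting depth of the node
    private final int tokenIndex;  // assumed: position of the node in document order

    NodeContext(Object nav, int depth, int tokenIndex) {
        this.nav = nav;
        this.depth = depth;
        this.tokenIndex = tokenIndex;
    }

    Object getNav() { return nav; }

    // "Usable in another VTDNav": copies the position onto a different navigator.
    // Unsafe unless both navigators were built from the same document.
    NodeContext rebind(Object otherNav) {
        return new NodeContext(otherNav, depth, tokenIndex);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NodeContext)) return false;
        NodeContext c = (NodeContext) o;
        return nav == c.nav && depth == c.depth && tokenIndex == c.tokenIndex;
    }

    @Override
    public int hashCode() {
        // Cheap mix of the three fields; identity hash ties the key to one navigator.
        return 31 * (31 * System.identityHashCode(nav) + depth) + tokenIndex;
    }

    @Override
    public int compareTo(NodeContext other) {
        // Document order; only meaningful for contexts of the same navigator.
        int byToken = Integer.compare(tokenIndex, other.tokenIndex);
        return byToken != 0 ? byToken : Integer.compare(depth, other.depth);
    }
}

final class NodeContextDemo {
    public static void main(String[] args) {
        Object nav = new Object();
        Map<NodeContext, String> names = new HashMap<>();
        names.put(new NodeContext(nav, 2, 10), "XPTO");
        // An equal context built later finds the same entry.
        System.out.println(names.get(new NodeContext(nav, 2, 10))); // prints XPTO
    }
}
```

Keeping the type immutable is what makes it safe as a hash key; the ordering ignores the navigator, so sorting contexts from different navigators together would not be meaningful.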
From: Jimmy Z. <cra...@co...> - 2007-02-15 09:00:28
|
Yes, will try, but then again, there will always be a 2.1 :) ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...> To: <vtd...@li...> Sent: Wednesday, February 14, 2007 3:19 AM Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2) > Oh...! That's great news! > > Could we have a look at the API before the release? > > I might have some suggestions, who knows... :-) > > Jimmy Zhang wrote: >> yes, this feature is due 2.0 coming in a few days :) |
From: Rodrigo C. <rn...@gm...> - 2007-02-14 11:20:00
|
Oh...! That's great news! Could we have a look at the API before the release? I might have some suggestions, who knows... :-) Jimmy Zhang wrote: > yes, this feature is due 2.0 coming in a few days :) |
From: Jimmy Z. <cra...@co...> - 2007-02-14 03:02:49
|
Yes, this feature is due in 2.0, coming in a few days :) ----- Original Message ----- From: "Rodrigo Cunha" <rn...@gm...> To: <vtd...@li...> Sent: Tuesday, February 13, 2007 5:40 PM Subject: [Vtd-xml-users] Random Access Proposal (take 2) |
From: Rodrigo C. <rn...@gm...> - 2007-02-14 01:40:42
|
Hello there!

About 10 months ago I started a topic on the discussion forum concerning the need for true random access and location storing in VTD-XML. Currently we only have a pop/push interface.

At the time I had no compelling reason to advise such a change, from a pure stack-oriented approach to a more flexible one. I changed the API, but my changes were not merged into the project, due to the lack of a compelling reason and a somewhat bad design.

Now, after using VTD-XML for a few months to work with huge and complex files, I have a reason: position caching.

Let me give an example, taken from a real problem I faced:

<document>
  [...]
  [...]
  <nes>
    <ne>
      <name>XPTO</name>
      [...]complex structure describing NE[...]
      <level1>
        <level2>
          [....]
          <nice_indexing_attribute>
        </level2>
      </level1>
    </ne>
    [...]
    a bunch of NEs...
    [...]
    [...]
    [...]
    [...]
  </nes>
  [...]
  [...]
  [...]
  <tpaths>
    <tpath>
      [...]complex structure describing tpath[...]
      <level1>
        <a few more levels>
          <level4>
            [....]
            <nice_indexing_attribute>
            <pointer to nice NE attribute>
            [....]
          </level4>
      </level1>
      <level1>
        <a few more levels>
          <level4>
            [....]
            <pointer to nice NE attribute>
            [....]
          </level4>
      </level1>
    </tpath>
    [...]
    more paths...
    [...]
  </tpaths>
  [...]
</document>

In order to navigate the file efficiently and produce interactive results, I was forced to maintain position caches for both <ne> and <tpath>, indexed by those nice deeply nested attributes.

For example, a task that took 36 seconds using unassisted navigation can now be done in 1 or 2 seconds.

I had previously changed the API to allow multiple stacks and context export, but as previously mentioned, keeping a context unrelated to a VTDNav object does not make much sense. Perhaps a better operation would be something like:

  NavContext VTDNav.getCtx(); // sends back a context
  boolean VTDNav.setPos(NavContext ctx); // sets internal navigation registers from context
  VTDNav NavContext.getNav(); // gets the VTDNav object this context belongs to

The context would internally point at a VTDNav, so that they could check each other when needed. An exception could be thrown if an unrelated context is passed to setPos, or simply "false" could be returned.

Additionally, contexts should support some interfaces so that they can be kept in hash tables efficiently, for example... but that's not normally a problem.

I'm currently using this kind of approach for caching and true random access with my previous interface that exported multiple stacks, but that's cumbersome, heavy and error-prone. A lighter interface like the one I'm proposing now, better implemented, would be much better and cleaner.

Any comments?

-- Rodrigo Cunha |
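The proposed getCtx/setPos/getNav trio, together with the position cache it is meant to enable, can be sketched as follows. This is a minimal illustration under stated assumptions, not VTD-XML code: ToyNav is a hypothetical stand-in for VTDNav that walks a flat array of keys, and NavContext is the proposed (not existing) bookmark type. The variant shown returns false for an unrelated context rather than throwing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for VTDNav: a cursor over a flat array of node keys.
final class ToyNav {
    private final String[] elements;
    private int cursor = 0;

    ToyNav(String... elements) {
        if (elements.length == 0) throw new IllegalArgumentException("need at least one element");
        this.elements = elements.clone();
    }

    // Moves to the next element; returns false at the end, like a failed navigation call.
    boolean next() {
        if (cursor + 1 >= elements.length) return false;
        cursor++;
        return true;
    }

    String current() { return elements[cursor]; }

    // Proposed getCtx(): hand back a lightweight bookmark of the current position.
    NavContext getCtx() { return new NavContext(this, cursor); }

    // Proposed setPos(): restore the cursor, refusing contexts from another navigator.
    boolean setPos(NavContext ctx) {
        if (ctx.getNav() != this) return false;
        cursor = ctx.index();
        return true;
    }
}

// Proposed context: just a position plus a back-pointer to its navigator.
final class NavContext {
    private final ToyNav nav;
    private final int index;

    NavContext(ToyNav nav, int index) {
        this.nav = nav;
        this.index = index;
    }

    ToyNav getNav() { return nav; }
    int index() { return index; }
}

final class PositionCacheDemo {
    // One sequential pass records a bookmark per key; later lookups skip navigation.
    static Map<String, NavContext> buildCache(ToyNav nav) {
        Map<String, NavContext> cache = new HashMap<>();
        do {
            cache.put(nav.current(), nav.getCtx());
        } while (nav.next());
        return cache;
    }

    public static void main(String[] args) {
        ToyNav nav = new ToyNav("ne:XPTO", "ne:ABC", "tpath:1");
        Map<String, NavContext> cache = buildCache(nav);
        nav.setPos(cache.get("ne:ABC"));   // jump straight to the cached position
        System.out.println(nav.current()); // prints ne:ABC
    }
}
```

The cache is an ordinary HashMap owned by the caller, which matches the point of the proposal: the bookmarks live in whatever data structure the application chooses, and the navigator only has to produce and accept them.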