From: Jimmy Z. <cra...@co...> - 2007-02-14 03:02:49
Yes, this feature is due in 2.0, coming in a few days :)

----- Original Message -----
From: "Rodrigo Cunha" <rn...@gm...>
To: <vtd...@li...>
Sent: Tuesday, February 13, 2007 5:40 PM
Subject: [Vtd-xml-users] Random Access Proposal (take 2)

> Hello there!
>
> About 10 months ago I started a topic on the discussion forum concerning
> the need for true random access and location storing in VTD-XML.
> Currently we only have a pop/push interface.
>
> At the time I had no compelling reasons to advise such a change, from a
> pure stack-oriented approach to a more flexible one. I changed the
> API, but my changes were not merged into the project, due to the lack of
> compelling reasons and a somewhat bad design as well.
>
> Now, after using VTD-XML for a few months to work with huge and complex
> files, I have a reason: position caching.
>
> Let me give an example, taken from a real problem I faced:
>
> <document>
>   [...]
>   [...]
>   <nes>
>     <ne>
>       <name>XPTO</name>
>       [...]complex structure describing NE[...]
>       <level1>
>         <level2>
>           [....]
>           <nice_indexing_atribute>
>         </level2>
>       </level1>
>     </ne>
>     [...]
>     a bunch of NEs...
>     [...]
>     [...]
>     [...]
>     [...]
>   </nes>
>   [...]
>   [...]
>   [...]
>   <tpaths>
>     <tpath>
>       [...]complex structure describing tpath[...]
>       <level1>
>         <a few more levels>
>         <level4>
>           [....]
>           <nice_indexing_atribute>
>           <pointer to nice NE atribute>
>           [....]
>         </level2>
>       </level1>
>       <level1>
>         <a few more levels>
>         <level4>
>           [....]
>           <pointer to nice NE atribute>
>           [....]
>         </level2>
>       </level1>
>     </tpath>
>     [...]
>     more paths...
>     [...]
>   </tpaths>
>   [...]
> </document>
>
>
> In order to navigate the file efficiently and produce interactive
> results, I was forced to maintain position caches for both <ne> and
> <tpath>, indexed by those nice, deeply nested attributes.
>
> For example, a task that took 36 seconds using unaided navigation can
> now be done in 1 or 2 seconds.
>
> I had previously changed the API to allow multiple stacks and
> context export, but as previously mentioned, keeping a context unrelated
> to a VTDNav object does not make much sense. Perhaps a better operation
> would be something like:
>
> NavContext VTDNav.getCtx(); // sends back a context
>
> boolean VTDNav.setPos(NavContext ctx); // sets internal navigation registers from context
>
> VTDNav NavContext.getNav(); // gets the VTDNav object this context belongs to
>
> The context would internally point at a VTDNav, so that they could check
> each other when they need to. An exception could be thrown if an
> unrelated context is used in setPos, or simply "false" could be
> returned.
>
> Additionally, contexts should support some interfaces so that they can be
> kept in hash tables efficiently, for example... but that's not normally
> a problem.
>
> I'm currently using this kind of approach for caching and true random
> access with my previous interface that exported multiple stacks, but that's
> cumbersome, heavy and error-prone. A lighter interface like the one
> I'm proposing now, and better implemented, would be much better and
> cleaner too.
>
> Any comments?
>
> --
> Rodrigo Cunha
>
> _______________________________________________
> Vtd-xml-users mailing list
> Vtd...@li...
> https://lists.sourceforge.net/lists/listinfo/vtd-xml-users
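
For anyone following the thread, here is a rough, self-contained sketch of how the getCtx / setPos / getNav idea quoted above might fit together with a position cache. It does not use the VTD-XML classes at all: SketchNav stands in for VTDNav, its int[] cursor stands in for the internal navigation registers, and PositionCache is a made-up helper. All of these names and the cursor layout are assumptions for illustration only, not the library's API. Of the two options mentioned in the proposal, this sketch returns false for an unrelated context rather than throwing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for VTDNav; only the cursor state matters here.
final class SketchNav {
    // Assumed layout: one index per nesting depth.
    private int[] cursor = {0};

    void moveTo(int... path) { cursor = path.clone(); }

    int[] position() { return cursor.clone(); }

    // Snapshot of the current position, tied to this navigator.
    NavContext getCtx() { return new NavContext(this, cursor); }

    // Restores a snapshot; refuses contexts taken from another navigator.
    boolean setPos(NavContext ctx) {
        if (ctx == null || ctx.getNav() != this) {
            return false;
        }
        cursor = ctx.copyOfCursor();
        return true;
    }
}

// Immutable snapshot, usable as a value or key in hash-based collections.
final class NavContext {
    private final SketchNav owner;
    private final int[] cursor;

    NavContext(SketchNav owner, int[] cursor) {
        this.owner = owner;
        this.cursor = cursor.clone();
    }

    SketchNav getNav() { return owner; }

    int[] copyOfCursor() { return cursor.clone(); }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof NavContext)) {
            return false;
        }
        NavContext other = (NavContext) o;
        return owner == other.owner && Arrays.equals(cursor, other.cursor);
    }

    @Override
    public int hashCode() {
        return 31 * System.identityHashCode(owner) + Arrays.hashCode(cursor);
    }
}

// Position cache keyed by an indexing attribute value (e.g. an NE name).
final class PositionCache {
    private final Map<String, NavContext> byKey = new HashMap<>();

    void remember(String key, SketchNav nav) { byKey.put(key, nav.getCtx()); }

    // Jumps straight to a cached position; false if unknown or foreign.
    boolean jumpTo(String key, SketchNav nav) {
        NavContext ctx = byKey.get(key);
        return ctx != null && nav.setPos(ctx);
    }
}
```

The intended usage would be to walk the document once, call remember() at each <ne> or <tpath> with its indexing attribute as the key, and afterwards call jumpTo() instead of navigating from the root again. Because each context keeps a reference to its navigator, a context handed to the wrong navigator is rejected instead of silently corrupting its position.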