From: Jimmy Z. <cra...@co...> - 2007-03-06 04:45:01
> Just an FYI: I have cases where the key is an Integer, and cases where
> it's a string.

By Integer, do you mean the Java class or just the primitive data type?
Maybe I can modify Rodrigo's class and put it into CVS so you guys can use
it immediately... however, I can't guarantee that it will be included in
the next release... Would that work?

>
>> If instead of keeping a context I can keep a simple integer and then
>> order a VTDNav "hey you, get this integer you told me to keep and go to
>> the node you bookmarked" I would say it's ok, if the operation "get to
>> the node" is fast.
>>
>> So, you're suggesting an API that would work like this:
>>
>> RandomNodeRecorder xpto = new RandomNodeRecorder(navigator);
>> // xpto is the bookmark keeper organized in a way Jimmy likes :-)
>> int mark = xpto.keepPos();
>> /* do some stuff here */
>> boolean found = xpto.fetchPos(mark); // back to the bookmarked node
>> xpto.del(mark); // don't need the mark any longer
>>
>> I still fail to understand why a context shouldn't be kept outside the
>> structures you seem to like :-)
>
> Well, I'd be interested in knowing the time/space trade-offs for both.
> For one specific case, I could have an int as the key, and an int as the
> mark/vtd-node. Both ints could be native ints with fastutil. Maybe the CPU
> overhead is much smaller with SimpleContext though... it would be nice to
> see what Jimmy has in mind (the details).
>
>> Memory is cheap, and for example, if I keep a hash of NEs, and each NE
>> occupies a few KB itself, it's irrelevant if I'm gonna use a few more
>> bytes for each NE.
>>
>> I'm not suggesting one should keep large structures containing every
>> single node in the document, ok? But the random access to a cached node
>> must be fast. I emphasize: fast random access to cached nodes.
>
> +1
>
>> As far as I understand the SimpleContext structure grows 4 bytes for each
>> depth level, so a deeper node consumes more space, right? So a really
>> deep node, let's say, at level 10, will consume 40 extra bytes, plus the
>> base consumption... that's 48 bytes, quite small, unless the node is
>> small and irrelevant.
>
> It is small, but I'm looking at about 6k indexes to cache per document,
> and as many documents cached as possible. Over-guessing at 100 bytes per
> SimpleContext (total) would mean 600KB of SimpleContext objects per
> document. I understand my use cases deal with larger than normal
> documents, but that just means I have so much more to gain from random
> access.
>
> Cheers.
>
> --
> http://www.ScheduleWorld.com/
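
For what it's worth, here is a rough sketch of the int-handle idea quoted above, just to make the trade-off concrete. It is not VTD-XML code: the class name IntHandleBookmarks is made up, and representing a saved context as a plain int[] (one int per depth level) is an assumption for illustration. Only the method names keepPos/fetchPos/del come from the API sketch in the thread, and the size estimate uses the numbers quoted there (4 bytes per level, 48 bytes at level 10, which implies an 8-byte base).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical int-handle bookmark store; NOT part of VTD-XML. */
class IntHandleBookmarks {
    // Saved contexts, indexed by handle; a null slot means "deleted".
    private final List<int[]> slots = new ArrayList<>();
    // Handles released by del(), reused before the list grows.
    private final Deque<Integer> free = new ArrayDeque<>();

    /** Stores a copy of the context (assumed: one int per depth level). */
    int keepPos(int[] context) {
        int[] copy = context.clone();
        if (!free.isEmpty()) {
            int handle = free.pop();
            slots.set(handle, copy);
            return handle;
        }
        slots.add(copy);
        return slots.size() - 1;
    }

    /** Returns a copy of the saved context, or null if the handle is not live. */
    int[] fetchPos(int handle) {
        if (handle < 0 || handle >= slots.size()) {
            return null;
        }
        int[] saved = slots.get(handle);
        return saved == null ? null : saved.clone();
    }

    /** Releases the handle; returns false if it was not live. */
    boolean del(int handle) {
        if (handle < 0 || handle >= slots.size() || slots.get(handle) == null) {
            return false;
        }
        slots.set(handle, null);
        free.push(handle);
        return true;
    }

    /** Estimate from the thread: assumed 8-byte base plus 4 bytes per level. */
    static long estimatedBytes(int depth) {
        return 8L + 4L * depth;
    }
}
```

With this shape the lookup is a single array index, so "get to the node" costs whatever restoring the context costs. The space question stays the same as in the thread: at the over-guess of 100 bytes per saved context, 6,000 bookmarks come to 6,000 x 100 = 600,000 bytes per document.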