From: Jimmy Z. <cra...@co...> - 2007-03-20 20:57:25
Ok, I see... it seems that you can be sure that the "chunks" of the GML files contain what the user would need.

But in general, if the chunks don't have what one is looking for, you will have to load another chunk, then another... that could mean a lot of disk activity.

As an alternative, would it be possible to split the GML into small, well-formed GML files and then index them individually? So instead of dealing with 10 big GML files, you would split them into 100 smaller GML files, and the algorithm you describe may still work.

----- Original Message -----
From: Fernando Gonzalez
To: vtd...@li...
Sent: Tuesday, March 20, 2007 2:39 AM
Subject: Re: [Vtd-xml-users] Storing parsing info
From: Tatu S. <cow...@ya...> - 2007-03-20 18:25:49
--- Fernando Gonzalez <fer...@gm...> wrote:

> As far as I know about GML files... It only defines data types to be used
> on your application-specific schema. If you have a concrete
> application-specific schema, congratulations. Use a mapping tool like JAXB
> or Castor or whatever. If you haven't, you can't map because you don't have
> a schema to map. On this point I agree with Jimmy

Or an equivalent object hierarchy... a schema isn't strictly necessary, fortunately. But an object model has to exist and be (or be made) compatible. JiBX is another fine candidate.

> that you should use XPath to obtain the data of interest and I think VTD-XML
> is very useful.

I don't know anything about GML specifically, but the point I tried to make is that the question was specifically (even if mistakenly) asking about xml/object mapping. An alternative to that may be using VTD-XML with XPath ("instead of object mapping, why not just directly extract the data you need"). I just want to keep the terminology accurate -- otherwise users' expectations may not be met (if they really have an object model, as was implied).

-+ Tatu +-
From: Fernando G. <fer...@gm...> - 2007-03-20 09:44:18
As far as I know about GML files... It only defines data types to be used on your application-specific schema. If you have a concrete application-specific schema, congratulations. Use a mapping tool like JAXB or Castor or whatever. If you haven't, you can't map because you don't have a schema to map. On this point I agree with Jimmy that you should use XPath to obtain the data of interest, and I think VTD-XML is very useful.

Fernando

On 3/19/07, Tatu Saloranta <cow...@ya...> wrote:
>
> --- Peter Neu <pet...@gm...> wrote:
>
> > Hello,
> >
> > someone on a different mailing list suggested vtd is a good tool for
> > mapping GML to Java Objects. How is this done in detail?
> > Are there any tutorials on this?
>
> Hmmh. This actually sounds like a strange piece of advice, depending on
> what "good" means. VTD does offer high performance for efficient
> tree-based access to xml content. But it does not (and is not meant to,
> I think) offer any support for actual mapping between xml and Java.
> That's what data binding (and/or serialization) toolkits/libs offer. And
> on top of that, there isn't any specific support for GML, and likewise,
> that sounds like something to be done in higher-level tools.
> VTD could of course be used as the underlying parser component for such
> tools. I don't know if there are such tools out there yet.
>
> As to binding data to Java objects, JAXB 2.0 is widely considered to be a
> good and reasonably fast library. As far as I know, it currently supports
> just SAX and StAX parsers under the hood. Since much of the overhead is at
> the binding level (constructing Java objects, populating fields), parser
> choice may not affect overall performance that much.
>
> So regarding goodness: if raw performance is the main goal, using VTD
> might make sense. If so, you would NOT want to map things to objects, but
> rather deal with raw xml entities directly.
> This would be a trade-off, and the resulting code could very well be hard
> to use or understand, so its goodness would depend a lot on priorities.
>
> -+ Tatu +-
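To make the "use XPath to pull out the data of interest instead of mapping to objects" idea concrete, here is a minimal sketch. It is only an illustration: it uses the JDK's built-in `javax.xml.xpath` API rather than VTD-XML's own XPath support, and the document shape (`layer`, `road`, `name`, `lanes`), the sample values and the `roadNames` helper are all made up for the example, not taken from any GML schema.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: extract values with XPath, no object model involved. */
public class XPathExtractSketch {

    // Toy, namespace-free document standing in for an application-specific GML file.
    static final String SAMPLE =
        "<layer>"
        + "<road><name>north</name><lanes>4</lanes></road>"
        + "<road><name>south</name><lanes>2</lanes></road>"
        + "</layer>";

    /** Returns the names of all roads with at least minLanes lanes, in document order. */
    static List<String> roadNames(String xml, int minLanes) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String query = "/layer/road[lanes >= " + minLanes + "]/name";
        NodeList nodes = (NodeList) xpath.evaluate(
            query, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws XPathExpressionException {
        System.out.println(roadNames(SAMPLE, 3));
    }
}
```

The query string is the only place that knows about the document structure, which is the trade-off discussed above: no classes to generate or keep in sync with a schema, but also no typed objects to hand to the rest of the program.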
From: Fernando G. <fer...@gm...> - 2007-03-20 09:39:41
On 3/20/07, Jimmy Zhang <cra...@co...> wrote:
>
> So what you are trying to accomplish is to load all the GML docs into
> memory at once...
> I guess you can simply index all those files to avoid parsing...
> but I still don't seem to understand the benefits of reading the parse info
> and a chunk of the XML file..

Quite near. What I need is to access a random feature at any time with as low a cost as possible. That would be possible by loading all the GML docs in memory, but the GML files are very big so I cannot do it.

As that solution wasn't suitable for my problem, I thought of opening one file at a time (using buffer reuse), and then it came to my mind that I could save parsing time by storing the parse info. As I said before, I cannot delete the GML. Storing the GML twice would waste disk space. I'm talking about an environment where the user can have a lot of digital cartography on his computer. Disk space is quite a bottleneck. It could be valid, but storing only the parse info was so easy that I did it and I obtained a better solution (for my environment).

There is a use case where the user doesn't work with the files directly, but with a spatial region. In this case, the GML files and other spatial data are "layers", so the user can work with a lot of files at the same time. These files can be in formats other than GML (satellite images, different raster or vector formats), and these can bring the system to an even more memory-constrained situation. That's what led me to load chunks of the GML file.

The workflow is the following:

* I open a file with the chunk approach
* I parse the file (loading it with the chunk approach takes a long time, but no problem)
* I store the parse info

The user asks for information:

* I load the parse info
* I load the chunk
* I return the requested information

I want to speed up the requests for information because the user can ask for a map image with 20 GML files, and the map code is something like this:

for each gml file
    guess what "features" are inside the map bounds (GML is spatially indexed beforehand)
    get those features from the GML (random access) (load parse info + load chunk + return info)
    draw the features on an image
next gml file

Maybe this will make things a bit clearer. This screenshot (http://www.gvsig.gva.es/fileadmin/conselleria/images/Documentacion/capturas/raster_shp_dgn_750.gif) shows a program that uses the library. You can see on the left all the loaded (from the user's point of view) files: four "dgn" files, one "shp" and seven "ecw" files. A lot of the operations done in the map are done over *every* file listed on the left, so I don't care how much time it takes to put all those files on the left (generating parse info, etc). I care how much time it takes to read the information after they are loaded (again, from the user's point of view).

Well, I hope it's clear enough. Notice that I'm not proposing to change the way VTD-XML works; I'm proposing to add new ways.

greetings,
Fernando
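The workflow above (parse once, store the parse info, later load the parse info plus one chunk) could be sketched roughly as follows. This is not VTD-XML code and not Fernando's implementation: the "parse info" here is just a list of byte offsets and lengths, the scanner is a naive string search standing in for a real parser, and every name (`FeatureIndexSketch`, `buildIndex`, `saveIndex`, `loadIndex`, `readFeature`, the `feature` element) is hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: parse info kept in a side file, features read by random access. */
public class FeatureIndexSketch {

    private static final String OPEN = "<feature ";
    private static final String CLOSE = "</feature>";

    /** Slow one-time step: record {byteOffset, byteLength} of every feature element. */
    static List<long[]> buildIndex(Path xml) throws IOException {
        // ISO-8859-1 maps one byte to one char, so char positions equal byte offsets.
        String text = new String(Files.readAllBytes(xml), StandardCharsets.ISO_8859_1);
        List<long[]> spans = new ArrayList<>();
        int start = text.indexOf(OPEN);
        while (start >= 0) {
            int end = text.indexOf(CLOSE, start);
            if (end < 0) {
                break; // unterminated element: stop indexing
            }
            end += CLOSE.length();
            spans.add(new long[] {start, end - start});
            start = text.indexOf(OPEN, end);
        }
        return spans;
    }

    /** Stores only the parse info; the XML itself is not copied. */
    static void saveIndex(Path indexFile, List<long[]> spans) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(indexFile))) {
            out.writeInt(spans.size());
            for (long[] span : spans) {
                out.writeLong(span[0]);
                out.writeLong(span[1]);
            }
        }
    }

    static List<long[]> loadIndex(Path indexFile) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(indexFile))) {
            int count = in.readInt();
            List<long[]> spans = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                long offset = in.readLong();
                long length = in.readLong();
                spans.add(new long[] {offset, length});
            }
            return spans;
        }
    }

    /** Reads a single chunk of the XML file instead of the whole document. */
    static String readFeature(Path xml, long[] span) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(xml.toFile(), "r")) {
            byte[] chunk = new byte[(int) span[1]];
            file.seek(span[0]);
            file.readFully(chunk);
            return new String(chunk, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path xml = Files.createTempFile("layer", ".gml");
        Path idx = Files.createTempFile("layer", ".idx");
        String doc = "<layer><feature id=\"a\">river</feature>"
            + "<feature id=\"b\">road</feature></layer>";
        Files.write(xml, doc.getBytes(StandardCharsets.UTF_8));

        saveIndex(idx, buildIndex(xml));      // once, when the layer is added
        List<long[]> spans = loadIndex(idx);  // per request, no re-parse
        System.out.println(readFeature(xml, spans.get(1)));
    }
}
```

In the map-drawing loop, the spatial index would decide which entries of `spans` fall inside the map bounds, and only those chunks would be read; the cost per layer is then one small index read plus a few seeks, rather than a full read and parse.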
From: Mark S. <ma...@Sc...> - 2007-03-20 02:57:06
Jimmy Zhang wrote:
>
> But putting XML in a separate file also raises the issue of
> synchronization
> (what if one changes the original document, but not the VTD file)... if
> you use the
> VTD-XML 2.0 API, the *only* way you can generate a VTD+XML file is
> after a successful parse; this enforces the synchrony of the VTD and the
> corresponding XML ...
>

The synchronization issue is important to me. A single encapsulated file is an excellent feature for me. At the very least, please always provide an option to keep things in a single encapsulated byte[]. I strongly suspect I will never write the management code to keep the separate files in sync as long as I have a choice.

Cheers.
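For anyone who does keep the parse info in a separate file, the "management code" mentioned above might start with a staleness check like the one sketched here. This is an assumption about how one could do it, not something VTD-XML provides: the idea is to store a fingerprint of the XML (its length and a CRC-32) alongside the parse info and refuse to use the parse info when the fingerprint no longer matches. The names `IndexSyncCheck`, `fingerprint` and `inSync` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical sketch: detect that stored parse info no longer matches its XML. */
public class IndexSyncCheck {

    /** Fingerprint to store next to the parse info: {length in bytes, CRC-32 of the content}. */
    static long[] fingerprint(byte[] xml) {
        CRC32 crc = new CRC32();
        crc.update(xml, 0, xml.length);
        return new long[] {xml.length, crc.getValue()};
    }

    /** True only if the XML seen now matches the XML the parse info was generated from. */
    static boolean inSync(long[] stored, byte[] currentXml) {
        long[] now = fingerprint(currentXml);
        return stored[0] == now[0] && stored[1] == now[1];
    }

    public static void main(String[] args) {
        byte[] original = "<layer><feature>road</feature></layer>".getBytes(StandardCharsets.UTF_8);
        byte[] edited = "<layer><feature>rail</feature></layer>".getBytes(StandardCharsets.UTF_8);

        long[] stored = fingerprint(original);       // saved when the parse info is written
        System.out.println(inSync(stored, original)); // true
        System.out.println(inSync(stored, edited));   // false
    }
}
```

Note that a checksum requires reading the whole XML file, which gives back some of the time the separate parse info was meant to save; comparing only the file length and last-modified time would be cheaper but weaker. A single encapsulated file avoids the question altogether, which is the point made above.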
From: Jimmy Z. <cra...@co...> - 2007-03-20 01:38:11
So what you are trying to accomplish is to load all the GML docs into memory at once...
I guess you can simply index all those files to avoid parsing...
but I still don't seem to understand the benefits of reading the parse info and a chunk of the XML file..

----- Original Message -----
From: Fernando Gonzalez
To: vtd...@li...
Sent: Monday, March 19, 2007 2:56 AM
Subject: Re: [Vtd-xml-users] Storing parsing info

Well, hehe, the computer is new but I don't think my disk is so fast. I think Java or the operating system has to cache something, because the first time I load the file it takes a bit more than 2 seconds and after the first load it only takes 300ms to read the file...
I have no experience doing benchmarks and maybe I am missing something. That's why I attached the program.

"So if you can't delete the original XML files, can you compress them and store them away (archiving)?"
I cannot delete nor archive the GML file because in this context it won't be rare for it to be read by two different programs at the same time... It's difficult to find an open source program that does everything you need. For example, in a development context, there may be a map server serving a map image based on a GML file while you are opening it to see some data in it.

"The other issue you raised is buffer reuse. To reuse internal buffers of VTDGen, you can call setDoc_BR(...). But there is more you can do... you can in fact reuse the byte array containing the XML document."
Buffer reuse absolutely solves my memory constraints. But the problem I see with buffer reuse is that it will force me to read and parse the whole XML file every time the user asks for information on another XML file, won't it? If I read the XML file by chunks and I store/read the parse information, each time the user asks for information on another XML file I only have to read the parse info and a chunk of the XML file.

To show you my point of view:
The "user asking for another XML file" may be a map server that reads some big GML files and draws their spatial information in a map image. If each time the map server draws a GML file and "changes" to the next it takes 2 seconds or so, the drawing of the map (all the GML files) takes too much time.

best regards,
Fernando

On 3/19/07, Jimmy Zhang <cra...@co...> wrote:
> What intrigues me with Fernando's test results is that it only takes
> 300ms to read a 100MB file? He got a super fast disk...
>
> ----- Original Message -----
> From: Rodrigo Cunha
> To: Jimmy Zhang
> Cc: Fernando Gonzalez ; vtd...@li...
> Sent: Sunday, March 18, 2007 8:40 PM
> Subject: Re: [Vtd-xml-users] Storing parsing info
>
> In fact the idea occurred to me in the past also... but VTD is so fast
> reading large files anyway! With a fast processor I think we might be
> disk-limited rather than processor-limited. Still, if the code is made
> already, the option seems cute enough to keep :-)
>
> Since I mainly deal with large files requiring a lot of processing this
> has not been an issue. Others, in different environments, might disagree.
>
> Jimmy Zhang wrote:
> > Fernando, The option for storing VTD in a separate file is open.
> > I attached the technical document from your last email, and am also
> > interested in the suggestions/comments from the mailing list ...
From: Tatu S. <cow...@ya...> - 2007-03-19 23:47:14
--- Rodrigo Cunha <rn...@gm...> wrote:

> You may force pre-compilation using -server I think. In fact I usually
> do so, and performance increases even for relatively short and not very
> cyclic programs.

Actually, not really: -server just changes the compilation threshold, and that can be done for both -client and -server setups. By default the thresholds are different, though.

Now, it is possible to set the threshold to 0 to really force pre-compilation, but based on my experience (and comments from Sun VM folks as well), that is often not a smart thing to do. Much of the code is only run once (initializations), and there is a net performance loss due to wasting memory on native code you are never going to run again. Perhaps there are other cases where this would help. Or perhaps it's more that the HotSpot VM is then not able to detect the real hot spots and cannot do more aggressive optimizations where needed, having to do minimum inlining everywhere.

Defining the '-server' flag, however, is usually a good thing to do, compared to the default -client (at least prior to Java 6, where it doesn't matter).

-+ Tatu +-
From: Tatu S. <cow...@ya...> - 2007-03-19 23:42:35
--- Boris Kolpackov <bo...@co...> wrote:

> Tatu Saloranta <cow...@ya...> writes:
>
> > So it is perfectly normal (and expected) that
> > loading+parsing things the first time is significantly
> > slower than the second time around.
>
> You forgot to prefix this statement with "For us, Java
> die-hards," ;-). For the rest of us this is neither
> normal nor expected.

Well, I think it should be more like "for us running on a managed runtime environment"? Or does C# naively just compile all code at startup?

But yes, my comments only relate to the JVM, as I don't know the intricacies of C#. And as to the C version, yeah, it should be more straightforward.

-+ Tatu +-
From: Rodrigo C. <rn...@gm...> - 2007-03-19 22:41:04
You may force pre-compilation using -server I think. In fact I usually do so, and performance increases even for relatively short and not very cyclic programs.
From: Jimmy Z. <cra...@co...> - 2007-03-19 22:06:36
|
Yep, C's performance is fairly constant since the object code is compiled ... I would be interested in C# as well; at this point it seems to me that C# is closer to C and JVM ...

----- Original Message -----
From: "Boris Kolpackov" <bo...@co...>
To: "Tatu Saloranta" <cow...@ya...>
Cc: <vtd...@li...>
Sent: Monday, March 19, 2007 1:35 PM
Subject: Re: [Vtd-xml-users] Storing parsing info
|
From: Boris K. <bo...@co...> - 2007-03-19 20:51:33
|
Tatu Saloranta <cow...@ya...> writes:

> So it is perfectly normal (and expected) that
> loading+parsing things the first time is significantly
> slower than the second time around.

You forgot to prefix this statement with "For us, Java die-hards," ;-). For the rest of us this is neither normal nor expected.

-boris
|
From: Tatu S. <cow...@ya...> - 2007-03-19 20:01:01
|
--- Jimmy Zhang <cra...@co...> wrote:

> Tatu, are you aware of any pointers that discuss
> the behavior of the server JVM a bit more...
> In addition to your suggestion, I also think that the

Javasoft has good online docs, so I think googling might work. It is possible to change the HotSpot compilation threshold via the -XX:CompileThreshold=2000 flag to the java executable, for example.

> behavior of the JVM has to do with file sizes as well...
> for a large GML, the JVM seems to reach
> steady state (native code) pretty quickly...

Yeah, that makes sense since bigger files trigger more processing. So it is likely that most tight loops do get compiled to native code during the first (single) processing of larger files. So for really big files, the effect may not be as significant as for small files.

It's still a good idea to run it multiple times, if possible.

-+ Tatu +-
|
From: Jimmy Z. <cra...@co...> - 2007-03-19 19:30:54
|
Tatu, are you aware of any pointers that discuss the behavior of the server JVM a bit more... In addition to your suggestion, I also think that the behavior of the JVM has to do with file sizes as well... for a large GML, the JVM seems to reach steady state (native code) pretty quickly...

Jimmy

----- Original Message -----
From: "Tatu Saloranta" <cow...@ya...>
To: "Fernando Gonzalez" <fer...@gm...>; <vtd...@li...>
Sent: Monday, March 19, 2007 12:19 PM
Subject: Re: [Vtd-xml-users] Storing parsing info

> --- Fernando Gonzalez <fer...@gm...> wrote:
>
>> Well, hehe, the computer is new but I don't think my disk is so fast. I think Java or the operating system has to cache something, because the first time I load the file it takes a bit more than 2 seconds and after the first load it only takes 300ms to read the file...
>> I have no experience in doing benchmarks and maybe I am missing something.
>> That's why I attached the program.
>
> One thing you need to know is that the JVM's HotSpot does incremental optimization. As such, calling anything the first time (or, rather, the first N times, where N can be something like 2000) will take much, much longer. This is because code is at first interpreted from bytecode. Only after the JVM notices that a particular part of the code is a so-called "hot spot" (called so often that it is where most time is spent) will it compile it to native code.
>
> Because of this, one must always ensure that there is enough so-called warm-up time to reach steady-state performance. That is, unless your use case is one where you only run things once, using a separate JVM instance.
>
> So it is perfectly normal (and expected) that loading+parsing things the first time is significantly slower than the second time around.
>
> If you want to ensure it's not due to OS-level disk block caching (which also affects results), you can warm things up using different XML files: the code paths should be the same or similar, so HotSpot will optimize the code, but the data is different, so the OS won't be able to cache it.
>
> Hope this helps,
>
> -+ Tatu +-
|
From: Tatu S. <cow...@ya...> - 2007-03-19 19:27:58
|
--- Peter Neu <pet...@gm...> wrote:

> Hello,
>
> someone on a different mailing list suggested vtd is a good tool for mapping GML to Java Objects. How is this done in detail?
> Are there any tutorials on this?

Hmmh. This actually sounds like a strange piece of advice, depending on what "good" means. VTD does offer high performance for efficient tree-based access to XML content. But it does not (and is not meant to, I think) offer any support for actual mapping between XML and Java. That's what data binding (and/or serialization) toolkits/libs offer. On top of that, there isn't any specific support for GML, and that likewise sounds like something to be done by higher-level tools.

VTD could of course be used as the underlying parser component for such tools. I don't know if there are such tools out there yet.

As to binding data to Java objects, JAXB 2.0 is widely considered to be a good and reasonably fast library. As far as I know, it currently supports just SAX and StAX parsers under the hood. Since much of the overhead is at the binding level (constructing Java objects, populating fields), parser choice may not affect overall performance that much.

So regarding goodness: if raw performance is the main goal, using VTD might make sense. If so, you would NOT want to map things to objects, but rather deal with raw XML entities directly. This would be a trade-off, and the resulting code could very well be hard to use or understand, so its goodness would depend a lot on priorities.

-+ Tatu +-
|
From: Tatu S. <cow...@ya...> - 2007-03-19 19:19:39
|
--- Fernando Gonzalez <fer...@gm...> wrote:

> Well, hehe, the computer is new but I don't think my disk is so fast. I think Java or the operating system has to cache something, because the first time I load the file it takes a bit more than 2 seconds and after the first load it only takes 300ms to read the file...
> I have no experience in doing benchmarks and maybe I am missing something.
> That's why I attached the program.

One thing you need to know is that the JVM's HotSpot does incremental optimization. As such, calling anything the first time (or, rather, the first N times, where N can be something like 2000) will take much, much longer. This is because code is at first interpreted from bytecode. Only after the JVM notices that a particular part of the code is a so-called "hot spot" (called so often that it is where most time is spent) will it compile it to native code.

Because of this, one must always ensure that there is enough so-called warm-up time to reach steady-state performance. That is, unless your use case is one where you only run things once, using a separate JVM instance.

So it is perfectly normal (and expected) that loading+parsing things the first time is significantly slower than the second time around.

If you want to ensure it's not due to OS-level disk block caching (which also affects results), you can warm things up using different XML files: the code paths should be the same or similar, so HotSpot will optimize the code, but the data is different, so the OS won't be able to cache it.

Hope this helps,

-+ Tatu +-
|
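The warm-up advice above can be sketched as a tiny timing harness. This is only an illustration under stated assumptions: `WarmupDemo`, `countTags` and `makeDoc` are hypothetical names, the workload is a toy stand-in for "load + parse" rather than VTD-XML itself, and the document is built in memory so that OS disk caching plays no role in the numbers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the workload stands in for a real parser.
class WarmupDemo {

    // Toy workload: count '<' characters in the document bytes.
    static int countTags(byte[] doc) {
        int n = 0;
        for (byte b : doc) {
            if (b == '<') {
                n++;
            }
        }
        return n;
    }

    // Build a synthetic document in memory so the disk is not involved.
    static byte[] makeDoc(int elements) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<f id=\"").append(i).append("\"/>");
        }
        return sb.append("</root>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Time a single run of the workload, in nanoseconds.
    static long timeOnce(byte[] doc) {
        long t0 = System.nanoTime();
        int tags = countTags(doc);
        long t1 = System.nanoTime();
        if (tags < 0) {
            throw new IllegalStateException(); // keeps the result in use
        }
        return t1 - t0;
    }

    public static void main(String[] args) {
        byte[] doc = makeDoc(200_000);
        long first = timeOnce(doc);              // first call, probably interpreted
        for (int i = 0; i < 50; i++) {
            timeOnce(doc);                       // warm-up runs, timings discarded
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < 10; i++) {
            best = Math.min(best, timeOnce(doc)); // measured runs
        }
        System.out.println("first run: " + first / 1000 + " us");
        System.out.println("after warm-up (best of 10): " + best / 1000 + " us");
    }
}
```

The actual numbers depend on the JVM and the machine, so none are shown here. For a real benchmark, the same pattern would wrap the actual load+parse call, ideally warming up on different files than the one being measured, as suggested above.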
From: Fernando G. <fer...@gm...> - 2007-03-19 09:56:50
|
Well, hehe, the computer is new but I don't think my disk is so fast. I think Java or the operating system has to cache something, because the first time I load the file it takes a bit more than 2 seconds and after the first load it only takes 300ms to read the file... I have no experience in doing benchmarks and maybe I am missing something. That's why I attached the program.

"So if you can't delete the original XML files, can you compress them and store them away (archiving)?"

I can neither delete nor archive the GML file because in this context it won't be rare for two different programs to be reading it at the same time... It's difficult to find an open source program that does everything you need. For example, in a development context, there may be a map server serving a map image based on a GML file while you are opening it to see some data in it.

"The other issue you raised is buffer reuse. To reuse internal buffers of VTDGen, you can call setDoc_BR(...). But there is more you can do... you can in fact reuse the byte array containing the XML document."

Buffer reuse absolutely solves my memory constraints. But the problem I see with buffer reuse is that it will force me to read and parse the whole XML file every time the user asks for information on another XML file, won't it? If I read the XML file in chunks and I store/read the parse information, each time the user asks for information on another XML file I only have to read the parse info and a chunk of the XML file.

To show you my point of view: the "user asking for another XML file" may be a map server that reads some big GML files and draws their spatial information in a map image. If each time the map server draws a GML file and "changes" to the next it takes 2 seconds or so, the drawing of the map (all the GML files) takes too much time.

best regards,
Fernando

On 3/19/07, Jimmy Zhang <cra...@co...> wrote:
> What intrigues me with Fernando's test results is that it only takes 300ms to read a 100MB file? He got a super fast disk...
|
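The "read the parse info and a chunk of the XML file" idea can be sketched with plain JDK I/O. This is a minimal sketch under assumptions: `ChunkReadDemo` and `readChunk` are hypothetical names, the stored "parse info" is reduced to one byte offset and length per feature, and nothing here reflects how VTD-XML actually lays out its index.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: names and index layout are illustrative only.
class ChunkReadDemo {

    // Read `length` bytes starting at `offset` without loading the whole file.
    static byte[] readChunk(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);
            raf.readFully(buf);
            return buf;
        }
    }

    // Writes a small document, then fetches one feature by its stored offset.
    static String demo() {
        String xml = "<root><f>alpha</f><f>beta</f></root>";
        try {
            Path file = Files.createTempFile("chunk-demo", ".xml");
            try {
                Files.write(file, xml.getBytes(StandardCharsets.US_ASCII));
                // Stand-in for saved parse info: where the second feature
                // starts and how long it is. In ASCII, char index == byte offset.
                long offset = xml.indexOf("<f>beta");
                int length = "<f>beta</f>".length();
                byte[] chunk = readChunk(file, offset, length);
                return new String(chunk, StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With an index like this kept next to each GML file, switching files costs one small index read plus one seek, instead of a full read and parse.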
From: Fernando G. <fer...@gm...> - 2007-03-19 09:25:05
|
Hi All,

Maybe this is a topic for another list... but anyway: Geotools provides an abstraction to access GML 2.x sources. The problem is that it provides sequential and read-only access. There are some other projects that access GML in some way. I know of Open Jump, gvSIG and kosmo. If the GML files you have to read aren't very big and you don't have to access a lot of them simultaneously, some of the solutions I have mentioned may work.

Pete, I don't know how much experience you have with GML (I'm quite a beginner), but I think GML is too complicated to write a parser for from scratch.

Currently I'm involved in gdbms (http://geosysin.iict.ch/irstv-trac/wiki/GDBMS/TechnicalIssues, http://gdbms.sourceforge.net), which is a general-purpose library to read/write different types of sources. I'm trying to build a GML driver for gdbms and that's why I'm on this mailing list, but I intend to reuse parsers from the mentioned projects as much as I can.

By the way, I would like to know which mailing list suggested vtd. Can you tell me?

greetings,
Fernando

On 3/19/07, Jimmy Zhang <cra...@co...> wrote:
> Do you start with a schema or not? I think you can just use XPath to create only objects of interest...
|
From: Jimmy Z. <cra...@co...> - 2007-03-19 07:56:25
|
Do you start with a schema or not? I think you can just use XPath to create only objects of interest...

----- Original Message -----
From: "Peter Neu" <pet...@gm...>
To: "'Jimmy Zhang'" <cra...@co...>
Sent: Monday, March 19, 2007 12:40 AM
Subject: AW: [Vtd-xml-users] HowTo use Vtd for GML to Java Mapping?

Hello,

I'm working on a project for a tool that gathers geo information from different data sources, e.g. different XML dialects, CSV and also GML. The data must first be mapped to Java objects in order to allow the (submitting) user to make adjustments to the data provided, and then everything is written to a db so that it can later be output in the same data type or in our own XML dialect.

Cheers,
Pete

> -----Original Message-----
> From: Jimmy Zhang [mailto:cra...@co...]
> Sent: Monday, 19 March 2007 08:33
> To: Peter Neu; Vtd...@li...
> Subject: Re: [Vtd-xml-users] HowTo use Vtd for GML to Java Mapping?
>
> Can you provide a bit more detail? For example, what is the starting point of your task?
> VTD-XML is a fast parser with built-in support for XPath ...
> If you have version 2.0, there are various short code examples in it to get you started...
|
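The suggestion to "use XPath to create only objects of interest" could look roughly like the sketch below. It is an assumption-heavy illustration: it uses the JDK's built-in `javax.xml.xpath` (DOM-backed) rather than VTD-XML's own XPath support, the `City` class and the element names are hypothetical, and the sample document is namespace-free, unlike real GML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; JDK XPath is a stand-in for the parser's XPath engine.
class XPathMappingDemo {

    // Hypothetical target object for the mapping.
    static final class City {
        final String name;
        final double x;
        final double y;

        City(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }
    }

    // Select only <city> elements and build one object per match;
    // everything else in the document is never turned into an object.
    static List<City> extractCities(String xml) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xp.evaluate(
                    "/features/city",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<City> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                // "pos" holds two whitespace-separated coordinates.
                String[] xy = xp.evaluate("pos", e).trim().split("\\s+");
                out.add(new City(e.getAttribute("name"),
                        Double.parseDouble(xy[0]),
                        Double.parseDouble(xy[1])));
            }
            return out;
        } catch (XPathExpressionException ex) {
            throw new IllegalArgumentException("could not evaluate XPath", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<features><road id='r1'/>"
                + "<city name='Bern'><pos>7.45 46.95</pos></city>"
                + "<city name='Lyon'><pos>4.83 45.76</pos></city></features>";
        for (City c : extractCities(xml)) {
            System.out.println(c.name + " at " + c.x + ", " + c.y);
        }
    }
}
```

The same selective approach should carry over to any XPath-capable parser: the expression decides which nodes become objects, so the binding cost is paid only for features the user actually edits.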
From: Jimmy Z. <cra...@co...> - 2007-03-19 07:37:57
|
What intrigues me with Fernando's test results is that it only takes 300ms to read a 100MB file? He got a super fast disk...

----- Original Message -----
From: Rodrigo Cunha
To: Jimmy Zhang
Cc: Fernando Gonzalez ; vtd...@li...
Sent: Sunday, March 18, 2007 8:40 PM
Subject: Re: [Vtd-xml-users] Storing parsing info

In fact the idea occurred to me in the past also... but VTD is so fast reading large files anyway! With a fast processor I think we might be disk-limited rather than processor-limited. Still, if the code is already written, the option seems cute enough to keep :-)

Since I mainly deal with large files requiring a lot of processing this has not been an issue. Others, in different environments, might disagree.

Jimmy Zhang wrote:
> Fernando, The option for storing VTD in a separate file is open.
> I attached the technical document from your last email, and am also interested in the suggestions/comments from the mailing list ...
|
From: Jimmy Z. <cra...@co...> - 2007-03-19 07:33:46
|
Can you provide a bit more detail? For example, what is the starting point of your task?
VTD-XML is a fast parser with built-in support for XPath ...
If you have version 2.0, there are various short code examples in it to get you started...

----- Original Message -----
From: "Peter Neu" <pet...@gm...>
To: <Vtd...@li...>
Sent: Monday, March 19, 2007 12:06 AM
Subject: [Vtd-xml-users] HowTo use Vtd for GML to Java Mapping?

> Hello,
>
> someone on a different mailing list suggested vtd is a good tool for mapping GML to Java Objects. How is this done in detail?
> Are there any tutorials on this?
>
> Cheers,
> Pete
|
From: Peter N. <pet...@gm...> - 2007-03-19 07:07:22
|
Hello, someone on a different mailing list suggested vtd is a good tool for mapping GML to Java Objects. How is this done in detail? Are there any tutorials on this? Cheers, Pete |
From: Rodrigo C. <rn...@gm...> - 2007-03-19 03:40:23
|
In fact the idea occurred to me in the past also... but VTD is so fast reading large files anyway! With a fast processor I think we might be disk-limited rather than processor-limited. Still, if the code is already written, the option seems cute enough to keep :-)

Since I mainly deal with large files requiring a lot of processing this has not been an issue. Others, in different environments, might disagree.

Jimmy Zhang wrote:
> Fernando, The option for storing VTD in a separate file is open.
> I attached the technical document from your last email, and am also interested in the suggestions/comments from the mailing list ...
|
From: Mark S. <ma...@Sc...> - 2007-03-15 18:13:49
|
Jimmy Zhang wrote:
> hmm...interesting... "plup" sounds like a tennis racket hitting the ground...
> it may work :-)
>
> Please take a look at the file below and let me know what you think of it ...
>
> http://vtd-xml.cvs.sourceforge.net/*checkout*/vtd-xml/ximple-dev/com/ximpleware/BookMark.java

It looks fine on the surface. I am thinking of incorporating it into something I'm working on right now - if I can just get through the 'other stuff' first...

Cheers.

--
http://www.ScheduleWorld.com/
|
From: Rodrigo C. <rn...@gm...> - 2007-03-15 15:46:52
|
Jimmy, the API itself seems quite OK to me. I had made the decision not to keep a pointer to VN inside the context, but for extra safety this is both highly recommended and adds little extra fat, so let's keep it your way.

Ah... since you keep a pointer to VTDNav you should add a method "getNavigator". This will spare some parameter passing to functions that might use a Bookmark. It's functionally irrelevant but convenient and might even add some efficiency.

Now the inner workings: the implementation of both equals and hashCode could be different. Note that in typical usage inside data structures my complex methods asymptotically get as fast as yours, since they end up comparing pre-calculated hashCodes, considering there are probably no collisions. Also, for maximum efficiency hashCode should be a spread function. Using index numbers does not typically spread the hashCode amongst the available 2^32 values; still, this is a minor observation, don't bother with that :-) And we don't have collisions, so it's quite OK.

If Mark also agrees I think this new class is ready for prime time.

Jimmy Zhang wrote:
> hmm...interesting... "plup" sounds like a tennis racket hitting the ground...
> it may work :-)
>
> Please take a look at the file below and let me know what you think of it ...
>
> http://vtd-xml.cvs.sourceforge.net/*checkout*/vtd-xml/ximple-dev/com/ximpleware/BookMark.java
|
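The equals/hashCode remarks above can be illustrated with a small stand-alone class. This is a hypothetical sketch, not the BookMark class from CVS: `NodeMark`, its fields and `getOwner` are invented for illustration, a plain `Object` stands in for the navigator, and a single `int` stands in for the recorded position. The multiplier is one common choice for spreading small integers across the 32-bit range.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookmark-like value; not the real VTD-XML BookMark.
final class NodeMark {
    private final Object owner; // stands in for the navigator the mark belongs to
    private final int index;    // stands in for the recorded node index

    NodeMark(Object owner, int index) {
        this.owner = owner;
        this.index = index;
    }

    // Accessor in the spirit of the suggested "getNavigator".
    Object getOwner() {
        return owner;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof NodeMark)) {
            return false;
        }
        NodeMark other = (NodeMark) o;
        // Same navigator instance and same position.
        return owner == other.owner && index == other.index;
    }

    @Override
    public int hashCode() {
        // Multiplicative mixing so that consecutive indexes do not
        // produce consecutive hash codes.
        int h = index * 0x9E3779B9;
        h ^= h >>> 16;
        return h ^ System.identityHashCode(owner);
    }
}

class NodeMarkDemo {
    public static void main(String[] args) {
        Object nav = new Object();
        Map<NodeMark, String> labels = new HashMap<>();
        labels.put(new NodeMark(nav, 42), "feature 42");
        // A second, equal mark finds the same entry.
        System.out.println(labels.get(new NodeMark(nav, 42)));
    }
}
```

As noted above, the spreading is a refinement rather than a necessity: plain index numbers already give collision-free hash codes within one document.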
From: Jimmy Z. <cra...@co...> - 2007-03-15 03:37:26
|
hmm...interesting... "plup" sounds like a tennis racket hitting the ground... it may work :-)

Please take a look at the file below and let me know what you think of it ...

http://vtd-xml.cvs.sourceforge.net/*checkout*/vtd-xml/ximple-dev/com/ximpleware/BookMark.java

----- Original Message -----
From: "Rodrigo Cunha" <rn...@gm...>
To: "Jimmy Zhang" <cra...@co...>
Cc: "Mark Swanson" <ma...@Sc...>; <vtd...@li...>
Sent: Wednesday, March 14, 2007 4:19 AM
Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)

> Yes, that would be a nice add :-)
>
> It wouldn't solve the exception handling or true random access issues though, but it would be a nice addon.
>
> I propose "plup()" hehe
>
> Jimmy Zhang wrote:
>> Rodrigo, how about adding a method (tentatively called set()) that basically does pop() and push() in one shot, so your example
>>
>> push();
>> // do something
>> pop();
>> push();
>> // do something
>> pop();
>> push();
>> // do something
>> pop();
>> push();
>> // do something
>> pop();
>>
>> becomes
>>
>> push();
>> // do something
>> set();
>> // do something
>> set();
>> // do something
>> set();
>> // do something
>> pop();
>>
>> ----- Original Message -----
>> From: "Rodrigo Cunha" <rn...@gm...>
>> To: "Mark Swanson" <ma...@Sc...>; "Jimmy Zhang" <cra...@co...>
>> Cc: <vtd...@li...>
>> Sent: Tuesday, March 06, 2007 9:24 AM
>> Subject: Re: [Vtd-xml-users] Random Access Proposal (take 2)
>>
>>> Hum, thanks for the hint also Mark. I normally use java.util collections, but I'll give fastutil a look, since it might be better for some of my problems.
>>>
>>> Jimmy, a context can be imported/exported as an array of integers, this being a bit raw perhaps... and leaving the programmer with the job of storing that array. I just created SimpleContext as a way of abstracting that implementation detail, but since you like performance, perhaps you could leave that raw. I still prefer the abstracted version... besides, the internal representation might change and the abstracted version would still work.
>>>
>>> So, yes, go ahead with my API even if you decide to change it later.
>>>
>>> Just as a little silly footnote, I found my API is also useful in cases where you have:
>>>
>>> push();
>>> // do something
>>> pop();
>>> push();
>>> // do something
>>> pop();
>>> push();
>>> // do something
>>> pop();
>>> push();
>>> // do something
>>> pop();
>>>
>>> This can be abbreviated to:
>>>
>>> setCtxFromNav(ctx);
>>> // do something
>>> setNavFromCtx(ctx);
>>> // do something
>>> setNavFromCtx(ctx);
>>> // do something
>>> setNavFromCtx(ctx);
>>> // do something
>>> setNavFromCtx(ctx);
>>>
>>> This is both faster and more readable.
>>>
>>> It's also useful for things like:
>>>
>>> xpto = getComplexToFindTransmissionPath(VTDNav n, String s); // might even use caching!
>>> push();
>>> nav.setNavFromCtx(xpto);
>>> // Do something
>>> pop();
>>>
>>> and also for exception handling:
>>>
>>> try {
>>> setCtxFromNav(xpto);
>>> // do something complex and error-prone,
>>> // so I won't use the stack, but a few locally declared contexts instead
>>> // oops! crash! go to handler!
>>> } finally {
>>> setNavFromCtx(xpto);
>>> // we are now clean
>>> }
>>>
>>> So, it made using VTD much more enjoyable to me, even in rather trivial situations.
>>>
>>> Mark Swanson wrote:
>>>> Jimmy Zhang wrote:
>>>>>
>>>>>> Just a FYI: I have cases where the key is an Integer, and cases where it's a string.
>>>>>
>>>>> By Integer is it a java class? or just a primitive data type? Maybe I can modify Rodrigo's class and put it into CVS so you guys can use it immediately... however, I can't guarantee that it will be included in the next release... Would that work?
>>>>
>>>> Oh, I always use native ints and fastutil wherever possible.
>>>>
>>>> Just a thought: I use autojar on my code to build a tiny fastutil jar that just has the code I need. You could do the same thing to get excellent native collections instead of writing your own. I see you already wrote your own, but in case you need more.. Fastutil uses the LGPL.
>>>>
>>>> Cheers.
|
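The difference between the push()/pop() stack and a saved context, including the exception-handling case quoted above, can be sketched with a toy cursor. Everything here is hypothetical: `Cursor`, `save`, `restore` and `visitAll` are invented names, and a single `int` stands in for a navigator's full position, so this shows the pattern rather than the VTDNav or BookMark API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntConsumer;

// Hypothetical sketch of the two styles discussed in the thread.
class CursorDemo {

    // Toy cursor: one int stands in for a navigator's position.
    static final class Cursor {
        private int position;
        private final Deque<Integer> stack = new ArrayDeque<>();

        void moveTo(int p) { position = p; }
        int position() { return position; }

        // Stack style: every push must be matched by exactly one pop.
        void push() { stack.push(position); }
        void pop() { position = stack.pop(); }

        // Snapshot style: the caller keeps the saved position itself.
        int save() { return position; }
        void restore(int saved) { position = saved; }
    }

    // Visit several targets, always returning to the starting position,
    // even if the work on one of them throws.
    static int visitAll(Cursor c, int[] targets, IntConsumer work) {
        int mark = c.save();
        int visited = 0;
        try {
            for (int t : targets) {
                c.moveTo(t);
                work.accept(t); // may throw
                visited++;
            }
        } finally {
            c.restore(mark); // no stack to unwind, position is clean again
        }
        return visited;
    }

    public static void main(String[] args) {
        Cursor c = new Cursor();
        c.moveTo(5);

        // Stack style, one push/pop pair per excursion.
        c.push();
        c.moveTo(9);
        c.pop();

        // Snapshot style, one save and a guaranteed restore.
        int n = visitAll(c, new int[] {1, 2, 3}, t -> { });
        System.out.println("visited " + n + ", back at " + c.position());
    }
}
```

One design point this makes visible: with the snapshot held in a local variable, a failure in the middle cannot leave unmatched entries on the navigator's stack, which is the exception-handling benefit described in the quoted message.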