From: Hilmar L. <hl...@ne...> - 2012-01-24 22:44:19
And to add to my previous response, a useful byproduct of such an effort could be a shared AMI, and in fact if you load up the Postgres dump to S3, you could slice up the file dump generation to run in parallel on multiple EC2 nodes. This could also be a nice target for an Education & Research grant from AWS, the next round of which, I think, are due in the first or second week of February.

  -hilmar

Sent with a tap.

On Jan 24, 2012, at 11:07 AM, William Piel <wil...@ya...> wrote:

>
> On Jan 24, 2012, at 7:53 AM, Rutger Vos wrote:
>
>> Hi all,
>>
>> I've had a request from one of Enrico Pontelli's students for a complete dump in NeXML of TreeBASE. I would like to have one as well for my own purposes. Because we now have caching this may not be as big a problem as previously, though most studies will not yet ever have been serialized to NeXML since the start of caching so we still need to be careful. On the plus side: once we've done this we will have all of them in cache so all subsequent requests should be more snappy. Can we come up with a reasonable waiting time between requests so we don't kill the server? Is there a quiet time during which this can best be done? Do tb-stage or tb-dev also have caches?
>>
>> Rutger
>
> I think this is a good idea, given that it will build up a war-chest of cached data. (In fact, maybe we should first extend the expire date on the cache so that this lasts longer?) Perhaps it will also catch datasets that are problematic.
>
> Google Analytics shows that activity is lowest on the weekend -- no surprise there. But maybe it would be better to do it during the week so that it's easy to intervene if the application gets locked up. Also, it might make sense to throttle the download process intentionally (e.g. interspersing requests with the "sleep" function in perl, for example) so that the application has ample time for garbage collection, etc, and so not to impact the system too much.
> Finally, even if you're not capturing NEXUS, maybe it would help to also download NEXUS as well, as the NEXUS cache is also valuable to build up.
>
> bp
>
> _______________________________________________
> Treebase-devel mailing list
> Tre...@li...
> https://lists.sourceforge.net/lists/listinfo/treebase-devel
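
For illustration, the throttled download suggested in the quoted message might look roughly like the sketch below. The thread mentions Perl's "sleep"; this uses Python's time.sleep to the same effect. Everything named here is an assumption and not taken from TreeBASE: the URL pattern, the study ID strings, the format names "nexml" and "nexus", the 30-second default delay, and the function names are all hypothetical. It requests each study in both formats (so both caches get warmed), pauses between requests, and records failures so that problematic datasets can be reviewed afterwards.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical URL pattern; the real TreeBASE endpoint is not given in the thread.
URL_TEMPLATE = "https://example.org/treebase/study/{study_id}?format={fmt}"


def fetch_url(url: str) -> bytes:
    """Download one document; a long timeout allows for slow first-time serialization."""
    with urllib.request.urlopen(url, timeout=300) as response:
        return response.read()


def throttled_dump(
    study_ids: Iterable[str],
    save: Callable[[str, str, bytes], None],
    formats: Tuple[str, ...] = ("nexml", "nexus"),
    delay_seconds: float = 30.0,  # assumed value; tune to what the server tolerates
    fetch: Callable[[str], bytes] = fetch_url,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[Tuple[str, str], str]:
    """Fetch every study in every format, sleeping between consecutive requests.

    Returns a mapping of (study_id, format) to an error message for each
    request that failed, so problem datasets can be inspected later.
    """
    failures: Dict[Tuple[str, str], str] = {}
    first_request = True
    for study_id in study_ids:
        for fmt in formats:
            if not first_request:
                sleep(delay_seconds)  # give the server time to recover
            first_request = False
            url = URL_TEMPLATE.format(study_id=study_id, fmt=fmt)
            try:
                save(study_id, fmt, fetch(url))
            except Exception as exc:  # keep going; one bad study should not stop the run
                failures[(study_id, fmt)] = str(exc)
    return failures
```

In a real run, save would write each payload to disk instead of keeping it in memory, and the delay could be raised if the application starts to struggle.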