From: Carl B. <cbo...@gm...> - 2011-05-11 21:10:23
|
Hi Treebase,

This worked a moment ago:

http://treebase.org/treebase-web/top/oai?verb=GetRecord&metadataPrefix=oai_dc&identifier=TB:s1234

Now I get an error:

Uncaught Exception Encountered
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.cipres.treebase.web.controllers.OAIPMHController.handle(OAIPMHController.java:117)
    at org.springframework.web.servlet.mvc.AbstractCommandController.handleRequestInternal(AbstractCommandController.java:84)
    at [...]

Basic queries are still there:

http://treebase.org/treebase-web/top/oai?verb=Identify

but not any GetRecord queries. Just curious whether this is a problem at my end or just a temporary interruption.

Thanks,

Carl

--
Carl Boettiger
UC Davis
http://www.carlboettiger.info/ |
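The two requests in this message differ only in their query parameters. A minimal sketch of how such OAI-PMH request URLs can be assembled, assuming Python; the helper name `oai_url` is hypothetical, while the base URL, verbs and parameters are the ones quoted in the message. It only builds the URLs and makes no network call.

```python
from urllib.parse import urlencode

# Base URL taken from the message above.
OAI_BASE = "http://treebase.org/treebase-web/top/oai"


def oai_url(verb, **params):
    """Build an OAI-PMH request URL (hypothetical helper, no network access)."""
    query = {"verb": verb}
    query.update(params)
    # safe=":" keeps identifiers such as TB:s1234 unescaped, as in the message.
    return OAI_BASE + "?" + urlencode(query, safe=":")


identify = oai_url("Identify")
get_record = oai_url("GetRecord", metadataPrefix="oai_dc", identifier="TB:s1234")
```

Comparing the response to `identify` with the response to `get_record` is one way to tell a general outage from a failure limited to one verb.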
From: William P. <wil...@ya...> - 2011-05-11 17:09:44
|
On May 11, 2011, at 12:10 PM, Carl Boettiger wrote:

> Hi Dryad team, Treebase team,
>
> Thanks Hilmar, here's an example:
>
> For instance, this Dryad entry does not point to this treebase entry. Perhaps this is just because it is a recent entry?
> The treebase entry in this case has the Dryad id under a tag I didn't expect: <tb:title.study xmlns:tb="http://treebase.org/terms#">dryad_8661</tb:title.study>
>
> Meanwhile this Dryad entry does have the treebase identifier, but the treebase entry doesn't have the dryad identifier.
>
> -Carl

Hi Carl,

With respect to TreeBASE: <tb:title.study> is not the intended location for a dryad cross-reference -- it happens to be there because the title of the study is pre-populated with "dryad_8661", but the submitter can easily change this.

Currently, TreeBASE exposes a DOI like so:

<prism:doi xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/">10.1111/j.1469-8137.2011.03677.x</prism:doi>

... but that is the doi of the publication, so we wouldn't want to use that for Dryad. On the other hand, I'd rather not do a schema change. But we have a field called "URL", which we could pre-populate with "http://dx.doi.org/10.5061/dryad.8661". This field is usually used for publications that don't have a doi and instead have a particular url -- and it's rarely used. So one kluge (short of a schema change) is to pre-populate the url field with "http://dx.doi.org/10.5061/dryad.8661" and then expose that under the "prism:url" element.

What would be a good element to use other than "prism:url" or "prism:doi"?

bp |
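To make the proposal concrete, here is a minimal sketch of how a client might read both cross-references if the Dryad DOI were exposed under "prism:url". The `<record>` wrapper, the presence of a `prism:url` element and the function name `cross_references` are assumptions for illustration only; the two namespaces and the other element values are the ones quoted above.

```python
import xml.etree.ElementTree as ET

NS = {
    "prism": "http://prismstandard.org/namespaces/1.2/basic/",
    "tb": "http://treebase.org/terms#",
}

# Hypothetical record: the <record> wrapper and the prism:url element are
# assumptions; the tb:title.study and prism:doi values come from the thread.
SAMPLE = """
<record xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
        xmlns:tb="http://treebase.org/terms#">
  <tb:title.study>dryad_8661</tb:title.study>
  <prism:doi>10.1111/j.1469-8137.2011.03677.x</prism:doi>
  <prism:url>http://dx.doi.org/10.5061/dryad.8661</prism:url>
</record>
"""


def cross_references(xml_text):
    """Return the publication DOI and the (proposed) Dryad URL, if present."""
    root = ET.fromstring(xml_text)
    return {
        "publication_doi": root.findtext("prism:doi", namespaces=NS),
        "dryad_url": root.findtext("prism:url", namespaces=NS),
    }


refs = cross_references(SAMPLE)
```

A record without a `prism:url` element would simply yield `None` for `dryad_url`, so a client need not rely on the study title.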
From: Carl B. <cbo...@gm...> - 2011-05-11 16:10:20
|
Hi Dryad team, Treebase team,

Thanks Hilmar, here's an example:

For instance, this Dryad entry <http://datadryad.org/handle/10255/dryad.8661> does not point to this treebase entry <http://treebase.org/treebase-web/search/study/anyObjectAsRDF.rdf?namespacedGUID=TB2:S11266>. Perhaps this is just because it is a recent entry? The treebase entry in this case has the Dryad id under a tag I didn't expect:

<tb:title.study xmlns:tb="http://treebase.org/terms#">dryad_8661</tb:title.study>

Meanwhile this Dryad entry <http://datadryad.org/handle/10255/dryad.4939?show=full> does have the treebase identifier, but the treebase entry doesn't have the dryad identifier <http://treebase.org/treebase-web/search/study/anyObjectAsRDF.rdf?namespacedGUID=TB2:S84>.

-Carl

On Wed, May 11, 2011 at 7:04 AM, Hilmar Lapp <hl...@ne...> wrote:

> Hi Carl -
>
> first off, the perfect place to ask these and related questions is the
> Dryad Developers Google Group [1], which is public. I'm copying some of this
> thread there.
>
> The DataONE API implementation in R is as far as I understand not pretty
> and far from mature, but in principle functional. I'm copying Dave Vieglais,
> who should be able to point you to the appropriate places in the DataONE svn
> repository.
>
> Our handshaking mechanism with TreeBASE is in principle designed (or was in
> my understanding) such that TreeBASE studyIDs and Dryad DOIs are harvested
> and updated on the respective ends. (Completing a TreeBASE submission is
> asynchronous from, and can take a lot longer than, completing the
> "containing" Dryad submission.) If you have an example for where that
> doesn't seem to have worked, can you post that here?
>
> Cheers,
>
> -hilmar
>
> [1] http://groups.google.com/group/dryad-dev
>
> On May 10, 2011, at 1:30 AM, Carl Boettiger wrote:
>
> Hi Ryan,
>
> I am working on implementing the TreeBASE API in R, and am interested in
> extending this to the DataDryad and DataONE APIs. (In case you're curious,
> the project development is currently hosted on github <https://github.com/cboettig/treeBASE>.
> Pretty straightforward, so far just done the PhyloWS side, hoping to do the
> OAI-PMH side next).
>
> Todd suggests that there is already an implementation of the DataONE API in
> R, which is great to hear. Do you know who's involved in this, and what
> state it's in? I'd be happy to help test or whatnot. I've started looking
> at what's available for the investigator-level API <http://mule1.dataone.org/ArchitectureDocs-current/apis/index.html> for DataONE; is this the current documentation?
>
> William Piel also forwarded you a question from me about getting the
> treebase and dryad data better integrated -- the treebase metadata doesn't
> currently include the dryad dois, and it seems the dryad dois don't include
> the treebase study ids. Certainly this would be pretty valuable to have,
> though I suppose a work-around might be possible by matching other metadata
> in the search?
>
> Anyway, nice to meet you, and look forward to hearing a bit more about what
> directions you're heading with this. Is there an appropriate forum /
> mailing list to discuss the APIs for Dryad and DataONE?
>
> Thanks much,
>
> Carl
>
> --
> Carl Boettiger
> UC Davis
> http://www.carlboettiger.info/
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
> ===========================================================

--
Carl Boettiger
UC Davis
http://www.carlboettiger.info/ |
From: William P. <wil...@ya...> - 2011-05-10 21:07:50
|
Hi Mark,

Could you try our dev server? Harry recently committed a bug fix for this.

http://purl.org/phylo/treebase/dev/phylows/study/TB2:S2012

regards,

Bill

On Apr 21, 2011, at 11:09 PM, Mark Holder wrote:

> Hi,
> I just downloaded S2012.nex (I can send it along if needed, but I won't clutter your inboxes unless it is needed. The url is http://purl.org/phylo/treebase/phylows/study/TB2:S2012 ).
>
> My NEXUS parser is choking on a couple of things:
> 1. The names of the CHARSET commands in the SETS block have spaces, but the names have not been "escaped" to convert them to single NEXUS tokens.
>
> 2. There are multiple TAXA blocks but the TREES blocks do not use the LINK command to disambiguate. Mesquite is smart enough to figure out which TAXA block is correct. I suppose I could add taxa-block-detection code to NCL (the NEXUS parsing library that I use), but it would be much easier if TreeBase used LINK to clarify the connections.
>
> all the best,
> Mark |
From: William P. <wil...@ya...> - 2011-04-29 03:32:34
|
On Apr 28, 2011, at 10:39 PM, Hilmar Lapp wrote: > are there really so many different ways to display a paragraph of class readyStateError? Presently we're using two different ways to display readyStateError paragraphs: red+bold and black+plain. bp |
From: Hilmar L. <hl...@dr...> - 2011-04-29 02:45:02
|
On Apr 28, 2011, at 4:48 PM, sfr...@us... wrote:

> +<p class="readyStateError" style="color:red; font-weight:bold;
> display:none;">

Any reason why the style attributes aren't simply being made part of the CSS definition for the class readyStateError? I.e., are there really so many different ways to display a paragraph of class readyStateError?

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
=========================================================== |
From: William P. <wil...@ya...> - 2011-04-22 13:33:35
|
On Apr 21, 2011, at 11:09 PM, Mark Holder wrote:

> Hi,
> I just downloaded S2012.nex (I can send it along if needed, but I won't clutter your inboxes unless it is needed. The url is http://purl.org/phylo/treebase/phylows/study/TB2:S2012 ).
>
> My NEXUS parser is choking on a couple of things:
> 1. The names of the CHARSET commands in the SETS block have spaces, but the names have not been "escaped" to convert them to single NEXUS tokens.
>
> 2. There are multiple TAXA blocks but the TREES blocks do not use the LINK command to disambiguate. Mesquite is smart enough to figure out which TAXA block is correct. I suppose I could add taxa-block-detection code to NCL (the NEXUS parsing library that I use), but it would be much easier if TreeBase used LINK to clarify the connections.
>
> all the best,
> Mark

We need to fix those things. Thanks for reporting.

The LINK issue seems to work if analyses are downloaded individually, e.g.:

http://treebase.org/treebase-web/search/downloadAnAnalysisStep.html?analysisid=4172&id=2012
http://treebase.org/treebase-web/search/downloadAnAnalysisStep.html?analysisid=4173&id=2012

But indeed the CHARSETs are still not escaped properly. To be added to the bug list...

bp |
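For readers unfamiliar with the LINK issue, a minimal sketch of a TREES block that names its TAXA block explicitly. This follows the Mesquite-style TITLE/LINK convention as I understand it; whether NCL or the TreeBASE exporter expects exactly this layout is an assumption, and the function name `trees_block` and the block titles are hypothetical. Titles are assumed to be single NEXUS tokens already.

```python
def trees_block(block_title, taxa_title, newick_trees):
    """Render a NEXUS TREES block that points at one TAXA block via LINK.

    `newick_trees` is a list of (tree_name, newick_string) pairs; each
    Newick string is expected to end with its own semicolon.
    """
    lines = [
        "BEGIN TREES;",
        f"    TITLE {block_title};",
        # LINK tells the parser which TAXA block these trees refer to.
        f"    LINK TAXA = {taxa_title};",
    ]
    for name, newick in newick_trees:
        lines.append(f"    TREE {name} = {newick}")
    lines.append("END;")
    return "\n".join(lines)


block = trees_block("Trees1", "Taxa2", [("tree1", "((A,B),C);")])
```

With a LINK line in place, a parser does not have to guess which of several TAXA blocks matches the tree's tip labels.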
From: Mark H. <mth...@gm...> - 2011-04-22 03:09:49
|
Hi,

I just downloaded S2012.nex (I can send it along if needed, but I won't clutter your inboxes unless it is needed. The url is http://purl.org/phylo/treebase/phylows/study/TB2:S2012 ).

My NEXUS parser is choking on a couple of things:

1. The names of the CHARSET commands in the SETS block have spaces, but the names have not been "escaped" to convert them to single NEXUS tokens.

2. There are multiple TAXA blocks but the TREES blocks do not use the LINK command to disambiguate. Mesquite is smart enough to figure out which TAXA block is correct. I suppose I could add taxa-block-detection code to NCL (the NEXUS parsing library that I use), but it would be much easier if TreeBase used LINK to clarify the connections.

all the best,

Mark

PS: You can add my name to the list of people who would appreciate a link to a full dump of the DB whenever it is available. |
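A minimal sketch of the escaping described in point 1, assuming the usual NEXUS convention: a name containing whitespace or punctuation is wrapped in single quotes, with any embedded single quote doubled. The function names `nexus_token` and `charset_command` are hypothetical and are not taken from the TreeBASE code base.

```python
import re

# Whitespace and NEXUS punctuation end an unquoted token, so any name
# containing one of these characters has to be quoted.
_NEEDS_QUOTES = re.compile(r"[\s()\[\]{}/\\,;:=*'\"`+\-<>]")


def nexus_token(name):
    """Return `name` as a single NEXUS token, quoting it when required."""
    if name and not _NEEDS_QUOTES.search(name):
        return name
    # Embedded single quotes are escaped by doubling them.
    return "'" + name.replace("'", "''") + "'"


def charset_command(name, first, last):
    """Render a CHARSET command for a contiguous character range."""
    return f"CHARSET {nexus_token(name)} = {first}-{last};"


example = charset_command("codon pos 1", 1, 500)
```

An alternative is to replace spaces with underscores, which NEXUS readers generally treat as equivalent to spaces in unquoted tokens; quoting has the advantage of also covering punctuation.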
From: Hilmar L. <hl...@ne...> - 2011-04-19 14:20:47
|
On Apr 19, 2011, at 7:10 AM, Rutger Vos wrote:

> perhaps at some point we might move the code base to github? Bit of
> a pain because we do use some of the supporting infrastructure that
> sourceforge provides: the wiki, the mailing list(s), the bug tracker.

Note that moving to git need not mean moving to Github - Sf.net supports git (albeit as a current limitation a project can only have one git repository - which, though, wouldn't be different from having one svn repository as we do now).

Also, having a repository on Github does not preclude having one on Sf.net, too (git is distributed version control, and pushing to multiple remotes is easy). And finally, having the repo on Github does not preclude having the remaining infrastructure on Sf.net.

So, in summary, there are lots of possibilities so that the question of where other project resources are hosted need not dictate the choice of version control system.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
=========================================================== |
From: Rutger V. <R....@re...> - 2011-04-19 14:12:03
|
On Tue, Apr 19, 2011 at 3:06 PM, Jon Auman <jon...@ne...> wrote:

> Treebase is back up. It had nothing to do with Rutger's commits. Treebase seems to get hammered more in the wee hours of the morning...

Or, harvest o'clock in GMT.

> -------------------------------------------------------
> Jon Auman
> Systems Administrator
> National Evolutionary Synthesis Center
> Duke University
> http:www.nescent.org
> jon...@ne...
> ------------------------------------------------------

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading, RG6 6BX, United Kingdom
Tel: +44 (0) 118 378 7535
http://rutgervos.blogspot.com |
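The harvesting advice in this thread (fetch studies one at a time and simply try again on failure, rather than letting aborted batch downloads pile up long-running queries) can be sketched as follows. This is an illustration under stated assumptions, not the harvester anyone on the list actually used: `harvest` and its parameters are hypothetical, and `fetch` stands in for whatever HTTP call with a timeout the client makes.

```python
import time


def harvest(study_ids, fetch, retries=2, pause=0.0):
    """Fetch studies sequentially, retrying each failure a few times.

    `fetch` is any callable that takes a study ID and returns the document
    text, raising an exception (for example on timeout) when it fails.
    Returns (documents_by_id, ids_that_still_failed).
    """
    done, failed = {}, []
    for study_id in study_ids:
        for _attempt in range(retries + 1):
            try:
                done[study_id] = fetch(study_id)
                break
            except Exception:
                # Wait before retrying so aborted server-side queries
                # have a chance to finish.
                time.sleep(pause)
        else:
            # Every attempt failed; record the ID for a later run.
            failed.append(study_id)
    return done, failed
```

The list of IDs returned as `failed` corresponds to the kind of list Rod posted, and can be fed back into a later run during quieter hours.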
From: Rutger V. <rut...@gm...> - 2011-04-19 14:10:47
|
> On dev only, not production. I can dig through the mailing list archives,
> but if my recollection is correct, this was discussed here and agreed to by
> Bill.

Don't worry, I'll take your word for it.

>> That means I'm going to have to change my commit behavior, I tend to
>> commit files individually so that I can give more specific messages to
>> explain the changes in each of them.
>
> I've shared that concern. I still think we shouldn't change commit behavior;
> just be prepared that sometimes the Hudson builds will fail while you're in
> the midst of something.

Ah, ok, I guess I'll learn to live with the occasional complaint from Hudson. I don't commit code that, as a whole, is broken (...well...), but I do push the commits out in a series.

> In part this is also a result of our not better exploiting the capabilities
> of version control. It's arguably not the best idea to do development of
> features or changes that amount to more than "atomic" bug fixes on the main
> trunk.
>
> If we were on git, this would probably mostly solve itself, as branching and
> switching between branches is just so easy that there are few excuses not to
> do it. Merges back to master would then contain all the changes that as a
> whole would not break the build. But it's not that svn doesn't support
> branching and merging.

Yeah, it does support it, but just in a goofy way. The extra folders it creates aren't particularly pretty. I'm really starting to like git - perhaps at some point we might move the code base to github? Bit of a pain because we do use some of the supporting infrastructure that sourceforge provides: the wiki, the mailing list(s), the bug tracker. Mmmmm.

>> Do I have a login on hudson?
>
> I don't know - ask for one if you need one (do you?).

I don't know, what would it tell me beyond a stack trace when the dev won't build at a given point in time? I don't really need to see that, probably.

Rutger

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com |
From: Jon A. <jon...@ne...> - 2011-04-19 14:06:38
|
Treebase is back up. It had nothing to do with Rutger's commits. Treebase seems to get hammered more in the wee hours of the morning...

-Jon

On Apr 19, 2011, at 7:31 AM, Rutger Vos wrote:

> Mmmm, don't know what's up with that. Presumably that will be dealt
> with once North Carolina wakes up.
>
> Anyway, I committed some code that makes the produced NeXML somewhat
> more concise and quicker to generate. This doesn't yet fix the issue
> with the mixed data types in the same matrix, though. So the very,
> very large files are still very, very large. Maybe we can see this
> code on the dev server at some point?
>
> On Tue, Apr 19, 2011 at 12:21 PM, Roderic Page <r....@bi...> wrote:
>> Well, now TreeBASE has crashed...
>>
>> On 19 Apr 2011, at 11:45, Rutger Vos wrote:
>>
>>> If you've tried to harvest them in a batch, long-running queries from
>>> aborted downloads will accumulate so you get more failures later on.
>>> I've downloaded several of these, so one thing you can do is simply
>>> try again.
>>>
>>> On Tue, Apr 19, 2011 at 11:25 AM, Roderic Page <r....@bi...> wrote:
>>>> Below is the list of TreeBASE studies that have failed to output Nexml when I've tried to harvest them with a timeout of 10 minutes. Any way to get hold of these?
>>>>
>>>> Rod
>>>>
>>>> S131 S132 S134 S202 S613 S1085 S1158 S1183 S1197 S1302
>>>> S1303 S1306 S1307 S1308 S1309 S1310 S1311 S1312 S1313 S1314
>>>> S1315 S1316 S1317 S1318 S1319 S1320 S1321 S1322 S1326 S1330
>>>> S1936 S2039 S2078 S2372 S2373 S2376 S2377 S9993 S9997 S9998
>>>> S9999 S10071 S10287 S10316 S10335 S10433 S10507 S10508 S10511 S10541
>>>> S10603 S10613 S10635 S10665 S10689 S10736 S10888 S10917 S10940 S11032
>>>> S11080
>>>>
>>>> On 18 Apr 2011, at 13:47, Rutger Vos wrote:
>>>>
>>>>> To give an example of how things should be: I've also done a NeXML
>>>>> dump and split all harvested studies in their constituent trees,
>>>>> matrices and taxa blocks. The largest NeXML tree file (with taxa
>>>>> block) in TreeBASE is 365Kb for a 585 taxon tree. To me that
>>>>> seems a reasonable size. The bulk of a matrix file for that set of
>>>>> taxa should be <seq> elements with raw character state sequences,
>>>>> preceded by a taxa block and an nchar list of <char> elements. You can
>>>>> imagine that that's not going to be 13.7 Mb once things are working
>>>>> correctly.
>>>>>
>>>>> On Mon, Apr 18, 2011 at 1:40 PM, Rutger Vos <R....@re...> wrote:
>>>>>> Yeah, I know, some of the studies are serialized incorrectly,
>>>>>> especially the ones with "mixed" data containing both DNA and
>>>>>> categorical data in the same matrix, or unusual state definitions in
>>>>>> some other way. This results in a character state set definition being
>>>>>> written out for every matrix column, and that takes up most of the
>>>>>> file. Another thing is that we're now using owl:sameAs statements to
>>>>>> specify the TreeBASE ID for every character.
>>>>>>
>>>>>> There are a number of these issues, they're bugs, I'm recording them -
>>>>>> it's one of the things we should be fixing during Laurel's project. A
>>>>>> correctly formatted NeXML file is going to be bigger than the
>>>>>> equivalent NEXUS file, but perhaps like a factor of ten or so max,
>>>>>> depending on the amount of metadata (i.e. on the order of 1Mb for
>>>>>> S2012). That is a trade-off that is worth it because it will allow us
>>>>>> to export all the metadata in a single file. 13.7 Mb is obviously
>>>>>> wrong.
>>>>>>
>>>>>> On Mon, Apr 18, 2011 at 1:03 PM, Roderic Page <r....@bi...> wrote:
>>>>>>> I've started trying again to harvest individual Nexml files, and it's still unbelievably slow. We're talking minutes for a study in some cases. The XML for S2012 took about 5 minutes to fetch and is 13.7 Mb in size(!). The NEXUS file is 164Kb.
>>>>>>>
>>>>>>> Need I say more...?
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Rod
>>>>>>>
>>>>>>> On 15 Apr 2011, at 13:42, William Piel wrote:
>>>>>>>
>>>>>>>> On Apr 15, 2011, at 4:14 AM, Roderic Page wrote:
>>>>>>>>
>>>>>>>>> For large studies the Nexml generation simply times out, so I gave up.
>>>>>>>>
>>>>>>>> If you still have some ID numbers for those big ones, I'd be happy to test it again. It may have been solved because of some recent changes.
>>>>>>>>
>>>>>>>> But, indeed, I'd like access to a dump too.
>>>>>>>>
>>>>>>>> bp
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Benefiting from Server Virtualization: Beyond Initial Workload
>>>>>>>> Consolidation -- Increasing the use of server virtualization is a top
>>>>>>>> priority. Virtualization can reduce costs, simplify management, and improve
>>>>>>>> application availability and disaster protection. Learn more about boosting
>>>>>>>> the value of server virtualization. http://p.sf.net/sfu/vmware-sfdev2dev
>>>>>>>> _______________________________________________
>>>>>>>> Treebase-devel mailing list
>>>>>>>> Tre...@li...
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/treebase-devel
>>>>>>>
>>>>>>> ---------------------------------------------------------
>>>>>>> Roderic Page
>>>>>>> Professor of Taxonomy
>>>>>>> Institute of Biodiversity, Animal Health and Comparative Medicine
>>>>>>> College of Medical, Veterinary and Life Sciences
>>>>>>> Graham Kerr Building
>>>>>>> University of Glasgow
>>>>>>> Glasgow G12 8QQ, UK
>>>>>>>
>>>>>>> Email: r....@bi...
>>>>>>> Tel: +44 141 330 4778
>>>>>>> Fax: +44 141 330 2792
>>>>>>> AIM: rod...@ai...
>>>>>>> Facebook: http://www.facebook.com/profile.php?id=1112517192
>>>>>>> Twitter: http://twitter.com/rdmpage
>>>>>>> Blog: http://iphylo.blogspot.com
>>>>>>> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>>>>>>
>>>>>> --
>>>>>> Dr. Rutger A. Vos
>>>>>> School of Biological Sciences
>>>>>> Philip Lyle Building, Level 4
>>>>>> University of Reading
>>>>>> Reading, RG6 6BX, United Kingdom
>>>>>> Tel: +44 (0) 118 378 7535
>>>>>> http://rutgervos.blogspot.com
>>>>>
>>>>> --
>>>>> Dr. Rutger A. Vos
>>>>> School of Biological Sciences
>>>>> Philip Lyle Building, Level 4
>>>>> University of Reading
>>>>> Reading, RG6 6BX, United Kingdom
>>>>> Tel: +44 (0) 118 378 7535
>>>>> http://rutgervos.blogspot.com
>>>>
>>>> ---------------------------------------------------------
>>>> Roderic Page
>>>> Professor of Taxonomy
>>>> Institute of Biodiversity, Animal Health and Comparative Medicine
>>>> College of Medical, Veterinary and Life Sciences
>>>> Graham Kerr Building
>>>> University of Glasgow
>>>> Glasgow G12 8QQ, UK
>>>>
>>>> Email: r....@bi...
>>>> Tel: +44 141 330 4778
>>>> Fax: +44 141 330 2792
>>>> AIM: rod...@ai...
>>>> Facebook: http://www.facebook.com/profile.php?id=1112517192
>>>> Twitter: http://twitter.com/rdmpage
>>>> Blog: http://iphylo.blogspot.com
>>>> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>>>
>>> --
>>> Dr. Rutger A.
Vos >>> School of Biological Sciences >>> Philip Lyle Building, Level 4 >>> University of Reading >>> Reading, RG6 6BX, United Kingdom >>> Tel: +44 (0) 118 378 7535 >>> http://rutgervos.blogspot.com >>> >> >> --------------------------------------------------------- >> Roderic Page >> Professor of Taxonomy >> Institute of Biodiversity, Animal Health and Comparative Medicine >> College of Medical, Veterinary and Life Sciences >> Graham Kerr Building >> University of Glasgow >> Glasgow G12 8QQ, UK >> >> Email: r....@bi... >> Tel: +44 141 330 4778 >> Fax: +44 141 330 2792 >> AIM: rod...@ai... >> Facebook: http://www.facebook.com/profile.php?id=1112517192 >> Twitter: http://twitter.com/rdmpage >> Blog: http://iphylo.blogspot.com >> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html >> >> >> >> >> >> >> >> >> ------------------------------------------------------------------------------ >> Benefiting from Server Virtualization: Beyond Initial Workload >> Consolidation -- Increasing the use of server virtualization is a top >> priority.Virtualization can reduce costs, simplify management, and improve >> application availability and disaster protection. Learn more about boosting >> the value of server virtualization. http://p.sf.net/sfu/vmware-sfdev2dev >> _______________________________________________ >> Treebase-devel mailing list >> Tre...@li... >> https://lists.sourceforge.net/lists/listinfo/treebase-devel >> > > > > -- > Dr. Rutger A. 
Vos > School of Biological Sciences > Philip Lyle Building, Level 4 > University of Reading > Reading, RG6 6BX, United Kingdom > Tel: +44 (0) 118 378 7535 > http://rutgervos.blogspot.com > > ------------------------------------------------------------------------------ > Benefiting from Server Virtualization: Beyond Initial Workload > Consolidation -- Increasing the use of server virtualization is a top > priority.Virtualization can reduce costs, simplify management, and improve > application availability and disaster protection. Learn more about boosting > the value of server virtualization. http://p.sf.net/sfu/vmware-sfdev2dev > _______________________________________________ > Treebase-devel mailing list > Tre...@li... > https://lists.sourceforge.net/lists/listinfo/treebase-devel ------------------------------------------------------- Jon Auman Systems Administrator National Evolutionary Synthesis Center Duke University http:www.nescent.org jon...@ne... ------------------------------------------------------ |
From: Hilmar L. <hl...@ne...> - 2011-04-19 14:03:34
|
On Apr 19, 2011, at 6:34 AM, Rutger Vos wrote: > Not sure if I was supposed to know that - so any commit goes live on > the production server immediately? On dev only, not production. I can dig through the mailing list archives, but if my recollection is correct, this was discussed here and agreed to by Bill. > That means I'm going to have to change my commit behavior, I tend to > commit files individually so that I can give more specific messages > to explain the changes in each of them. I've shared that concern. I still think we shouldn't change commit behavior; just be prepared that sometimes the Hudson builds will fail while you're in the midst of something. In part this is also a result of our not better exploiting the capabilities of version control. It's arguably not the best idea to do development of features or changes that amount to more than "atomic" bug fixes on the main trunk. If we were on git, this would probably mostly solve itself, as branching and switching between branches is just so easy that there are few excuses not to do it. Merges back to master would then contain all the changes that as a whole would not break the build. But it's not that svn doesn't support branching and merging. > Do I have a login on hudson? I don't know - ask for one if you need one (do you?). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : =========================================================== |
From: Rutger V. <R....@re...> - 2011-04-19 14:01:12
|
> Only treebase-dev is built and deployed automatically by hudson. > Production is also deployed through hudson, but that is triggered > manually, as needed. > It's indeed a good idea to only commit code that is believed to be > buildable and runnable. > -V Well, I wouldn't walk away in the middle of committing a series of files - just didn't realize they had to be atomic. Now I know. -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading, RG6 6BX, United Kingdom Tel: +44 (0) 118 378 7535 http://rutgervos.blogspot.com |
From: Vladimir G. <vla...@du...> - 2011-04-19 13:56:48
|
On Apr 19, 2011, at 9:34 AM, Rutger Vos wrote: > Not sure if I was supposed to know that - so any commit goes live on > the production server immediately? That means I'm going to have to > change my commit behavior, I tend to commit files individually so that > I can give more specific messages to explain the changes in each of > them. Bulk commits will have to have more generic explanations, I > guess. Only treebase-dev is built and deployed automatically by hudson. Production is also deployed through hudson, but that is triggered manually, as needed. It's indeed a good idea to only commit code that is believed to be buildable and runnable. -V |
From: Rutger V. <rut...@gm...> - 2011-04-19 13:34:44
|
Not sure if I was supposed to know that - so any commit goes live on the production server immediately? That means I'm going to have to change my commit behavior, I tend to commit files individually so that I can give more specific messages to explain the changes in each of them. Bulk commits will have to have more generic explanations, I guess. Do I have a login on hudson? Rutger On Tue, Apr 19, 2011 at 2:30 PM, Hilmar Lapp <hl...@ne...> wrote: > Hudson will rebuild and redeploy TreeBASE whenever you commit. It's a > continuous integration tool: http://hudson-ci.org/ > > -hilmar > > On Apr 19, 2011, at 4:32 AM, Rutger Vos wrote: > >> I don't know who hudson is and since when we have him, but he seems to >> have done a continuous build that failed when I made my first commit. >> Now he says everything is back to normal - though I'm still getting a >> 503. >> >> >> ---------- Forwarded message ---------- >> From: IT Admin <it...@ne...> >> Date: Tue, Apr 19, 2011 at 12:01 PM >> Subject: Hudson build is back to normal : Treebase-dev #92 >> To: vga...@ne..., rut...@gm... >> >> >> See <http://hudson.nescent.org/job/Treebase-dev/92/changes> >> >> -- >> Dr. Rutger A. Vos >> School of Biological Sciences >> Philip Lyle Building, Level 4 >> University of Reading >> Reading >> RG6 6BX >> United Kingdom >> Tel: +44 (0) 118 378 7535 >> http://www.nexml.org >> http://rutgervos.blogspot.com >
> -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : > =========================================================== > -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading RG6 6BX United Kingdom Tel: +44 (0) 118 378 7535 http://www.nexml.org http://rutgervos.blogspot.com |
From: Hilmar L. <hl...@ne...> - 2011-04-19 13:30:32
|
Hudson will rebuild and redeploy TreeBASE whenever you commit. It's a continuous integration tool: http://hudson-ci.org/ -hilmar On Apr 19, 2011, at 4:32 AM, Rutger Vos wrote: > I don't know who hudson is and since when we have him, but he seems to > have done a continuous build that failed when I made my first commit. > Now he says everything is back to normal - though I'm still getting a > 503. > > > ---------- Forwarded message ---------- > From: IT Admin <it...@ne...> > Date: Tue, Apr 19, 2011 at 12:01 PM > Subject: Hudson build is back to normal : Treebase-dev #92 > To: vga...@ne..., rut...@gm... > > > See <http://hudson.nescent.org/job/Treebase-dev/92/changes> > > -- > Dr. Rutger A. Vos > School of Biological Sciences > Philip Lyle Building, Level 4 > University of Reading > Reading > RG6 6BX > United Kingdom > Tel: +44 (0) 118 378 7535 > http://www.nexml.org > http://rutgervos.blogspot.com -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : =========================================================== |
From: Rutger V. <rut...@gm...> - 2011-04-19 11:32:58
|
I don't know who hudson is and since when we have him, but he seems to have done a continuous build that failed when I made my first commit. Now he says everything is back to normal - though I'm still getting a 503. ---------- Forwarded message ---------- From: IT Admin <it...@ne...> Date: Tue, Apr 19, 2011 at 12:01 PM Subject: Hudson build is back to normal : Treebase-dev #92 To: vga...@ne..., rut...@gm... See <http://hudson.nescent.org/job/Treebase-dev/92/changes> -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading RG6 6BX United Kingdom Tel: +44 (0) 118 378 7535 http://www.nexml.org http://rutgervos.blogspot.com |
From: Rutger V. <R....@re...> - 2011-04-19 11:31:20
|
Mmmm, don't know what's up with that. Presumably that will be dealt with once North Carolina wakes up. Anyway, I committed some code that makes the produced NeXML somewhat more concise and quicker to generate. This doesn't yet fix the issue with the mixed data types in the same matrix, though. So the very, very large files are still very, very large. Maybe we can see this code on the dev server at some point? On Tue, Apr 19, 2011 at 12:21 PM, Roderic Page <r....@bi...> wrote: > Well, now TreeBASE has crashed... > > > On 19 Apr 2011, at 11:45, Rutger Vos wrote: > >> If you've tried to harvest them in a batch, long-running queries from >> aborted downloads will accumulate so you get more failures later on. >> I've downloaded several of these, so one thing you can do is simply >> try again. >> >> On Tue, Apr 19, 2011 at 11:25 AM, Roderic Page <r....@bi...> wrote: >>> Below is the list of TreeBASE studies that have failed to output Nexml when I've tried to harvest them with a timeout of 10 minutes. Any way to get hold of these? >>> >>> Rod >>> >>> S131 >>> S132 >>> S134 >>> S202 >>> S613 >>> S1085 >>> S1158 >>> S1183 >>> S1197 >>> S1302 >>> S1303 >>> S1306 >>> S1307 >>> S1308 >>> S1309 >>> S1310 >>> S1311 >>> S1312 >>> S1313 >>> S1314 >>> S1315 >>> S1316 >>> S1317 >>> S1318 >>> S1319 >>> S1320 >>> S1321 >>> S1322 >>> S1326 >>> S1330 >>> S1936 >>> S2039 >>> S2078 >>> S2372 >>> S2373 >>> S2376 >>> S2377 >>> S9993 >>> S9997 >>> S9998 >>> S9999 >>> S10071 >>> S10287 >>> S10316 >>> S10335 >>> S10433 >>> S10507 >>> S10508 >>> S10511 >>> S10541 >>> S10603 >>> S10613 >>> S10635 >>> S10665 >>> S10689 >>> S10736 >>> S10888 >>> S10917 >>> S10940 >>> S11032 >>> S11080 >>> >>> >>> On 18 Apr 2011, at 13:47, Rutger Vos wrote: >>> >>>> To give an example of how things should be: I've also done a NeXML >>>> dump and split all harvested studies in their constituent trees, >>>> matrices and taxa blocks. 
The largest NeXML tree file (with taxa >>>> block) in TreeBASE is 365Kb for a 585 taxon tree. To me that >>>> seems a reasonable size. The bulk of a matrix file for that set of >>>> taxa should be <seq> elements with raw character state sequences, >>>> preceded by a taxa block and an nchar list of <char> elements. You can >>>> imagine that that's not going to be 13.7 Mb once things are working >>>> correctly. >>>> >>>> On Mon, Apr 18, 2011 at 1:40 PM, Rutger Vos <R....@re...> wrote: >>>>> Yeah, I know, some of the studies are serialized incorrectly, >>>>> especially the ones with "mixed" data containing both DNA and >>>>> categorical data in the same matrix, or unusual state definitions in >>>>> some other way. This results in a character state set definition being >>>>> written out for every matrix column, and that takes up most of the >>>>> file. Another thing is that we're now using owl:sameAs statements to >>>>> specify the TreeBASE ID for every character. >>>>> >>>>> There are a number of these issues, they're bugs, I'm recording them - >>>>> it's one of the things we should be fixing during Laurel's project. A >>>>> correctly formatted NeXML file is going to be bigger than the >>>>> equivalent NEXUS file, but perhaps like a factor of ten or so max, >>>>> depending on the amount of metadata (i.e. on the order of 1Mb for >>>>> S2012). That is a trade-off that is worth it because it will allow us >>>>> to export all the metadata in a single file. 13.7 Mb is obviously >>>>> wrong. >>>>> >>>>> On Mon, Apr 18, 2011 at 1:03 PM, Roderic Page <r....@bi...> wrote: >>>>>> I've started trying again to harvest individual Nexml files, and it's still unbelievably slow. We're talking minutes for a study in some cases. The XML for S2012 took about 5 minutes to fetch and is 13.7 Mb in size(!). The NEXUS file is 164Kb. >>>>>> >>>>>> Need I say more...? 
>>>>>> >>>>>> Regards >>>>>> >>>>>> Rod >>>>>> >>>>>> On 15 Apr 2011, at 13:42, William Piel wrote: >>>>>> >>>>>>> >>>>>>> On Apr 15, 2011, at 4:14 AM, Roderic Page wrote: >>>>>>> >>>>>>>> For large studies the Nexml generation simply times out, so I gave up. >>>>>>> >>>>>>> If you still have some ID numbers for those big ones, I'd be happy to test it again. It may have been solved because of some recent changes. >>>>>>> >>>>>>> But, indeed, I'd like access to a dump too. >>>>>>> >>>>>>> bp >>>>>>> -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading, RG6 6BX, United Kingdom Tel: +44 (0) 118 378 7535 http://rutgervos.blogspot.com |
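Editorial note: the size problem described in this thread (a character state set definition written out for every matrix column) can be illustrated with a small sketch. This is not the TreeBASE serializer. The element names below are a simplified, NeXML-like layout, and the function names, ids and the four-symbol DNA alphabet are assumptions made up for the illustration. It only compares the serialized size of one shared state set against one state set per column.

```python
import xml.etree.ElementTree as ET

# Hypothetical alphabet for the illustration; real matrices may also carry
# ambiguity codes or categorical states.
DNA_SYMBOLS = "ACGT"


def build_format(n_chars, shared):
    """Build a simplified, NeXML-like <format> element.

    shared=True  -> one <states> block referenced by every <char>
    shared=False -> a separate <states> block per column (the bloated case)
    """
    fmt = ET.Element("format")
    n_sets = 1 if shared else n_chars
    for s in range(n_sets):
        states = ET.SubElement(fmt, "states", id=f"ss{s}")
        for i, symbol in enumerate(DNA_SYMBOLS):
            ET.SubElement(states, "state", id=f"ss{s}_s{i}", symbol=symbol)
    for c in range(n_chars):
        # Each column points at the state set that defines its symbols.
        ref = "ss0" if shared else f"ss{c}"
        ET.SubElement(fmt, "char", id=f"c{c}", states=ref)
    return fmt


def serialized_size(n_chars, shared):
    """Return the size in bytes of the serialized <format> element."""
    return len(ET.tostring(build_format(n_chars, shared)))


if __name__ == "__main__":
    n = 1000
    shared_size = serialized_size(n, shared=True)
    per_column_size = serialized_size(n, shared=False)
    print(f"shared state set:      {shared_size} bytes")
    print(f"state set per column:  {per_column_size} bytes")
    print(f"ratio:                 {per_column_size / shared_size:.1f}x")
```

In the per-column case the overhead grows with the number of columns times the number of states, which is consistent with the observation above that the definitions take up most of the file. The actual numbers for a real study will differ.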
From: Roderic P. <r....@bi...> - 2011-04-19 11:21:28
|
Well, now TreeBASE has crashed... On 19 Apr 2011, at 11:45, Rutger Vos wrote: > If you've tried to harvest them in a batch, long-running queries from > aborted downloads will accumulate so you get more failures later on. > I've downloaded several of these, so one thing you can do is simply > try again. > > On Tue, Apr 19, 2011 at 11:25 AM, Roderic Page <r....@bi...> wrote: >> Below is the list of TreeBASE studies that have failed to output Nexml when I've tried to harvest them with a timeout of 10 minutes. Any way to get hold of these? >> >> Rod >> >> S131 >> S132 >> S134 >> S202 >> S613 >> S1085 >> S1158 >> S1183 >> S1197 >> S1302 >> S1303 >> S1306 >> S1307 >> S1308 >> S1309 >> S1310 >> S1311 >> S1312 >> S1313 >> S1314 >> S1315 >> S1316 >> S1317 >> S1318 >> S1319 >> S1320 >> S1321 >> S1322 >> S1326 >> S1330 >> S1936 >> S2039 >> S2078 >> S2372 >> S2373 >> S2376 >> S2377 >> S9993 >> S9997 >> S9998 >> S9999 >> S10071 >> S10287 >> S10316 >> S10335 >> S10433 >> S10507 >> S10508 >> S10511 >> S10541 >> S10603 >> S10613 >> S10635 >> S10665 >> S10689 >> S10736 >> S10888 >> S10917 >> S10940 >> S11032 >> S11080 >> >> >> On 18 Apr 2011, at 13:47, Rutger Vos wrote: >> >>> To give an example of how things should be: I've also done a NeXML >>> dump and split all harvested studies in their constituent trees, >>> matrices and taxa blocks. The largest NeXML tree file (with taxa >>> block) in TreeBASE is 365Kb for a 585 taxon tree. To me that >>> seems a reasonable size. The bulk of a matrix file for that set of >>> taxa should be <seq> elements with raw character state sequences, >>> preceded by a taxa block and an nchar list of <char> elements. You can >>> imagine that that's not going to be 13.7 Mb once things are working >>> correctly. 
>>> >>> On Mon, Apr 18, 2011 at 1:40 PM, Rutger Vos <R....@re...> wrote: >>>> Yeah, I know, some of the studies are serialized incorrectly, >>>> especially the ones with "mixed" data containing both DNA and >>>> categorical data in the same matrix, or unusual state definitions in >>>> some other way. This results in a character state set definition being >>>> written out for every matrix column, and that takes up most of the >>>> file. Another thing is that we're now using owl:sameAs statements to >>>> specify the TreeBASE ID for every character. >>>> >>>> There are a number of these issues, they're bugs, I'm recording them - >>>> it's one of the things we should be fixing during Laurel's project. A >>>> correctly formatted NeXML file is going to be bigger than the >>>> equivalent NEXUS file, but perhaps like a factor of ten or so max, >>>> depending on the amount of metadata (i.e. on the order of 1Mb for >>>> S2012). That is a trade-off that is worth it because it will allow us >>>> to export all the metadata in a single file. 13.7 Mb is obviously >>>> wrong. >>>> >>>> On Mon, Apr 18, 2011 at 1:03 PM, Roderic Page <r....@bi...> wrote: >>>>> I've started trying again to harvest individual Nexml files, and it's still unbelievably slow. We're talking minutes for a study in some cases. The XML for S2012 took about 5 minutes to fetch and is 13.7 Mb in size(!). The NEXUS file is 164Kb. >>>>> >>>>> Need I say more...? >>>>> >>>>> Regards >>>>> >>>>> Rod >>>>> >>>>> On 15 Apr 2011, at 13:42, William Piel wrote: >>>>> >>>>>> >>>>>> On Apr 15, 2011, at 4:14 AM, Roderic Page wrote: >>>>>> >>>>>>> For large studies the Nexml generation simply times out, so I gave up. >>>>>> >>>>>> If you still have some ID numbers for those big ones, I'd be happy to test it again. It may have been solved because of some recent changes. >>>>>> >>>>>> But, indeed, I'd like access to a dump too. 
>>>>>> >>>>>> bp >>>>>> --------------------------------------------------------- Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK Email: r....@bi... Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rod...@ai... Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html |
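Editorial note: the advice quoted above (do not hammer the server in a batch, and simply try the failed studies again) can be sketched as a sequential harvester with retries. This is a minimal sketch under stated assumptions, not a tool used by anyone in this thread. The names harvest_sequentially and make_http_fetcher are hypothetical, the download URL is not given here and has to be supplied by the caller as a template, and whether pausing between attempts actually lets long-running server-side queries drain is an assumption.

```python
import time
import urllib.request


def make_http_fetcher(url_template, timeout=600.0):
    """Return a fetch(study_id) function for a caller-supplied URL template.

    url_template must contain "{study_id}". The template itself is an
    assumption: this thread does not state the NeXML download URL.
    The default timeout mirrors the 10 minutes mentioned above.
    """
    def fetch(study_id):
        url = url_template.format(study_id=study_id)
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    return fetch


def harvest_sequentially(study_ids, fetch, max_attempts=3,
                         pause_seconds=30.0, sleep=time.sleep):
    """Fetch studies one at a time, retrying failures after a growing pause.

    Only one request is in flight at any moment, so an aborted download is
    not immediately followed by a burst of new ones.
    Returns (results, failed): a dict of study_id -> bytes and a list of ids
    that still failed after max_attempts.
    """
    results = {}
    failed = []
    for study_id in study_ids:
        for attempt in range(1, max_attempts + 1):
            try:
                results[study_id] = fetch(study_id)
                break
            except OSError:
                # URLError and socket timeouts are OSError subclasses.
                if attempt == max_attempts:
                    failed.append(study_id)
                else:
                    sleep(pause_seconds * attempt)
    return results, failed
```

A caller would build a fetcher with make_http_fetcher(...) using whatever download URL applies, pass the list of study ids (for example the ones listed above), and re-run later with the returned failed list.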
From: Rutger V. <R....@re...> - 2011-04-19 10:45:07
|
If you've tried to harvest them in a batch, long-running queries from aborted downloads will accumulate so you get more failures later on. I've downloaded several of these, so one thing you can do is simply try again.

On Tue, Apr 19, 2011 at 11:25 AM, Roderic Page <r....@bi...> wrote:
> Below is the list of TreeBASE studies that have failed to output Nexml when I've tried to harvest them with a timeout of 10 minutes. Any way to get hold of these?

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading, RG6 6BX, United Kingdom
Tel: +44 (0) 118 378 7535
http://rutgervos.blogspot.com
|
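The advice in the message above (avoid batch requests that pile up aborted queries, and simply try again) can be sketched roughly as follows. This is a minimal illustration, not anyone's actual harvester: `harvest_sequentially`, `fetch`, `max_attempts` and `pause_seconds` are hypothetical names, and the pause between attempts is an assumption about what might give the server time to clear earlier queries.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def harvest_sequentially(
    study_ids: Iterable[str],
    fetch: Callable[[str], bytes],
    max_attempts: int = 3,
    pause_seconds: float = 0.0,
) -> Tuple[Dict[str, bytes], List[str]]:
    """Fetch studies one at a time, retrying each failure a few times.

    `fetch` is a caller-supplied function (hypothetical here) that downloads
    one study and raises an exception on timeout or error.
    """
    fetched: Dict[str, bytes] = {}
    failed: List[str] = []
    for study_id in study_ids:
        for attempt in range(1, max_attempts + 1):
            try:
                # Only one request is in flight at any time.
                fetched[study_id] = fetch(study_id)
                break
            except Exception:
                if attempt == max_attempts:
                    # Give up on this study and move on to the next one.
                    failed.append(study_id)
                else:
                    # Wait a little longer after each failed attempt.
                    time.sleep(pause_seconds * attempt)
    return fetched, failed
```

In practice `fetch` would wrap an HTTP request with a timeout against the TreeBASE NeXML download URL, which is not given in this thread. The returned `failed` list can be fed back into a later run.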
From: Roderic P. <r....@bi...> - 2011-04-19 10:26:00
|
Below is the list of TreeBASE studies that have failed to output Nexml when I've tried to harvest them with a timeout of 10 minutes. Any way to get hold of these?

Rod

S131 S132 S134 S202 S613 S1085 S1158 S1183 S1197 S1302
S1303 S1306 S1307 S1308 S1309 S1310 S1311 S1312 S1313 S1314
S1315 S1316 S1317 S1318 S1319 S1320 S1321 S1322 S1326 S1330
S1936 S2039 S2078 S2372 S2373 S2376 S2377 S9993 S9997 S9998
S9999 S10071 S10287 S10316 S10335 S10433 S10507 S10508 S10511 S10541
S10603 S10613 S10635 S10665 S10689 S10736 S10888 S10917 S10940 S11032
S11080
|
From: Rutger V. <R....@re...> - 2011-04-18 12:47:30
|
To give an example of how things should be: I've also done a NeXML dump and split all harvested studies into their constituent trees, matrices and taxa blocks. The largest NeXML tree file (with taxa block) in TreeBASE is 365Kb for a 585-taxon tree. To me that seems a reasonable size.

The bulk of a matrix file for that set of taxa should be <seq> elements with raw character state sequences, preceded by a taxa block and an nchar list of <char> elements. You can imagine that that's not going to be 13.7 Mb once things are working correctly.
|
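The layout described in the message above (a taxa block, then a list of nchar `<char>` elements, then `<seq>` rows holding raw state sequences) can be illustrated with a small sketch. Only `<seq>` and `<char>` are named in the message; the other element names (`otus`, `otu`, `characters`, `format`, `matrix`, `row`) and the function `build_compact_matrix` are assumptions made for illustration, and the output is deliberately simplified rather than schema-valid NeXML.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_compact_matrix(rows: Dict[str, str]) -> ET.Element:
    """Build a simplified, NeXML-like layout: taxa, <char> list, <seq> rows.

    Not schema-valid NeXML: namespaces, cross-references and state
    definitions are omitted on purpose.
    """
    if not rows:
        raise ValueError("at least one taxon is required")
    nchar = len(next(iter(rows.values())))
    if any(len(sequence) != nchar for sequence in rows.values()):
        raise ValueError("all sequences must have the same length")

    root = ET.Element("nexml")

    # Taxa block: one entry per taxon.
    otus = ET.SubElement(root, "otus")
    for taxon in rows:
        ET.SubElement(otus, "otu", label=taxon)

    # One <char> element per matrix column, and nothing more per column.
    characters = ET.SubElement(root, "characters")
    fmt = ET.SubElement(characters, "format")
    for index in range(nchar):
        ET.SubElement(fmt, "char", id=f"c{index + 1}")

    # The bulk of the file: one raw sequence string per taxon.
    matrix = ET.SubElement(characters, "matrix")
    for taxon, sequence in rows.items():
        row = ET.SubElement(matrix, "row", label=taxon)
        ET.SubElement(row, "seq").text = sequence
    return root
```

With this shape the file grows with ntax × nchar characters of sequence text plus one short element per column, which is why a matrix file for a few hundred taxa would not be expected to reach 13.7 Mb.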
From: Rutger V. <R....@re...> - 2011-04-18 12:40:28
|
Yeah, I know, some of the studies are serialized incorrectly, especially the ones with "mixed" data containing both DNA and categorical data in the same matrix, or with state definitions that are unusual in some other way. This results in a character state set definition being written out for every matrix column, and that takes up most of the file. Another thing is that we're now using owl:sameAs statements to specify the TreeBASE ID for every character.

There are a number of these issues. They're bugs, and I'm recording them; it's one of the things we should be fixing during Laurel's project. A correctly formatted NeXML file is going to be bigger than the equivalent NEXUS file, but perhaps by a factor of ten or so at most, depending on the amount of metadata (i.e. on the order of 1Mb for S2012). That is a trade-off that is worth it because it will allow us to export all the metadata in a single file. 13.7 Mb is obviously wrong.
|
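The bug described in the message above, one state set definition written per matrix column, suggests the obvious remedy of writing each distinct state set once and letting columns refer to it. The sketch below shows that idea only; it is not the TreeBASE serializer's code, and `share_state_sets` and the `statesN` identifiers are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple


def share_state_sets(
    column_states: Sequence[FrozenSet[str]],
) -> Tuple[Dict[FrozenSet[str], str], List[str]]:
    """Assign one shared definition id per distinct state set.

    Returns the definitions (state set -> id) and, for each column in
    order, the id of the definition it should reference.
    """
    definitions: Dict[FrozenSet[str], str] = {}
    column_refs: List[str] = []
    for states in column_states:
        if states not in definitions:
            # First time this state set is seen: define it once.
            definitions[states] = f"states{len(definitions) + 1}"
        # Every column only stores a short reference to a definition.
        column_refs.append(definitions[states])
    return definitions, column_refs
```

For a mixed matrix with, say, 1000 DNA columns and 3 categorical columns that share one coding, this yields two definitions instead of 1003, so the per-column cost drops to a single reference.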
From: Roderic P. <r....@bi...> - 2011-04-18 12:04:10
|
I've started trying again to harvest individual Nexml files, and it's still unbelievably slow. We're talking minutes for a study in some cases. The XML for S2012 took about 5 minutes to fetch and is 13.7 Mb in size(!). The NEXUS file is 164Kb.

Need I say more...?

Regards

Rod

On 15 Apr 2011, at 13:42, William Piel wrote:
> On Apr 15, 2011, at 4:14 AM, Roderic Page wrote:
>
>> For large studies the Nexml generation simply times out, so I gave up.
>
> If you still have some ID numbers for those big ones, I'd be happy to test it again. It may have been solved because of some recent changes.
>
> But, indeed, I'd like access to a dump too.
>
> bp
|