Re: [Treebase-devel] viz problem with trees for S11267

Hi Rick,

if you're asking whether we currently write out accession numbers for
sequences, then no, that's not possible at the moment. You can do more with
NCBI taxonomy identifiers, though, under the current API.

Rutger

On Thu, Apr 14, 2011 at 2:39 PM, Richard Ree <rr...@fi...> wrote:
> Hi folks,
> Just joined the list, as I am interested in using TB to develop
> collaborative methods for synthesizing plant phylogeny, e.g., by grafting
> clades together and other kinds of agglomeration.  So, I am particularly
> interested in the API and harvesting, especially with respect to taxonomic
> names and classifications, and GenBank identifiers of sequences.
> Speaking of which, how would I go about harvesting the GenBank numbers for a
> given study and associating them with their alignments?  Is that possible
> with the current API?
> I like Rutger's plug-in ideas.  In general, it's easier for someone like me
> to provide a simple web service, rather than contribute directly to TB
> development.
> -Rick
>
> On Thu, Apr 14, 2011 at 7:24 AM, Rutger Vos <R....@re...> wrote:
>>
>> I've been thinking about a redesign lately, and here's what I would do:
>>
>> - make sure we can export all the metadata from TreeBASE in (nexml)
>> files, i.e. Laurel's GSoC project
>>
>> - get all the files for all the studies and create a simple folder
>> structure with those files, more or less along the lines of the
>> phylows urls (e.g. phylows/tree/TB2/nexml/T1312), in all file formats
>> we can think of (json, nexus, phylip, newick, phyloxml, fasta, static
>> html, etc...)
>>
>> - create some mod_rewrite rules to map our urls onto the folder
>> structure (so going from phylows/tree/TB2:T1312?format=nexml to the
>> actual file). Performance will obviously be much, much better.
>>
>> - also allow harvesting of those files using rsync and ftp. Now we
>> have data dumps.
>>
>> - create a simple plug-in architecture where, given a query string, a
>> remote web service returns a list of hits which it somehow generates
>> from its local, harvested data dump
>>
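A rough sketch of the URL-to-file mapping described above, written in Python rather than as actual mod_rewrite rules. The function name, the regular expression, and the default format are assumptions for illustration only; the one thing taken from the thread is that phylows/tree/TB2:T1312?format=nexml should end up at phylows/tree/TB2/nexml/T1312.

```python
import re
from urllib.parse import parse_qs

# Assumed URL shape: phylows/<kind>/<namespace>:<id>, e.g. phylows/tree/TB2:T1312
_PHYLOWS = re.compile(r"^/?phylows/(?P<kind>\w+)/(?P<ns>\w+):(?P<ident>\w+)$")


def phylows_to_path(url, default_format="nexml"):
    """Map a phylows URL onto a relative path in the static folder layout."""
    path, _, query = url.partition("?")
    match = _PHYLOWS.match(path)
    if match is None:
        raise ValueError(f"not a phylows URL: {url!r}")
    # Use the ?format=... parameter if present, else the (assumed) default.
    fmt = parse_qs(query).get("format", [default_format])[0]
    if not re.fullmatch(r"\w+", fmt):
        # Refuse anything that could escape the folder structure.
        raise ValueError(f"unsupported format: {fmt!r}")
    return "phylows/{kind}/{ns}/{fmt}/{ident}".format(fmt=fmt, **match.groupdict())


if __name__ == "__main__":
    print(phylows_to_path("phylows/tree/TB2:T1312?format=nexml"))
```

In a real deployment the same mapping would presumably live in the web server configuration, so that no application code runs per request.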
>> Here's a use case: Laurel wants to implement BLASTing into TreeBASE.
>> So she periodically harvests everything in phylows/matrix/TB2/fasta
>> with an rsync cron job. She runs formatdb on the fasta sequences,
>> creating a standalone BLAST. She then writes a simple cgi script that
>> accepts a target string (e.g.
>> myblast.cgi?tb2.blast=acgctcgcatcgcatcgactacgac) and returns a list of
>> phylows matrix urls from the matching results.
>>
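To make the shape of such a plug-in concrete, here is a minimal sketch of the request handling only. It is not a BLAST wrapper: the search step is a naive substring match standing in for a call to a standalone BLAST database, the sequences and matrix IDs (M100, M200) are made up, and the phylows/matrix/TB2:<id> URL form is an assumption modelled on the tree URLs mentioned earlier.

```python
from urllib.parse import parse_qs

# Stand-in for a harvested FASTA dump: matrix ID -> sequence (made-up data).
HARVESTED = {
    "M100": "ttacgctcgcatcgcatcgactacgacgg",
    "M200": "ggggccccaaaatttt",
}


def search(target, sequences):
    """Naive exact-substring search; a real service would query BLAST here."""
    target = target.lower()
    return sorted(mid for mid, seq in sequences.items() if target in seq.lower())


def handle_query(query_string, sequences=HARVESTED, param="tb2.blast"):
    """Return a list of (assumed) phylows matrix URLs for matching sequences."""
    values = parse_qs(query_string).get(param, [])
    if not values:
        return []
    return [f"phylows/matrix/TB2:{mid}" for mid in search(values[0], sequences)]


if __name__ == "__main__":
    for url in handle_query("tb2.blast=acgctcgcatcgcatcgactacgac"):
        print(url)
```

The point of the sketch is that the service only needs to speak "query string in, list of phylows URLs out"; how it finds the hits is entirely its own business.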
>> On the TreeBASE side, we simply add a search widget that delegates the
>> query to Laurel's service and integrates it into our interfaces
>> (graphical, web services).
>>
>> With an architecture like that it would be so much easier for anyone
>> to add functionality in whatever programming language they like
>> without having to deal with a massive database schema. Of course any
>> of those remote services might have its own little database, but it'd
>> be more along the lines of a three-table SQLite database to store the
>> ITIS taxonomy structure such that we can find all TreeBASE taxa
>> subtended by a given ITIS higher taxon ID (for example).
>>
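As an illustration of how small such a plug-in database could be, the following sketch uses three SQLite tables and a recursive query to list the TreeBASE taxa under a higher taxon. The table and column names, the taxon IDs, and the labels are all invented for the example; they are not the real ITIS or TreeBASE schemas.

```python
import sqlite3

# Hypothetical three-table layout: taxonomy, TreeBASE taxa, and a link table.
SCHEMA = """
CREATE TABLE itis_taxon (tsn INTEGER PRIMARY KEY, parent_tsn INTEGER, name TEXT);
CREATE TABLE treebase_taxon (tb_id TEXT PRIMARY KEY, label TEXT);
CREATE TABLE taxon_link (tb_id TEXT, tsn INTEGER);
"""

# Walk down the parent/child hierarchy from the given taxon ID, then join
# the resulting clade against the TreeBASE taxa linked to those IDs.
SUBTENDED = """
WITH RECURSIVE clade(tsn) AS (
    SELECT ?
    UNION
    SELECT t.tsn FROM itis_taxon AS t JOIN clade AS c ON t.parent_tsn = c.tsn
)
SELECT tb.tb_id
FROM clade
JOIN taxon_link AS l ON l.tsn = clade.tsn
JOIN treebase_taxon AS tb ON tb.tb_id = l.tb_id
ORDER BY tb.tb_id
"""


def build_demo():
    """Create an in-memory database filled with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO itis_taxon VALUES (?, ?, ?)", [
        (1, None, "DemoKingdom"), (2, 1, "GenusA"), (3, 2, "GenusA speciesX"),
        (4, 1, "GenusB"), (5, 4, "GenusB speciesY"),
    ])
    conn.executemany("INSERT INTO treebase_taxon VALUES (?, ?)", [
        ("Tl1", "GenusA speciesX"), ("Tl2", "GenusB speciesY"),
    ])
    conn.executemany("INSERT INTO taxon_link VALUES (?, ?)", [("Tl1", 3), ("Tl2", 5)])
    return conn


def taxa_under(conn, tsn):
    """Return IDs of TreeBASE taxa subtended by the given higher-taxon ID."""
    return [row[0] for row in conn.execute(SUBTENDED, (tsn,))]


if __name__ == "__main__":
    demo = build_demo()
    print(taxa_under(demo, 2))
    print(taxa_under(demo, 1))
```

The recursive common table expression needs SQLite 3.8.3 or later, which any reasonably recent Python build ships with.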
>> We'd have to implement some core plug-ins ourselves, notably one that
>> extracts the metadata (using RDFA2RDFXML) and sticks that in a triple
>> store so we can search on, say, author names, journals, etc. I think
>> that's scalable because it's only a few dozen triples for each study.
>>
>> The idea is a little bit inspired by DAS, which seems to work quite
>> well: http://www.biodas.org/wiki/Main_Page
>>
>> (I'm leaving out the submission part as an exercise for the reader.
>> Presumably there would have to be a restricted area where properly
>> formatted files are uploaded and made available to reviewers.)
>>
>> Rutger
>>
>> p.s. I've done some harvesting a few weeks back, but unless that's
>> created queries that are still running I'm innocent.
>>
>> On Thu, Apr 14, 2011 at 12:28 PM, Roderic Page <r....@bi...>
>> wrote:
>> > TreeBASE performance has nothing to do with me, folks. I pretty much
>> > gave up trying to download data from it a few weeks back. Someone really,
>> > really needs to rethink the way TreeBASE works, because it's virtually
>> > unusable.
>> >
>> > Regards
>> >
>> > Rod
>> >
>> > On 14 Apr 2011, at 04:20, William Piel wrote:
>> >
>> >>
>> >> On Apr 13, 2011, at 9:36 PM, Hilmar Lapp wrote:
>> >>
>> >>> The trees in S11267 seem to silently fail to render in PhyloWidget.
>> >>> All I get is a single dot. At least try the first three:
>> >>>
>> >>> http://www.treebase.org/treebase-web/search/study/trees.html?id=11267
>> >>>
>> >>> Is this a temporary glitch, a problem with the reconstructed file, or
>> >>> something else? Should I file this as a bug in the bug tracker?
>> >>>
>> >>>      -hilmar
>> >>>
>> >>
>> >> Thanks for the alert. This is actually a bug in PhyloWidget -- if the
>> >> word "tree" appears in the title of the tree block, PhyloWidget confuses
>> >> this word with the TREE command, and so goofs up the parsing.
>> >>
>> >> I have removed the word "tree" from the tree block, so it works now.
>> >>
>> >> However, you might need to refresh the PhyloWidget window after loading
>> >> to get the tree to pull through -- TreeBASE feels quite slow lately; our API
>> >> must be getting hit by someone (Rod?).
>> >>
>> >> bp
>> >>
>> >>
>> >>
>> >>
>> >> ------------------------------------------------------------------------------
>> >> Benefiting from Server Virtualization: Beyond Initial Workload
>> >> Consolidation -- Increasing the use of server virtualization is a top
>> >> priority.Virtualization can reduce costs, simplify management, and
>> >> improve
>> >> application availability and disaster protection. Learn more about
>> >> boosting
>> >> the value of server virtualization.
>> >> http://p.sf.net/sfu/vmware-sfdev2dev
>> >> _______________________________________________
>> >> Treebase-devel mailing list
>> >> Tre...@li...
>> >> https://lists.sourceforge.net/lists/listinfo/treebase-devel
>> >>
>> >
>> > ---------------------------------------------------------
>> > Roderic Page
>> > Professor of Taxonomy
>> > Institute of Biodiversity, Animal Health and Comparative Medicine
>> > College of Medical, Veterinary and Life Sciences
>> > Graham Kerr Building
>> > University of Glasgow
>> > Glasgow G12 8QQ, UK
>> >
>> > Email: r....@bi...
>> > Tel: +44 141 330 4778
>> > Fax: +44 141 330 2792
>> > AIM: rod...@ai...
>> > Facebook: http://www.facebook.com/profile.php?id=1112517192
>> > Twitter: http://twitter.com/rdmpage
>> > Blog: http://iphylo.blogspot.com
>> > Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>> >
>> >
>>
>>
>>
>> --
>> Dr. Rutger A. Vos
>> School of Biological Sciences
>> Philip Lyle Building, Level 4
>> University of Reading
>> Reading
>> RG6 6BX
>> United Kingdom
>> Tel: +44 (0) 118 378 7535
>> http://www.nexml.org
>> http://rutgervos.blogspot.com
>>
>>
>
>

-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com