From: Steve F. <sfi...@pc...> - 2004-12-09 19:25:01
folks-

We expect that LoadAnnotatedSeqs will use BioPerl's Bio::SeqFeature::Tools::Unflattener to create the feature hierarchy from GenBank/EMBL records. (The TIGR XML already has that info.)

The unflattener (written by Chris Mungall) has a "smart" algorithm that infers containment relationships which are only implicit in the sloppy GenBank syntax.

Here is the documentation for the unflattener:

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/SeqFeature/Tools/Unflattener.pm?rev=1.25&cvsroot=bioperl&content-type=text/vnd.viewcvs-markup

If you are interested, you might want to check out the algorithm.

By default, the unflattener only handles: gene, RNA, CDS, exon, intron and pseudogenes.

We'll probably start out either using that default or modifying it to include, say, polyA_signals, etc. We would leave to future iterations of the plugin the ability to configure which containment relationships to use.

steve
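
To make the idea of inferring containment from a flat feature list concrete, here is a minimal sketch. It is not the Unflattener's algorithm and it is written in Python rather than Perl, purely for illustration. The `Feature` class, the `LEVEL` table, the `nest` function and all coordinates are hypothetical; the sketch only assumes that a feature belongs to the smallest feature one level up that fully encloses it.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    kind: str                      # e.g. "gene", "mRNA", "CDS", "exon"
    start: int
    end: int
    children: list = field(default_factory=list)


# Hypothetical nesting order (an assumption for this sketch, not BioPerl's):
# gene contains mRNA, mRNA contains CDS and exon.
LEVEL = {"gene": 0, "mRNA": 1, "CDS": 2, "exon": 2}


def contains(outer, inner):
    """True if inner lies entirely within outer's coordinates."""
    return outer.start <= inner.start and inner.end <= outer.end


def nest(features):
    """Attach each feature to the smallest enclosing feature one level up.

    Returns the features that found no parent (the top of the hierarchy).
    """
    roots = []
    for feat in features:
        wanted = LEVEL[feat.kind] - 1
        parents = [p for p in features
                   if LEVEL[p.kind] == wanted and contains(p, feat)]
        if parents:
            parent = min(parents, key=lambda p: p.end - p.start)
            parent.children.append(feat)
        else:
            roots.append(feat)
    return roots


# A flat list, in the order it might appear in a record (made-up coordinates).
flat = [
    Feature("gene", 100, 900),
    Feature("mRNA", 100, 900),
    Feature("CDS", 200, 800),
    Feature("exon", 100, 400),
    Feature("exon", 600, 900),
    Feature("gene", 1000, 1500),
]

roots = nest(flat)
for gene in roots:
    print(gene.kind, gene.start, gene.end, [c.kind for c in gene.children])
```

A real record is messier than this (overlapping genes, missing mRNA features, features on opposite strands), which is why a purpose-built tool is preferable to a coordinate check like the one above.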