[Gusdev-gusdev] building gene models in LoadAnnotatedSeqs

folks-

We expect that LoadAnnotatedSeqs will use BioPerl's 
Bio::SeqFeature::Tools::Unflattener to create the feature hierarchy from 
GenBank/EMBL records.  (The TIGR XML already has that info.)

the unflattener (written by Chris Mungall) has a "smart" algorithm to 
infer the containment relationships that are only implicit in the 
sloppy GenBank syntax.

here is the documentation for the unflattener:  
http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/SeqFeature/Tools/Unflattener.pm?rev=1.25&cvsroot=bioperl&content-type=text/vnd.viewcvs-markup

if you are interested, you might want to check out the algorithm.

by default, the unflattener only handles:
  Gene, RNA, CDS, exon, intron and pseudogenes

probably we'll start out either using that default or modifying it to 
include, say, polyA_signals.

we would leave to future iterations of the plugin the ability to 
configure which containment relationships it uses.
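
to make "configurable containment" a bit more concrete, here is a toy 
sketch in Python. it is NOT the unflattener's algorithm or its API: the 
Feature class, the DEFAULT_CONTAINMENT table and build_hierarchy are all 
made up for this illustration, and it assumes a child simply belongs to 
the smallest feature of the allowed parent type that spans it.

```python
from dataclasses import dataclass, field

# Hypothetical containment table: child feature type -> allowed parent type.
# Loosely modelled on a gene > mRNA > exon/CDS/intron default; this is an
# assumption for illustration, not the Unflattener's real configuration.
DEFAULT_CONTAINMENT = {
    "mRNA": "gene",
    "CDS": "mRNA",
    "exon": "mRNA",
    "intron": "mRNA",
}


@dataclass
class Feature:
    type: str
    start: int
    end: int
    children: list = field(default_factory=list)


def contains(outer, inner):
    """True if inner lies entirely within outer's coordinates."""
    return outer.start <= inner.start and inner.end <= outer.end


def build_hierarchy(features, containment=DEFAULT_CONTAINMENT):
    """Nest a flat feature list; return the features left at top level."""
    roots = []
    for feat in features:
        parent_type = containment.get(feat.type)
        parent = None
        if parent_type is not None:
            # Candidate parents: right type, and spanning this feature.
            candidates = [
                f for f in features
                if f.type == parent_type and contains(f, feat)
            ]
            if candidates:
                # Prefer the tightest enclosing parent.
                parent = min(candidates, key=lambda f: f.end - f.start)
        if parent is None:
            roots.append(feat)
        else:
            parent.children.append(feat)
    return roots


# A flat list, as it might come out of a GenBank-style feature table.
flat = [
    Feature("gene", 100, 900),
    Feature("mRNA", 100, 900),
    Feature("exon", 100, 300),
    Feature("exon", 500, 900),
    Feature("polyA_signal", 880, 885),
]

# Extending the default table so polyA_signal nests under the mRNA.
extended = dict(DEFAULT_CONTAINMENT, polyA_signal="mRNA")
roots = build_hierarchy(flat, extended)
```

with the default table the polyA_signal would stay at top level; with 
the extended table it ends up under the mRNA next to the exons. the 
point is only that the containment rules can live in data that a plugin 
option could override.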

steve