From: Scott C. <ca...@cs...> - 2005-08-01 19:45:49
On Fri, 2005-07-29 at 17:20 -0700, Hilmar Lapp wrote:
> On Jul 29, 2005, at 8:17 AM, Scott Cain wrote:
>
> > The main section of affected code in gmod is the GFF bulk loader, but
> > after we make the changes to the bioperl API, it shouldn't be too hard
> > to fix the loader. In fact, some of those changes may have already
> > started. I remember a few weeks before I released the gmod/chado
> > package, Hilmar sent out an announcement that he made some changes.
>
> You mean around the time of ISMB? I fixed the ontology modules ... they
> should actually work better now not worse unless you assumed the
> presence of some bugs ;)

I guess I must have been assuming bugs :-) I didn't look at the diffs, or
in much detail at what the exact problem was. Since this is the last
release that will be using Bio::Ontology, and it is an alpha release, I
was not too concerned.

> > While I should have paid attention then, I was busy getting my release
> > together, and everything seemed to work, so I ignored it.
> > Unfortunately, the reason things continued to work was that I forgot to
> > update my bioperl-live, and as a result, the gmod release doesn't work
> > with bioperl-live.
>
> Scott, what would really help sometimes is if in such a situation you
> run the bioperl test suite and report the result if there are any
> failures, especially those that appear potentially connected to your
> problem. Last time the gmod ontology loader ceased to work the problem
> would have been readily exposed by the ontology tests in bioperl. It
> just helps in zooming in on the problem.

I run make test frequently; what I do less often is pay close attention to
the result. When working with bioperl-live, one gets a little numb to test
failures :-/

> I'd be eager to help make bioperl work with gmod and vice versa and I'm
> sure many others are too, but it'll be difficult if we don't work
> towards this collaboratively.
> For this I really liked the spirit of
> Chris' proposal - that's the way to make this work.
>
> > [...]
> > The other section of code that could have been affected but won't be is
> > the ontology loader. The current ontology loader depends on
> > Bio::Ontology, but I was already planning on migrating to go-perl for
> > loading ontologies anyway, so that won't be a problem.
>
> I'm closing in on the last bugs in the go-perl integration. It remains
> to be seen how fast the result is as Chris made me aware in Detroit,
> but if it works this will give you both worlds at your choosing.
>
> -hilmar
>
> > So, who wants to take the lead on this?
> >
> > Thanks,
> > Scott
> >
> >
> > On Thu, 2005-07-28 at 12:42 -0700, Chris Mungall wrote:
> >> I think the answer may be even more complicated than this.
> >>
> >> Lurkers and contributors to the bioperl mailing list may have noticed
> >> that there have been some major obstacles in progressing lately,
> >> particularly in getting a stable release of the code out. bp1.4 is
> >> fairly old, 1.5 is a developers release, though this is the one
> >> required by GMOD.
> >>
> >> My understanding is that this bottleneck can be traced back to
> >> changes in the SeqFeature and Annotation model. These changes appear
> >> to be required by Bio::SeqFeature::Annotated which is produced by
> >> Bio::FeatureIO::gff (which in turn is used by the GMOD bulk loader,
> >> which is the main reason GMOD requires 1.5, I believe?).
> >> Unfortunately, these changes also break existing code and have a
> >> severe negative impact on memory usage.
> >>
> >> Before advising Cyril and others to switch to BFIO::gff I think it's
> >> important to make sure there is a clear path forward with bioperl. My
> >> impression is that there is something of a stalemate here. The bioperl
> >> developers would like to retract the aforementioned changes, but they
> >> believe they cannot do this without breaking GMOD code. They are also
> >> extremely uncomfortable about leaving these changes in. Everyone
> >> gives up and starts coding around bioperl.
> >>
> >> Here is why the changes were introduced:
> >>
> >> BioPerl has a 'scruffy' typing model, whereby feature types
> >> (primary_tag in bioperl) and featureprop types (tags in bioperl) are
> >> labels or strings. In contrast, Chado forces all types to be some
> >> class or relation in an ontology.
> >>
> >> Now obviously I'm rather partial to the Chado model, but that doesn't
> >> mean I think it should be forced upon bioperl. I often use bioperl in
> >> scruffy mode (on scruffy data); or in some combination whereby I map
> >> the scruffy types to ontologies in some non-bioperl code. When using
> >> bioperl as a middleware component over a nicely organised database,
> >> ontology-typed mode is definitely best. However, the majority of
> >> bioperl users (including myself) spend a large proportion of their
> >> time working with scruffy data, in which case lightweight scruffy
> >> types are more appropriate.
> >>
> >> It seems that there is a perfectly simple way of reconciling both
> >> approaches. We revert bioperl back to the simpler scruffy model. The
> >> majority of users and developers breathe a sigh of relief. We then
> >> extend SeqFeatureI with something like SeqFeatureAnnotatedI. This
> >> forces types to be stored as OntologyTerms (and I haven't even touched
> >> on some of the problems here, but at least we are insulating the
> >> standard bioperl layer that 99% of users use from these issues). All
> >> classes implementing SFAI will necessarily implement SFI, and the
> >> primary_tag and tag_values methods will be supported (not deprecated)
> >> as simple delegations to the OntologyTerm objects.
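
A minimal sketch of the delegation described in the paragraph above. The class names (My::OntologyTerm, My::ScruffyFeature, My::AnnotatedFeature) are hypothetical stand-ins and are not bioperl modules; the point is only that an ontology-typed feature can keep answering primary_tag() by handing the call to its term object, so code written against the simple interface keeps working:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an ontology term (not the Bio::Ontology API).
package My::OntologyTerm;
sub new  { my ($class, %args) = @_; return bless { name => $args{name} }, $class }
sub name { $_[0]{name} }

# Hypothetical "scruffy" feature: the type is just a string label.
package My::ScruffyFeature;
sub new {
    my ($class, %args) = @_;
    return bless { primary_tag => $args{primary_tag} }, $class;
}
sub primary_tag { $_[0]{primary_tag} }

# Hypothetical ontology-typed feature: stores a term object, but still
# supports primary_tag() as a simple delegation to that object.
package My::AnnotatedFeature;
our @ISA = ('My::ScruffyFeature');
sub new {
    my ($class, %args) = @_;
    return bless { type => $args{type} }, $class;
}
sub type        { $_[0]{type} }
sub primary_tag { $_[0]{type}->name }

package main;

my $plain = My::ScruffyFeature->new(primary_tag => 'exon');
my $typed = My::AnnotatedFeature->new(
    type => My::OntologyTerm->new(name => 'exon'),
);

# Callers that only know primary_tag() can treat both objects alike.
print join(',', map { $_->primary_tag } ($plain, $typed)), "\n";
```

Under this arrangement only callers that ask for type() ever see the ontology object; everything else stays on plain strings.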
> >>
> >> We can then modify BFIO::gff (which is an incredibly useful piece of
> >> code) and get rid of all the dependencies on SO and Bio::Ontology* and
> >> instead allow the user of this module to plug in their own
> >> resolver/validator - so they can choose whether they just want fast
> >> scruffy lightweight SFI features, or whether they want ontology-typed
> >> SFAI features. If the latter, then they can choose their own resolver
> >> strategy - by a user supplied hash, by a copy of SO auto-downloaded
> >> from sourceforge, by a local chado db, by the genbank->SO mapping
> >> table, during parsing vs post-parsing, whatever. In fact there is
> >> already Bio::SeqFeature::Tools::TypeMapper, but currently this is
> >> mostly concerned with helping Bio::SeqFeature::Tools::Unflattener
> >> convert scruffy genbank to something sensible.
> >>
> >> GMOD (and perhaps biosql) would use SFAI, everyone else would use the
> >> simpler SFI. Someone can even get a stable 1.6 release out before all
> >> the SFAI details such as how the resolver would work are finalised.
> >> I'd really like to see 1.6 include a simpler BFIO::gff that can
> >> optionally produce features that aren't SeqFeature::Annotateds, but
> >> that's negotiable.
> >>
> >> There are vast swathes of both GMOD and BioPerl code I'm not familiar
> >> with, so it's possible my analysis above is flawed in some way. If it
> >> is, then it's up to someone from either camp to speak up! If not, then
> >> there's no excuse for the relevant people not to start sorting out
> >> this mess by commencing with the solution outlined above.
> >>
> >> Cheers
> >> Chris
> >>
> >>>
> >>> Scott
> >>>
> >>>
> >>> On Thu, 2005-07-28 at 18:37 +0200, Cyril Pommier wrote:
> >>>> Hello,
> >>>> We are going to store analysis results in chado, and we are of
> >>>> course very interested in these future evolutions of GFF3/chado.
> >>>> So we would like to make sure that the parsers and conversion
> >>>> programs we are writing now will be compatible with the future GFF3.
> >>>>
> >>>> We are using Bio::SeqFeature::Generic objects that we write with
> >>>> Bio::Tools::GFF.
> >>>>
> >>>> Do you think that Bio::Tools::GFF will be able to handle the new
> >>>> 'type' column or is it better to switch to Bio::FeatureIO::gff ?
> >>>>
> >>>> Thanks in advance for any advice.
> >>>>
> >>>> Cyril
> >>>>
> >>>> Don Gilbert wrote:
> >>>>
> >>>>>
> >>>>> Scott,
> >>>>>
> >>>>> Your notes in gmod_bulk_load_gff3.pl suggest it is headed in the
> >>>>> same direction I suggest below. More about these todo points
> >>>>>
> >>>>>> - address flybase's use of analysisfeature combined with
> >>>>>>   feature to give source-type information (in GFF terms). This
> >>>>>>   will need to be addressed in the GBrowse adaptor.
> >>>>>> - modify the bulk loader to allow "mixed" GFF3 files (that is,
> >>>>>>   containing both analysis results and annotations). See perldoc
> >>>>>>   gmod_bulk_load_gff3.pl for more info
> >>>>>
> >>>>>
> >>>>> Use of chado's analysisfeature table is something others who know
> >>>>> it better can comment on. But after working with it for a while
> >>>>> it makes sense to me to use in this way:
> >>>>>
> >>>>> For a future GFF -> Chado loader, treat analysis features such as
> >>>>> gene finding results, BLAST, sim4 as 'analysisfeature type' rather
> >>>>> than feature CV term type (the ones that now end up with a generic
> >>>>> 'match' cvterm). In these cases the Analysis table is populated
> >>>>> with
> >>>>>    program:database_sourcename
> >>>>> as the basis of this 'analysisfeature type', such as
> >>>>>    match:blastx:na_pe.dros
> >>>>>    match:sim4:DGC
> >>>>>    match:genie:dummy (or maybe exon:genie)
> >>>>>
> >>>>> The program:database fits neatly in GFF source field, as
> >>>>>    #ref  source  type  start  stop ...
> >>>>>    chr1  blastx:na_pe.dros  match  1  100 ...
> >>>>>    chr1  sim4:DGC  match  1  100 ...
> >>>>>
> >>>>> These can be treated in database adaptor analogously to the CVterm
> >>>>> table feature types. See at end a list of current GFF feature
> >>>>> type:source from worm, rice, yeast, fly MODs. Fly and rice use a
> >>>>> syntax like above and worm gff uses BLAT_EMBL_BEST, instead of
> >>>>> BLAT:EMBL_BEST.
> >>>>>
> >>>>> From POD of your bulk_load_gff3.pl
> >>>>>> Analysis
> >>>>>> If you are loading analysis results (ie, BLAT results, gene
> >>>>>> predictions), you should specify the -a flag. If no arguments are
> >>>>>> supplied with the -a, then the loader will assume that the results
> >>>>>> belong to an analysis set with a name that is the concatenation of
> >>>>>> the source (column 2) and the method (column 3) with an underscore
> >>>>>> in between.
> >>>>>
> >>>>> "... then the loader will assume that the results belong to an
> >>>>> analysis table row with a program name and database source name
> >>>>> taken from Source (column 2, colon separated program:sourcename),
> >>>>> with a SOFA feature type taken from Method (column 3). If
> >>>>> sourcename doesn't apply, e.g. genefinder, don't add or use
> >>>>> 'dummy'.
> >>>>> Use the generic 'match' SOFA type if others don't apply."
> >>>>> [see also http://song.sourceforge.net/gff3-jan04.shtml#ALIGNMENTS]
> >>>>>
> >>>>> Note that sourcename of database is a common attribute (all those
> >>>>> blasts, blats, sim4, ... are run on several different databases).
> >>>>>
> >>>>> For that underscore between method and source, where does that go
> >>>>> into database? It is used as parts of program or database
> >>>>> sourcename names, so it may be problematic to add one if not needed.
> >>>>>
> >>>>> Oh, I see now from bulk_load_gff3.PLS, you are creating a 'Name'
> >>>>> entry for analysis table. This probably is less useful than using
> >>>>> Program and Sourcename fields as flybase does, which comes from the
> >>>>> common usage where people run various programs, with various
> >>>>> database sources and want to plop the results into a database
> >>>>> easily. These go into those two fields directly, no need to create
> >>>>> or parse a Name entry (which can be and is null in flybase data).
> >>>>>
> >>>>>> my $search_analysis
> >>>>>>   = $db->prepare("SELECT analysis_id FROM analysis WHERE name=?");
> >>>>>
> >>>>> I think it would be better as
> >>>>> my $search_analysis
> >>>>>   = $db->prepare("SELECT analysis_id FROM analysis WHERE program=? and sourcename=?");
> >>>>>
> >>>>>> Otherwise, the argument provided with -a will be taken
> >>>>>> as the name of the analysis set. Either way, the analysis set must
> >>>>>> already be in the analysis table. The easiest way to do this is to
> >>>>>> insert it directly in the psql shell:
> >>>>>>
> >>>>>>   INSERT INTO analysis (name, program, programversion)
> >>>>>>   VALUES ('genscan 2005-2-28','genscan','5.4');
> >>>>>
> >>>>> My choice would be to populate the analysis table from GFF data,
> >>>>> rather than expect preparation by the user (or as another option).
> >>>>>
> >>>>>   INSERT INTO analysis (program, sourcename)
> >>>>>   VALUES ('tblastx','na_baylorf1_scfchunk.dpse');
> >>>>>   INSERT INTO analysis (program, sourcename)
> >>>>>   VALUES ('sim4','na_gb.dmel');
> >>>>>   INSERT INTO analysis (program, sourcename, programversion)
> >>>>>   VALUES ('genie_masked','dummy', '1.0');
> >>>>>
> >>>>>> There are other columns in the analysis table that are optional;
> >>>>>> see the schema documentation and '\d analysis' in psql for more
> >>>>>> information.
> >>>>>>
> >>>>> ....
> >>>>>> A planned addition to the functionality of handling analysis
> >>>>>> results is to allow "mixed" GFF files, where some lines are
> >>>>>> analysis results and some are not.
> >>>>>
> >>>>> This is the case for drosophila GFF now (see others also below). If
> >>>>> you make the default assumption that if ($method =~ /.*match/) and
> >>>>> ($source =~ m/([^:]+):(.+)/), you should get all/most of
> >>>>> analysisfeature types, and probably not anything else.
> >>>>>
> >>>>>> Additionally, one will be able to supply lists of
> >>>>>> types (optionally with sources) and their associated entry in the
> >>>>>> analysis table. The format will probably be tag value pairs:
> >>>>>>
> >>>>>>   --analysis match:Rice_est=rice_est_blast, \
> >>>>>>              match:Maize_cDNA=maize_cdna_blast, \
> >>>>>>              mRNA=genscan_prediction,exon=genscan_prediction
> >>>>>
> >>>>> My suggestion for this (as per GFF source,type columns) would be
> >>>>>   --analysis match:program:sourcename ...
> >>>>>   --analysis match:blast:Rice_est,match:blast:Maize_cDNA,\
> >>>>>              mRNA:genscan:dummy, exon:genscan:dummy
> >>>>>
> >>>>> I guess the 'dummy' data sourcename need not be added; flybase
> >>>>> uses it to keep that field not-null, but it isn't required by the
> >>>>> schema.
> >>>>>
> >>>>> Here are some snippets from the ChadoFC adaptor I modified
> >>>>> from yours (will get into cvs.sf.net 'real soon'), showing that
> >>>>> it isn't much work to add this as an analog to how cvterm types
> >>>>> are used.
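
A rough sketch of the two ideas above: the default test for spotting analysis lines in a "mixed" GFF file, and the suggested type:program:sourcename form of the --analysis list. Both subroutines (analysis_key, parse_analysis_spec) are hypothetical helpers written for illustration; neither exists in gmod_bulk_load_gff3.pl, and the real loader may well handle these cases differently:

```perl
use strict;
use warnings;

# Hypothetical helper: apply the suggested default test to one
# tab-separated GFF line. Returns a hashref with type, program and
# sourcename, or nothing if the line does not look like an analysis result.
sub analysis_key {
    my ($gff_line) = @_;
    my (undef, $source, $type) = split /\t/, $gff_line;
    return unless defined $type && $type =~ /match/;
    return unless $source =~ m/^([^:]+):(.+)$/;
    return { type => $type, program => $1, sourcename => $2 };
}

# Hypothetical parser for a comma-separated "type:program:sourcename"
# list; sourcename stays undef when it is left out.
sub parse_analysis_spec {
    my ($spec) = @_;
    my @entries;
    for my $item (split /\s*,\s*/, $spec) {
        my ($type, $program, $sourcename) = split /:/, $item, 3;
        push @entries,
          { type => $type, program => $program, sourcename => $sourcename };
    }
    return @entries;
}

my $hit = analysis_key("chr1\tblastx:na_pe.dros\tmatch\t1\t100");
print "$hit->{program} / $hit->{sourcename}\n";

my @spec = parse_analysis_spec('match:blast:Rice_est, mRNA:genscan');
print scalar(@spec), " entries\n";
```

Note that a line whose source has no colon (for example a bare "genscan" source) is not picked up by this test, which matches the "all/most" caveat above; such types would still need an explicit --analysis entry.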
> >>>>>
> >>>>> -- Don
> >>>>>
> >>>>> ## Bio::DB::Das::ChadoFC.pm, part of new() - load analysis types
> >>>>> ## treat similar to CV table types
> >>>>>
> >>>>> sub getAnalysisFeatureHash
> >>>>> {
> >>>>>   my $self= shift;
> >>>>>
> >>>>>   my $dbh= $self->dbh();
> >>>>>   my $sth = $dbh->prepare("select analysis_id,program,sourcename from analysis")
> >>>>>     or $self->throw("unable to prepare select analysis");
> >>>>>   $sth->execute or $self->throw("unable to select analysis");
> >>>>>
> >>>>>   ## lookup tables: analysis_id -> name and name -> analysis_id
> >>>>>   my (%term2name,%name2term);
> >>>>>
> >>>>>   while (my $hashref = $sth->fetchrow_hashref) {
> >>>>>
> >>>>>     ## this is dgg syntax of analysis feature names for GFF
> >>>>>     ## all have generic 'match' method and program:source as 'source'
> >>>>>     ## a problem, want other main types: EST_match:xxx, mRNA:genie .. etc.
> >>>>>     my $anfeat= "match:".$hashref->{program}.":".$hashref->{sourcename};
> >>>>>
> >>>>>     $term2name{ $hashref->{analysis_id} } = $anfeat;
> >>>>>     $name2term{ $anfeat } = $hashref->{analysis_id};
> >>>>>   }
> >>>>>   $self->an_term2name(\%term2name);
> >>>>>   $self->an_name2term(\%name2term);
> >>>>> }
> >>>>>
> >>>>> ## Das::ChadoFC::Segment snippets
> >>>>> sub features {
> >>>>>   $self->{has_anatype}=0;
> >>>>>   my $sql_range = '';
> >>>>>   my ($interbase_start,$rend,$srcfeature_id,$sql_types);
> >>>>>   unless ($feature_id) {
> >>>>>     $sql_range = $self->sql_range($rangetype);
> >>>>>
> >>>>>     $sql_types = $self->sql_types($types, -1); # dgg
> >>>>>
> >>>>>     $srcfeature_id = $self->{srcfeature_id};
> >>>>>   }
> >>>>>   ...
> >>>>>   elsif($self->{has_anatype}) {
> >>>>>     $from_part .= "left join analysisfeature af using (feature_id) ";
> >>>>>   }
> >>>>>
> >>>>>
> >>>>> sub sql_types
> >>>>>   ..
> >>>>>   $valid_type = $factory->name2term($temp_type);
> >>>>>   $is_anatype= 0;
> >>>>>   unless ($valid_type) {
> >>>>>     $valid_type = $factory->an_name2term($temp_type);
> >>>>>     $self->{has_anatype}= $is_anatype= 1 if ($valid_type);
> >>>>>   }
> >>>>>   ..
> >>>>>   ## leave out extra invalid types
> >>>>>   if (!$valid_type) {
> >>>>>     ### skip
> >>>>>   } elsif ($temp_dbxref) {
> >>>>>     $sql_types .= $orsql."(f.type_id = $valid_type and fd.dbxref_id = $temp_dbxref)";
> >>>>>   } elsif($is_anatype) {
> >>>>>     $sql_types .= $orsql."(af.analysis_id = $valid_type)"; #<<<
> >>>>>   } else {
> >>>>>     $sql_types .= $orsql."(f.type_id = $valid_type)";
> >>>>>   }
> >>>>>
> >>>>>
> >>>>> Lists of GFF feature type:source from some current MOD data
> >>>>> where * are probably analysisfeature types (program:database)
> >>>>>
> >>>>> rice gff type:source
> >>>>> ftp://ftp.gramene.org/pub/gramene/release17/data/sequence_annotation/gff3/
> >>>>> --------------------
> >>>>>   CDS:known
> >>>>>   CDS:tigr
> >>>>>   EST:cmap
> >>>>>   EST_match:Barley (? might be EST_match:someprogram:Barley)
> >>>>>   EST_match:Maize
> >>>>>   EST_match:Millet
> >>>>>   EST_match:Rice
> >>>>>   EST_match:Sorghum
> >>>>>   EST_match:Wheat
> >>>>>   cDNA_match:Rice
> >>>>>   cross_genome_match:Maize
> >>>>>   cross_genome_match:Rice
> >>>>>   cross_genome_match:Sorghum
> >>>>> * exon:FgenesH:Monocot
> >>>>>   exon:known
> >>>>>   exon:tigr
> >>>>>   five_prime_UTR:tigr
> >>>>>   gene:known
> >>>>>   gene:tigr
> >>>>> * mRNA:FgenesH:Monocot
> >>>>>   mRNA:known
> >>>>>   mRNA:tigr
> >>>>>   microsatellite:cmap
> >>>>>   three_prime_UTR:known
> >>>>>   three_prime_UTR:tigr
> >>>>>   transposable_element_insertion_site:cmap
> >>>>>
> >>>>> worm gff type:source
> >>>>> ftp://ftp.wormbase.org/pub/wormbase/species/elegans/genome_feature_tables/GFF3/
> >>>>> ----------------------
> >>>>>   CDS:Coding_transcript
> >>>>> * CDS:Genefinder
> >>>>>   CDS:Transposon_CDS
> >>>>>   CDS:history
> >>>>> * CDS:twinscan
> >>>>> * EST_match:BLAT_EST_BEST (~ EST_match:BLAT:EST_BEST)
> >>>>> * EST_match:BLAT_EST_OTHER
> >>>>>   PCR_product:GenePair_STS
> >>>>>   PCR_product:Orfeome
> >>>>>   RNAi_reagent:RNAi_primary
> >>>>>   RNAi_reagent:RNAi_secondary
> >>>>>   SNP:Allele
> >>>>>   binding_site:binding_site
> >>>>> * cDNA_match:BLAT_mRNA_BEST (~ cDNA_match:BLAT:mRNA_BEST)
> >>>>> * cDNA_match:BLAT_mRNA_OTHER
> >>>>>   clone_end:.
> >>>>>   clone_start:.
> >>>>>   complex_substitution:Allele
> >>>>>   deletion:Allele
> >>>>>   exon:Coding_transcript
> >>>>> * exon:Genefinder
> >>>>>   exon:Non_coding_transcript
> >>>>>   exon:Pseudogene
> >>>>>   exon:Transposon_CDS
> >>>>>   exon:history
> >>>>>   exon:miRNA
> >>>>>   exon:rRNA
> >>>>>   exon:scRNA
> >>>>>   exon:snRNA
> >>>>>   exon:snoRNA
> >>>>>   exon:tRNA
> >>>>> * exon:tRNAscan-SE-1.23
> >>>>> * exon:twinscan
> >>>>>   experimental_result_region:Expr_profile
> >>>>>   experimental_result_region:cDNA_for_RNAi
> >>>>> * expressed_sequence_match:BLAT_OST_BEST (~ expressed_sequence_match:BLAT:OST_BEST)
> >>>>> * expressed_sequence_match:BLAT_OST_OTHER
> >>>>>   five_prime_UTR:Coding_transcript
> >>>>>   gene:Coding_transcript
> >>>>>   gene:gene
> >>>>>   gene:history
> >>>>>   gene:landmark
> >>>>>   insertion:Allele
> >>>>>   inverted_repeat:inverted
> >>>>>   mRNA:Coding_transcript
> >>>>> * mRNA:Genefinder
> >>>>>   mRNA:Transposon_CDS
> >>>>>   mRNA:history
> >>>>> * mRNA:twinscan
> >>>>>   miRNA:miRNA
> >>>>>   nc_primary_transcript:Non_coding_transcript
> >>>>> * nucleotide_match:BLAT_EMBL_BEST (~ nucleotide_match:BLAT:EMBL_BEST)
> >>>>> * nucleotide_match:BLAT_EMBL_OTHER
> >>>>> * nucleotide_match:BLAT_TC1_BEST
> >>>>> * nucleotide_match:BLAT_TC1_OTHER
> >>>>> * nucleotide_match:BLAT_ncRNA_BEST
> >>>>> * nucleotide_match:BLAT_ncRNA_OTHER
> >>>>> * nucleotide_match:TEC_RED
> >>>>> * nucleotide_match:waba_coding
> >>>>> * nucleotide_match:waba_strong
> >>>>> * nucleotide_match:waba_weak
> >>>>>   oligo:.
> >>>>>   operon:operon
> >>>>>   polyA_signal_sequence:polyA_signal_sequence
> >>>>>   polyA_site:polyA_site
> >>>>>   processed_transcript:gene
> >>>>>   protein_coding_primary_transcript:Coding_transcript
> >>>>> * protein_match:wublastx
> >>>>>   pseudogene:Pseudogene
> >>>>>   pseudogene:history
> >>>>>   rRNA:rRNA
> >>>>>   reagent:Oligo_set
> >>>>>   region:.
> >>>>>   region:Genbank
> >>>>>   region:Genomic_canonical
> >>>>>   region:Link
> >>>>> * repeat_region:RepeatMasker
> >>>>>   scRNA:scRNA
> >>>>>   sequence_variant:.
> >>>>>   sequence_variant:Allele
> >>>>>   snRNA:snRNA
> >>>>>   snoRNA:snoRNA
> >>>>>   substitution:Allele
> >>>>>   tRNA:tRNA
> >>>>> * tRNA:tRNAscan-SE-1.23
> >>>>>   tandem_repeat:tandem
> >>>>>   three_prime_UTR:Coding_transcript
> >>>>>   trans_splice_acceptor_site:SL1
> >>>>>   trans_splice_acceptor_site:SL2
> >>>>>   transcript:SAGE_transcript
> >>>>> * translated_nucleotide_match:BLAT_NEMATODE (~ translated_nucleotide_match:BLAT:NEMATODE)
> >>>>>   transposable_element:Transposon
> >>>>>   transposable_element:Transposon_CDS
> >>>>>   transposable_element_insertion_site:Allele
> >>>>>   transposable_element_insertion_site:Mos_insertion_allele
> >>>>>
> >>>>>
> >>>>> fly gff type:source
> >>>>> ftp://ftp.flybase.net/genomes/dmel/current/gff/
> >>>>> -----------------------
> >>>>>   BAC:.
> >>>>>   CDS:.
> >>>>>   aberration_junction:.
> >>>>>   chromosome:.
> >>>>>   chromosome_arm:.
> >>>>>   chromosome_band:.
> >>>>>   enhancer:.
> >>>>>   exon:.
> >>>>>   five_prime_UTR:.
> >>>>>   gene:.
> >>>>>   insertion_site:.
> >>>>>   intron:.
> >>>>>   mRNA:.
> >>>>> * match:RNAiHDP
> >>>>> * match:assembly:path
> >>>>> * match:blastx:aa_SPTR.dmel
> >>>>> * match:blastx:aa_SPTR.insect
> >>>>> * match:blastx:aa_SPTR.othinv
> >>>>> * match:blastx:aa_SPTR.othvert
> >>>>> * match:blastx:aa_SPTR.plant
> >>>>> * match:blastx:aa_SPTR.primate
> >>>>> * match:blastx:aa_SPTR.rodent
> >>>>> * match:blastx:aa_SPTR.worm
> >>>>> * match:blastx:aa_SPTR.yeast
> >>>>> * match:genscan
> >>>>> * match:repeatmasker
> >>>>> * match:sim4:na_ARGs.dros
> >>>>> * match:sim4:na_ARGsCDS.dros
> >>>>> * match:sim4:na_DGC_dros
> >>>>> * match:sim4:na_dbEST.diff.dmel
> >>>>> * match:sim4:na_dbEST.same.dmel
> >>>>> * match:sim4:na_gadfly_dmel_r2
> >>>>> * match:sim4:na_gb.dmel
> >>>>> * match:sim4:na_gb.tpa.dmel
> >>>>> * match:sim4:na_smallRNA.dros
> >>>>> * match:sim4:na_transcript_dmel_r31
> >>>>> * match:sim4:na_transcript_dmel_r32
> >>>>> * match:tRNAscan-SE:.
> >>>>> * match:tblastx:na_agambiae
> >>>>> * match:tblastx:na_dbEST.insect
> >>>>> * match:tblastx:na_dpse
> >>>>> * match_part:RNAiHDP
> >>>>> * match_part:assembly:path
> >>>>> * match_part:blastx:aa_SPTR.dmel
> >>>>> * match_part:blastx:aa_SPTR.insect
> >>>>> * match_part:blastx:aa_SPTR.othinv
> >>>>> * match_part:blastx:aa_SPTR.othvert
> >>>>> * match_part:blastx:aa_SPTR.plant
> >>>>> * match_part:blastx:aa_SPTR.primate
> >>>>> * match_part:blastx:aa_SPTR.rodent
> >>>>> * match_part:blastx:aa_SPTR.worm
> >>>>> * match_part:blastx:aa_SPTR.yeast
> >>>>> * match_part:genscan
> >>>>> * match_part:repeatmasker
> >>>>> * match_part:sim4:na_ARGs.dros
> >>>>> * match_part:sim4:na_ARGsCDS.dros
> >>>>> * match_part:sim4:na_DGC_dros
> >>>>> * match_part:sim4:na_dbEST.diff.dmel
> >>>>> * match_part:sim4:na_dbEST.same.dmel
> >>>>> * match_part:sim4:na_gadfly_dmel_r2
> >>>>> * match_part:sim4:na_gb.dmel
> >>>>> * match_part:sim4:na_gb.tpa.dmel
> >>>>> * match_part:sim4:na_smallRNA.dros
> >>>>> * match_part:sim4:na_transcript_dmel_r31
> >>>>> * match_part:sim4:na_transcript_dmel_r32
> >>>>> * match_part:tRNAscan-SE:.
> >>>>> * match_part:tblastx:na_agambiae
> >>>>> * match_part:tblastx:na_dbEST.insect
> >>>>> * match_part:tblastx:na_dpse
> >>>>>   mature_peptide:.
> >>>>>   ncRNA:.
> >>>>>   oligo:.
> >>>>>   point_mutation:.
> >>>>>   polyA_site:.
> >>>>>   protein_binding_site:.
> >>>>>   pseudogene:.
> >>>>>   region:.
> >>>>>   regulatory_region:.
> >>>>>   rescue_fragment:.
> >>>>>   scaffold:.
> >>>>>   sequence_variant:.
> >>>>>   snRNA:.
> >>>>>   snoRNA:.
> >>>>>   tRNA:.
> >>>>>   three_prime_UTR:.
> >>>>>   transcription_start_site:.
> >>>>>   transposable_element:.
> >>>>>   transposable_element_insertion_site:. 3116
> >>>>>
> >>>>>
> >>>>> yeast gff type:source count
> >>>>> ftp://genome-ftp.stanford.edu/pub/yeast/data_download/chromosomal_feature/saccharomyces_cerevisiae.gff
> >>>>> -------------------------
> >>>>>   ARS:SGD
> >>>>>   CDS:SGD
> >>>>>   binding_site:SGD
> >>>>>   centromere:SGD
> >>>>>   chromosome:SGD
> >>>>>   gene:SGD
> >>>>>   insertion:SGD
> >>>>>   intron:SGD
> >>>>>   ncRNA:SGD
> >>>>>   nc_primary_transcript:SGD
> >>>>>   nucleotide_match:SGD
> >>>>>   pseudogene:SGD
> >>>>>   rRNA:SGD
> >>>>>   region:SGD
> >>>>>   region:landmark
> >>>>>   repeat_family:SGD
> >>>>>   repeat_region:SGD
> >>>>>   snRNA:SGD
> >>>>>   snoRNA:SGD
> >>>>>   tRNA:SGD
> >>>>>   telomere:SGD
> >>>>>   transposable_element:SGD
> >>>>>   transposable_element_gene:SGD
> >>>>>
> >>>>> -- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
> >>>>> -- gil...@in... -- http://marmot.bio.indiana.edu/

--
------------------------------------------------------------------------
Scott Cain, Ph. D.
ca...@cs...
GMOD Coordinator (http://www.gmod.org/)
216-392-3087
Cold Spring Harbor Laboratory