From: axiom7 <ax...@me...> - 2008-11-06 17:48:44
Hi,

I have filed the anomaly in the gmod project as you suggested. I didn't use
the flybase data source, as I was following the directions from gmod for the
genbank2chado package. I will try the other source(s) you suggested and get
back to you.

Thanks Scott.
Susan

Scott Cain-3 wrote:
>
> Hi Susan,
>
> I can certainly see what is wrong; the fix is another matter: GFF3
> lines are only allowed to have a single ID, but the mRNA line you
> pointed to has two: CG17683.t01 and CG17683.t06. Why this happened is
> not clear to me; I would have to assume a bug in bp_genbank2gff3.pl.
> If you could file this as a bug in the gmod project (as part of
> Chado), I should be able to look at it in the next few days:
>
> https://sourceforge.net/tracker2/?group_id=27707&atid=391291
>
> On another track, why aren't you using the Dmel GFF3 from flybase:
>
> ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.9_FB2008_06/gff/
>
> (Full disclosure: I haven't tried to load the flybase GFF into a Chado
> instance recently, so I can't comment on whether it will really work
> or not--but it has a much better chance). Or, using the flybase
> database dump of Chado:
>
> ftp://ftp.flybase.net/releases/current/psql/
>
> Scott
>
>
> On Thu, Nov 6, 2008 at 11:07 AM, axiom7 <ax...@me...> wrote:
>>
>> Hi,
>>
>> I have downloaded the Drosophila melanogaster *.gbk.gz files from
>> bio-mirror.net/biomirror/ncbigenomes/Drosophila_melanogaster and run
>> bp_genbank2gff3.pl on them to create the *.gbk.gz.gff files. However,
>> the load fails immediately:
>>
>> perl bin/gmod_bulk_load_gff3.pl --dbname dev_chado_01c -dbxref GeneID
>> --organism fromdata --gff
>> data/Drosophila_melanogaster/CHR_2/NT_033778.gbk.gz.gff
>> (Re)creating the uniquename cache in the database...
>> Creating table...
>> Populating table...
>> Creating indexes...Done.
>> Preparing data for inserting into the dev_chado_01c database
>> (This may take a while ...)
>> Organism Drosophila melanogaster from data
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: Error in line:
>> NT_033778 GenBank mRNA 18442 18629 . + .
>> ID=CG17683.t01,CG17683.t06;Parent=CG17683,CG17683;locus_tag=Dmel_CG17683;gene=CG17683;product=CG17683-RA%2C
>> transcript variant
>> A;Dbxref=GI:116007463,FLYBASE:FBgn0040002,GeneID:3355011;transcript_id=NM_001042963.1
>>
>> A feature may have at most one ID value
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw
>> /oracle/genbank2chado/lib/Bio/Root/Root.pm:359
>> STACK: Bio::FeatureIO::gff::_handle_feature
>> /oracle/genbank2chado/lib/Bio/FeatureIO/gff.pm:696
>> STACK: Bio::FeatureIO::gff::next_feature
>> /oracle/genbank2chado/lib/Bio/FeatureIO/gff.pm:165
>> STACK: bin/gmod_bulk_load_gff3.pl:819
>> -----------------------------------------------------------
>> Issuing rollback() for database handle being DESTROY'd without explicit
>> disconnect().
>>
>> The "head" command on the file is as follows, which shows the script
>> failing on the first mRNA line:
>>
>> head data/Drosophila_melanogaster/CHR_2/NT_033778.gbk.gz.gff
>> ##gff-version 3
>> # sequence-region NT_033778 1 21146708
>> # conversion-by bp_genbank2gff3.pl
>> # organism Drosophila melanogaster
>> # date 14-MAY-2008
>> # Note Drosophila melanogaster chromosome 2R.
>> NT_033778 GenBank chromosome 1 21146708 . +
>> . ID=NT_033778;mol_type=genomic
>> DNA;date=14-MAY-2008;comment1=REVIEWED
>> REFSEQ: This record has been curated by FlyBase. The reference sequence
>> was
>> derived from AE013599. On Oct 10%2C 2006 this sequence version replaced
>> gi:56407907. COMPLETENESS: full length. ;Note=Drosophila melanogaster
>> chromosome
>> 2R.;Alias=2R;chromosome=2R;Dbxref=taxon:7227;organism=Drosophila
>> melanogaster
>> NT_033778 GenBank region 1 1285689 . + .
>> ID=GenBank:region:NT_033778:1:1285689;Note=Heterochromatic sequence
>> NT_033778 GenBank gene 18442 20468 . + .
>> ID=CG17683;locus_tag=Dmel_CG17683;gene=CG17683;Note=CG17683%3B Annotated
>> by
>> Drosophila Heterochromatin Genome Project%2C Lawrence Berkeley National
>> Lab%2C http://www.dhgp.org;Dbxref=FLYBASE:FBgn0040002,GeneID:3355011
>> NT_033778 GenBank mRNA 18442 18629 . + .
>> ID=CG17683.t01,CG17683.t06;Parent=CG17683,CG17683;locus_tag=Dmel_CG17683;gene=CG17683;product=CG17683-RA%2C
>> transcript variant
>> A;Dbxref=GI:116007463,FLYBASE:FBgn0040002,GeneID:3355011;transcript_id=NM_001042963.1
>>
>> I obtained the scripts from
>> rsync://eugenes.org/argos/gmod/web/gmod/genbank2chado:
>>
>> head bin/bp_genbank2gff3.pl
>> #!/usr/bin/perl -w
>>
>> #$Id: genbank2gff3.PLS,v 1.11 2007/03/19 16:42:05 bosborne Exp $;
>>
>>
>> head bin/gmod_bulk_load_gff3.pl
>> #!/usr/bin/perl
>>
>>
>> =item dgg notes, 2007 march
>>
>> Can anybody see what is wrong with this?
>>
>> Thanks.
>> Susan
>>
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/gmod_bulk_load_gff3-of-Drosophila-melanogaster-fails-tp20364068p20364068.html
>> Sent from the gmod-devel mailing list archive at Nabble.com.
>>
>>
>> -------------------------------------------------------------------------
>> This SF.Net email is sponsored by the Moblin Your Move Developer's
>> challenge
>> Build the coolest Linux based applications with Moblin SDK & win great
>> prizes
>> Grand prize is a trip for two to an Open Source event anywhere in the
>> world
>> http://moblin-contest.org/redirect.php?banner_id=100&url=/
>> _______________________________________________
>> Gmod-devel mailing list
>> Gmo...@li...
>> https://lists.sourceforge.net/lists/listinfo/gmod-devel
>>
>
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.
> scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research
>
>

--
View this message in context: http://www.nabble.com/gmod_bulk_load_gff3-of-Drosophila-melanogaster-fails-tp20364068p20366192.html
Sent from the gmod-devel mailing list archive at Nabble.com.
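
For anyone hitting the same "A feature may have at most one ID value" error, here is a minimal sketch of a pre-check that could be run on a converted GFF3 file before handing it to gmod_bulk_load_gff3.pl. It is written in Python and is not part of GMOD or BioPerl; the function name find_multi_id_lines is hypothetical. It assumes standard GFF3 layout (nine tab-separated columns, attributes in column 9), and the sample lines are trimmed versions of the gene and mRNA lines quoted in this thread, with tabs assumed where the mail shows spaces.

```python
def find_multi_id_lines(lines):
    """Return (line_number, ids) for each feature line whose ID has several values.

    Assumption: nine tab-separated columns, with semicolon-separated
    key=value attributes in the ninth column.
    """
    offenders = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line; out of scope here
        for attribute in columns[8].split(";"):
            key, _, value = attribute.strip().partition("=")
            # Parent may legitimately list several values; ID may not.
            if key == "ID" and "," in value:
                offenders.append((number, value.split(",")))
    return offenders


# Trimmed sample based on the lines quoted in this thread (attributes shortened).
sample = [
    "##gff-version 3",
    "NT_033778\tGenBank\tgene\t18442\t20468\t.\t+\t.\tID=CG17683;gene=CG17683",
    "NT_033778\tGenBank\tmRNA\t18442\t18629\t.\t+\t.\t"
    "ID=CG17683.t01,CG17683.t06;Parent=CG17683,CG17683",
]

for number, ids in find_multi_id_lines(sample):
    print(f"line {number}: multiple IDs {ids}")
# prints: line 3: multiple IDs ['CG17683.t01', 'CG17683.t06']
```

To check a real file, pass an open file handle instead of the sample list, for example find_multi_id_lines(open(path)). This only reports the offending lines; it does not decide which ID should be kept, since that depends on fixing the converter.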