From: Scott C. <sc...@sc...> - 2010-07-26 15:20:37
Hi Dave,

Please keep your responses on the list so they can be archived. I'm also
cc'ing Nathan Liles, who did the work on the genbank2gff3 script to deal
with bacterial genomes. Perhaps Nathan can take a look at this genbank
entry and see more quickly what the problem is.

Thanks,
Scott

On Sun, Jul 25, 2010 at 8:26 AM, David Breimann <dav...@gm...> wrote:
> Scott,
>
> I cloned the latest version of bioperl from github (I'm not sure what you
> mean by developers version; I thought the dev branch is obsolete but I'm not
> sure; anyway - I got the version from bioperl-live).
> bp_genbank2gff3.pl fails exactly on features which are on the margin, e.g.
> "Ranges not in correct order. Strange ensembl genbank entry? Range:
> [207497,208369] [1,687]".
>
> Thanks,
> Dave
>
> On Fri, Jul 23, 2010 at 6:10 PM, Scott Cain <sc...@sc...> wrote:
>> Hi David,
>>
>> The NCBI GFF3 is notoriously bad and doesn't pass validation at the
>> GFF3 validator:
>>
>> http://modencode.oicr.on.ca/cgi-bin/validate_gff3_online
>>
>> The most notable problems actually have to do with the relationships
>> between features. For example, in the first few lines:
>>
>> NC_007777.1 RefSeq gene 35 1723 . + . locus_tag=Francci3_0001;db_xref=GeneID:3902947
>> NC_007777.1 RefSeq CDS 35 1720 . + 0 locus_tag=Francci3_0001;transl_table=11;product=chromosomal replication initiator protein DnaA;protein_id=YP_479125.1;db_xref=GI:86738725;db_xref=InterPro:IPR001957;db_xref=InterPro:IPR003593;db_xref=InterPro:IPR013159;db_xref=InterPro:IPR013317;db_xref=GeneID:3902947;exon_number=1
>>
>> While there is not anything technically wrong with these two lines,
>> there is what you might call a logic error: the CDS should have the
>> gene as a parent. Without that information, a genome browser is going
>> to have a difficult time displaying the data appropriately. Feel free
>> to complain to the folks at NCBI that their GFF3 is really bad (I've
>> done that a few times, but I think they are ignoring me :-)
>>
>> So, the question is, what should you use? The best option I can
>> suggest to you is the genbank2gff3 script that comes with BioPerl,
>> called bp_genbank2gff3.pl. If you get the developers version from
>> github, you can use a version of that script that has been fixed to
>> work appropriately with bacterial/circular genomes.
>>
>> Scott
>>
>>
>> On Fri, Jul 23, 2010 at 10:54 AM, David Breimann
>> <dav...@gm...> wrote:
>>> I am trying to set up my first genome, after successfully playing with
>>> the tutorial examples, and I ran into some problems.
>>>
>>> I use a fasta and a gff file from NCBI:
>>> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Frankia_CcI3/NC_007777.fna
>>> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Frankia_CcI3/NC_007777.gff
>>>
>>> Setting up the sequence file seems to pass OK, but when I run
>>> flatfile-to-json.pl with the GFF I get an error:
>>>
>>>
>>> ../../../jbrowse/bin/flatfile-to-json.pl --gff NC_007777.gff
>>> --tracklabel test -key test
>>>
>>> working on seq gi|86738724|ref|NC_007777.1|
>>> Use of uninitialized value in string eq at
>>> ../../../jbrowse/bin/flatfile-to-json.pl line 179, <GEN2> line 24.
>>>
>>> What's wrong?
>>>
>>> Thank you,
>>> David
>>>
>>>
>>> _______________________________________________
>>> Gmod-ajax mailing list
>>> Gmo...@li...
>>> https://lists.sourceforge.net/lists/listinfo/gmod-ajax
>>>
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                          scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/)                      216-392-3087
>> Ontario Institute for Cancer Research
>>
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                      216-392-3087
Ontario Institute for Cancer Research
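
A note on the "CDS should have the gene as a parent" point from the quoted message: in GFF3, a child feature normally points at its parent through a Parent attribute whose value matches the parent's ID. The sketch below is only an illustration, not part of BioPerl or JBrowse. It assumes tab-separated GFF3 lines, the helper names (parse_attributes, cds_without_parent) are made up for this example, and the "corrected" pair with ID=gene0 / Parent=gene0 is a hypothetical rewrite of the NCBI lines, not output from bp_genbank2gff3.pl. It simply lists CDS features that carry no Parent attribute.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column ('key=value' pairs joined by ';') into a dict of lists."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs.setdefault(key, []).append(value)
    return attrs


def cds_without_parent(gff3_lines):
    """Return (seqid, start, end) for every CDS line that has no Parent attribute."""
    missing = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed nine-column feature line
        if cols[2] == "CDS" and "Parent" not in parse_attributes(cols[8]):
            missing.append((cols[0], int(cols[3]), int(cols[4])))
    return missing


example = [
    # Abbreviated versions of the two NCBI lines quoted in the thread (no ID/Parent).
    "NC_007777.1\tRefSeq\tgene\t35\t1723\t.\t+\t.\tlocus_tag=Francci3_0001;db_xref=GeneID:3902947",
    "NC_007777.1\tRefSeq\tCDS\t35\t1720\t.\t+\t0\tlocus_tag=Francci3_0001;transl_table=11",
    # Hypothetical corrected pair: the CDS names the gene as its Parent.
    "NC_007777.1\tRefSeq\tgene\t35\t1723\t.\t+\t.\tID=gene0;locus_tag=Francci3_0001",
    "NC_007777.1\tRefSeq\tCDS\t35\t1720\t.\t+\t0\tID=cds0;Parent=gene0;locus_tag=Francci3_0001",
]

print(cds_without_parent(example))
```

Run on the four example lines, only the first CDS (the one without Parent) is reported. A check like this does not replace the GFF3 validator linked in the thread; it only shows the kind of relationship a genome browser relies on.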