From: James C. <ca...@gm...> - 2010-01-12 10:29:09
I have the following file:

"""
##gff-version 3
##sequence-region CYP2C8 1 36725
CYP2C8 src Feature 4100 4291 . - . ID=37
CYP2C8 src Feature 6125 6213 . - . ID=38
CYP2C8 src Feature 8093 8330 . - . ID=39
CYP2C8 src Feature 11013 11204 . - . ID=40
CYP2C8 src Feature 23538 23764 . - . ID=41
CYP2C8 src Feature 20003 20213 . - . ID=42
CYP2C8 src Feature 32411 32944 . - . ID=43
CYP2C8 src Feature 34400 35725 . - . ID=44
"""

Running

"flatfile-to-json.pl --gff /tmp/e.gff --tracklabel 'feature' --cssclass feature --out browsers/human-gene-test --type Feature"

produces the following track:

"featureNCList":[[4099,4291,-1,"37"],[6124,6213,-1,"38"],[8092,8330,-1,"39"],[11012,11204,-1,"40"],[23537,23764,-1,"41"]]

You can see that everything above 30,000 bases is ignored. Changing the sequence-region pragma to be a larger region fixes the problem:

##sequence-region CYP2C8 1 46725

What is going on here? Why do I not get all the features I declared?

thanks,
James
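Editorial sketch: one way to check which declared features never reached the track is to compute the entries you would expect and compare them with the posted "featureNCList". The Python below is only an illustration, not part of JBrowse. It assumes the entry layout visible in the posted output (start minus one, end, -1 for the "-" strand, then the ID), and the names `expected_entries` and `missing_ids` are hypothetical. It splits on any whitespace so that it works on the text as pasted here; real GFF3 columns are tab-separated.

```python
# Data copied from the post above.
GFF_TEXT = """\
##gff-version 3
##sequence-region CYP2C8 1 36725
CYP2C8 src Feature 4100 4291 . - . ID=37
CYP2C8 src Feature 6125 6213 . - . ID=38
CYP2C8 src Feature 8093 8330 . - . ID=39
CYP2C8 src Feature 11013 11204 . - . ID=40
CYP2C8 src Feature 23538 23764 . - . ID=41
CYP2C8 src Feature 20003 20213 . - . ID=42
CYP2C8 src Feature 32411 32944 . - . ID=43
CYP2C8 src Feature 34400 35725 . - . ID=44
"""

REPORTED = [
    [4099, 4291, -1, "37"], [6124, 6213, -1, "38"], [8092, 8330, -1, "39"],
    [11012, 11204, -1, "40"], [23537, 23764, -1, "41"],
]


def expected_entries(gff_text):
    """Build [start - 1, end, strand, id] entries from GFF3 feature lines."""
    entries = []
    for line in gff_text.splitlines():
        # Skip blank lines, pragmas (##...) and comments (#...).
        if not line.strip() or line.startswith("#"):
            continue
        # Nine GFF3 columns; split on any whitespace (assumption, see lead-in).
        _seqid, _src, _type, start, end, _score, strand, _phase, attrs = line.split(None, 8)
        attributes = dict(p.split("=", 1) for p in attrs.split(";") if "=" in p)
        strand_num = {"+": 1, "-": -1}.get(strand, 0)
        entries.append([int(start) - 1, int(end), strand_num, attributes.get("ID")])
    return entries


def missing_ids(expected, reported):
    """Return IDs that were declared in the GFF3 but are absent from the track."""
    seen = {entry[3] for entry in reported}
    return [entry[3] for entry in expected if entry[3] not in seen]


expected = expected_entries(GFF_TEXT)
missing = missing_ids(expected, REPORTED)
print(missing)
```

On the posted data this comparison flags IDs 42, 43 and 44, so the feature at 20003..20213 (ID=42) is absent from the track as well as the two features above 30,000.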
From: Mitch S. <mit...@be...> - 2010-01-12 18:10:14
On 01/12/2010 02:29 AM, James Casbon wrote:
> Changing the sequence-region pragma to be a larger region fixes the problem:
> ##sequence-region CYP2C8 1 46725
>
> What is going on here? Why do I not get all the features I declared?
>

Thanks for providing such a nice clean test case.

This sounds like it might be related to an issue in Bio::DB::SeqFeature::Store::memory that was fixed last year after bioperl 1.6 was released. The fix is in bioperl 1.6.1, though--what version are you using?

Also, JBrowse used to need the sequence-region line, but in the current code it's no longer required. So one work-around if you can't upgrade bioperl would be to leave out the sequence-region line, I think.

Regards,
Mitch
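Editorial sketch: the work-around of leaving out the sequence-region line can be applied to an existing file with a small filter. The Python below is a minimal illustration under that assumption; `strip_sequence_region` is a hypothetical helper name, not a JBrowse or bioperl function, and whether the filtered file avoids the problem depends on the installed versions.

```python
def strip_sequence_region(gff_text):
    """Return the GFF3 text with every ##sequence-region pragma line removed."""
    kept = [
        line
        for line in gff_text.splitlines()
        if not line.startswith("##sequence-region")
    ]
    return "\n".join(kept) + "\n"


# Small sample based on the file posted earlier in the thread.
sample = (
    "##gff-version 3\n"
    "##sequence-region CYP2C8 1 36725\n"
    "CYP2C8 src Feature 4100 4291 . - . ID=37\n"
)
cleaned = strip_sequence_region(sample)
print(cleaned, end="")
```

Only the pragma is dropped; the version line and all feature lines pass through unchanged.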
From: James C. <ca...@gm...> - 2010-01-13 09:49:49
2010/1/12 Mitch Skinner <mit...@be...>:
> On 01/12/2010 02:29 AM, James Casbon wrote:
>>
>> Changing the sequence-region pragma to be a larger region fixes the
>> problem:
>> ##sequence-region CYP2C8 1 46725
>>
>> What is going on here? Why do I not get all the features I declared?
>>
>
> Thanks for providing such a nice clean test case.
>
> This sounds like it might be related to an issue in
> Bio::DB::SeqFeature::Store::memory that was fixed last year after bioperl
> 1.6 was released. The fix is in bioperl 1.6.1, though--what version are you
> using?
>
> Also, JBrowse used to need the sequence-region line, but in the current code
> it's no longer required. So one work-around if you can't upgrade bioperl
> would be to leave out the sequence-region line, I think.

I think you are right that it may be a problem with bioperl, but I am fairly certain I am on 1.61, although this is confusing:

$ perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"'
1.006001

Is there any way to load data that doesn't go via bioperl? My experience getting GFF3 to parse has not been joyful.

cheers,
James
From: Mitch S. <mit...@be...> - 2010-01-15 16:03:25
Attachments:
bdsfsm-filter_by_location.patch
On 01/13/2010 01:49 AM, James Casbon wrote:
>
> I think you are right that it may be a problem with bioperl, but I am
> fairly certain I am on 1.61, although this is confusing:
> $ perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"'
> 1.006001
>

Yeah, that's 1.6.1; I think it's formatted that way so that textual comparisons of version strings give the right answer (e.g., when comparing 1.6.2 to 1.6.12).

> Is there any way to load data that doesn't go via bioperl? My
> experience getting GFF3 to parse has not been joyful.
>

Well, the idea was to use bioperl to deal with parsing GFF3 so that we wouldn't have to write our own GFF3 parser. The problem was in bioperl in this particular case, but (if you haven't already) you could also see if your GFF3 validates:

http://dev.wormbase.org/db/validate_gff3/validate_gff3_online

The bug where some features at the end of the refseq wouldn't show up has been fixed in bioperl, although it'll be a little while before the fix gets released. I've attached a patch that you could apply to your bioperl installation, or you could get this file:

http://code.open-bio.org/svnweb/index.cgi/bioperl/checkout/bioperl-live/trunk/Bio/DB/SeqFeature/Store/memory.pm?rev=16695

and replace the version you currently have installed.

That said, you can also use BED, or a database (although the database would have to be one that has a bioperl Bio::DB::(whatever) interface, like chado, Bio::DB::SeqFeature::Store, or Bio::DB::GFF). I've been thinking about writing something to import data into JBrowse that doesn't use bioperl, but some of the interfaces in the JBrowse code have to be reworked a bit before that can happen.

Regards,
Mitch
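Editorial sketch: the numeric form 1.006001 follows Perl's convention of packing each component after the major version into three zero-padded digits. The Python below illustrates the decoding and shows why the padded form compares correctly as text while the dotted form does not. The helper names `decode_perl_version` and `encode_perl_version` are hypothetical and are not bioperl functions.

```python
def decode_perl_version(numeric):
    """Turn a numeric version string such as '1.006001' into a tuple (1, 6, 1)."""
    major, _, frac = numeric.partition(".")
    # Pad the fractional part to a multiple of three digits, then read it
    # three digits at a time.
    padded_len = -(-len(frac) // 3) * 3
    frac = frac.ljust(padded_len, "0")
    parts = [int(frac[i:i + 3]) for i in range(0, len(frac), 3)]
    return (int(major), *parts)


def encode_perl_version(version):
    """Turn a tuple such as (1, 6, 12) into the padded string '1.006012'."""
    major, *rest = version
    return f"{major}." + "".join(f"{part:03d}" for part in rest)


decoded = decode_perl_version("1.006001")
print(decoded)

# Plain text comparison of dotted versions puts 1.6.2 after 1.6.12 ...
dotted_is_wrong = "1.6.2" > "1.6.12"
# ... while the padded numeric strings sort in the intended order.
padded_is_right = encode_perl_version((1, 6, 2)) < encode_perl_version((1, 6, 12))
print(dotted_is_wrong, padded_is_right)
```

The comparison at the end uses the same pair of versions mentioned in the message above.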
From: James C. <ca...@gm...> - 2010-01-15 16:56:16
2010/1/15 Mitch Skinner <mit...@be...>:
> Well, the idea was to use bioperl to deal with parsing GFF3 so that we
> wouldn't have to write our own GFF3 parser. The problem was in bioperl in
> this particular case, but (if you haven't already) you could also see if
> your GFF3 validates:
> http://dev.wormbase.org/db/validate_gff3/validate_gff3_online
>
> The bug where some features at the end of the refseq wouldn't show up has
> been fixed in bioperl, although it'll be a little while before the fix gets
> released. I've attached a patch that you could apply to your bioperl
> installation, or you could get this file:
>
> http://code.open-bio.org/svnweb/index.cgi/bioperl/checkout/bioperl-live/trunk/Bio/DB/SeqFeature/Store/memory.pm?rev=16695
>
> and replace the version you currently have installed.

Thanks, very much, Mitch. I'll try that out and let you know.

To be honest, the parser has given me a lot of hassle with other stuff (line endings and memory use) so I might try out BED and see if that is any better.

James
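Editorial sketch: for anyone wanting to try BED with the same data, a simple GFF3 feature line can be converted to a six-column BED line. The Python below is a rough illustration under several assumptions: `gff3_line_to_bed` is a hypothetical helper, the ID attribute is used as the BED name, a missing GFF3 score (".") is written as 0, and columns are split on any whitespace so the text as pasted in this thread can be used directly. BED starts are 0-based, so the GFF3 start is reduced by one and the end is kept.

```python
def gff3_line_to_bed(line):
    """Convert one GFF3 feature line to a tab-separated, six-column BED line."""
    seqid, _src, _type, start, end, score, strand, _phase, attrs = line.split(None, 8)
    attributes = dict(p.split("=", 1) for p in attrs.strip().split(";") if "=" in p)
    name = attributes.get("ID", ".")
    # Assumption: "." (no score) becomes 0; any other value is passed through as-is.
    bed_score = "0" if score == "." else score
    # GFF3 is 1-based and inclusive; BED is 0-based with an exclusive end.
    return "\t".join([seqid, str(int(start) - 1), end, name, bed_score, strand])


def gff3_to_bed(gff_text):
    """Convert every feature line, skipping blank lines, pragmas and comments."""
    return [
        gff3_line_to_bed(line)
        for line in gff_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


bed_lines = gff3_to_bed(
    "##gff-version 3\n"
    "CYP2C8 src Feature 32411 32944 . - . ID=43\n"
    "CYP2C8 src Feature 34400 35725 . - . ID=44\n"
)
print("\n".join(bed_lines))
```

This handles only flat features like the ones in this thread; parent/child relationships and other GFF3 attributes are not carried over.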