- priority: 5 --> 7
- assigned_to: scottcain --> lstein
Hello,
In my current position as a software engineer at a French laboratory called INRA, I use GMOD Gbrowse 1.70. I am a member of the development team of a wheat annotation pipeline (TriAnnot pipeline: http://urgi.versailles.inra.fr/projects/TriAnnot/index.php) that generates GFF3 files.
This pipeline can be used by the scientific community through a web interface, and the final GFF3 file created at the end of each full analysis is stored in a specific directory on our cluster. This storage is temporary: users can download their results or view them in our Gbrowse for a short period only (2 weeks).
To simplify and automate the cleanup of "old" results, we decided to keep flat GFF files rather than run Gbrowse on a Postgres/MySQL/CHADO database. In the Gbrowse config file we therefore set the adaptor to "Bio::DB::SeqFeature::Store" and the db_args option to "-adaptor memory -dir '/home/projects/triannot/GFF/'".
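For reference, the relevant part of the configuration looks roughly like the fragment below. This is a sketch only: the values are the ones quoted above, while the [GENERAL] stanza and the exact key name db_adaptor are assumed from the usual Gbrowse 1.x config layout.

```
[GENERAL]
# Assumed key names (standard Gbrowse 1.x layout); values as described above
db_adaptor = Bio::DB::SeqFeature::Store
db_args    = -adaptor memory
             -dir '/home/projects/triannot/GFF/'
```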
I have tested the following cases to check whether this Gbrowse configuration works:
- Only one GFF file, which contains a sequence (in FASTA format at the bottom of the file), is present in the directory --> No problem: my popup windows display all information about a feature, and Gbrowse plugins such as BatchDumper have access to the feature sequence.
- Several GFF files in the directory, but only ONE of them contains a sequence --> No problem.
- Several GFF files, MORE THAN ONE of which contains a sequence --> Malfunction
In the last case, the sequence of every additional GFF file becomes inaccessible: Gbrowse can access the sequence from the first GFF file that contains one, but not the sequences from the others. In addition, my Apache error log grows much larger and fills with many lines like this:
[Mon May 03 17:02:43 2010] [error] [client 127.0.0.1] print() on closed filehandle GEN3 at
[Mon May 03 17:02:43 2010] [error] [client 127.0.0.1] \t/usr/local/lib/perl5/site_perl/5.8.5/Bio/DB/SeqFeature/Store/memory.pm line 598, <GEN5> line 888 (#1)
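The failing case can probably also be checked outside Gbrowse. The script below is a minimal sketch under stated assumptions, not a verified reproduction: it opens the same directory with the memory adaptor and reports which reference sequences return no DNA. It assumes BioPerl is installed and that seq_ids and fetch_sequence behave as documented for Bio::DB::SeqFeature::Store in the installed version; the default directory is the one from our db_args.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::SeqFeature::Store;

# Directory of flat GFF3 files (same value as -dir in db_args);
# it can be overridden by the first command-line argument.
my $dir = shift(@ARGV) || '/home/projects/triannot/GFF/';

# Load every GFF3 file of the directory into the in-memory adaptor.
my $db = Bio::DB::SeqFeature::Store->new(
    -adaptor => 'memory',
    -dir     => $dir,
);

# For each reference sequence ID, report whether its DNA can be fetched.
for my $seqid (sort $db->seq_ids) {
    my $seq = $db->fetch_sequence(-seq_id => $seqid);
    my $len = defined $seq ? length $seq : 0;
    printf "%-30s %s\n", $seqid,
        $len ? "sequence OK ($len bp)" : 'NO SEQUENCE';
}
```

If the behaviour described above holds, only the IDs from the first sequence-bearing GFF file should be reported as OK when several such files are present.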
This problem seems to be directly related to our choice of db_args, so I took a look at the corresponding BioPerl module (memory.pm).
Line 598 is inside a function preceded by a strange comment line:
# this is ugly
sub _insert_sequence {
  my $self = shift;
  my ($seqid,$seq,$offset) = @_;
  my $dna_fh = $self->private_fasta_file or return;
  if ($offset == 0) {
    # start of the sequence
    print $dna_fh ">$seqid\n";
  }
  print $dna_fh $seq,"\n";
}
This problem is really blocking the development of our project. Could you please try to fix it?
Please note that this problem has already been reported to the GMOD Helpdesk; I am adding it to the bug tracking system as well in the hope of raising its priority.
Thank you for your help,
Best regards,
Aurélien