
#19 Quality file name search is incorrect (tarchive2amos)

Status: open
Owner: nobody
Labels: converters (5)
Priority: 5
Updated: 2011-05-10
Created: 2011-05-10
Private: No

I have some fasta files from [1] whose names match what appears to be the intended seq/qual pattern, but not the pattern that the code explicitly checks for. As a result, I ended up with an AMOS bank without quality values, even though the quality files existed. This was not obvious to me because the log contained no warning about missing quality files.

[1] http://genome.wustl.edu/pub/organism/Invertebrates/Schmidtea_mediterannea/assembly/Schmidtea_mediterranea-2.0/input/

The files are given on the command line in the format '<dirname>/<library>.<variant>.fasta.screen.clip.(seq|qual)'. The current code mishandles this: if 'fasta.' appears *anywhere* in the filename, the 'fasta' portion is assumed to be the prefix and is replaced by 'qual'. I'm also not convinced that the current search takes directories into account, because the match doesn't necessarily begin at the start of the file string.
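To make the failure concrete, here is a small Python sketch of the behaviour as I understand it. It is not the tarchive2amos code (which is Perl); the function name `naive_qual_name` and the example paths are made up for illustration, and the unanchored replace is my assumption about what the current search effectively does.

```python
import re

def naive_qual_name(seq_path):
    """Hypothetical stand-in for the current search: look for 'fasta.'
    anywhere in the string and swap that first occurrence for 'qual.'."""
    if re.search(r"fasta\.", seq_path):
        # Unanchored: the match can fall mid-name or inside a directory.
        return re.sub(r"fasta\.", "qual.", seq_path, count=1)
    return None

# Assumed example names following the pattern described above.
seq_file = "input/libA.v1.fasta.screen.clip.seq"
wanted = "input/libA.v1.fasta.screen.clip.qual"

guess = naive_qual_name(seq_file)
print(guess)            # the name that gets searched for
print(guess == wanted)  # whether it is the real quality file

# A 'fasta.' inside a directory name is rewritten as well.
print(naive_qual_name("fasta.d/reads.seq"))
```

With these inputs the guessed name keeps the '.seq' ending and changes the middle of the name instead, so the existing '.qual' file is never looked at.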

I propose changing the order of file pattern searches (so that the more specific formats are checked first), making the 'fasta' prefix search more specific, and adding the ability for the user to alter the log level. I have attached a patch that makes these changes.
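For readers who don't want to open the patch, the idea can be sketched roughly as follows. This is an illustrative Python sketch, not the attached patch itself: the pattern list, the function name `find_qual_file`, the logger name and the `exists` parameter are all assumptions of mine, chosen only to show specific-first ordering, anchored matching on the file name alone, and a warning when nothing is found.

```python
import logging
import os
import re

log = logging.getLogger("qualsearch")  # hypothetical logger name

# Candidate patterns, most specific first. Each is anchored and is
# applied to the file name only, never to the directory part.
PATTERNS = [
    (re.compile(r"^(.+\.fasta\.screen\.clip)\.seq$"), r"\1.qual"),
    (re.compile(r"^fasta\.(.+)$"), r"qual.\1"),
    (re.compile(r"^(.+)\.seq$"), r"\1.qual"),
]

def find_qual_file(seq_path, exists=os.path.exists):
    """Return the first existing quality-file candidate, or None.

    'exists' is injectable so the search can be tried without real files.
    """
    dirname, base = os.path.split(seq_path)
    for pattern, replacement in PATTERNS:
        if pattern.match(base):
            candidate = os.path.join(dirname, pattern.sub(replacement, base))
            if exists(candidate):
                return candidate
    # Make the missing quality file visible in the log.
    log.warning("no quality file found for %s", seq_path)
    return None

if __name__ == "__main__":
    # The log level is user-adjustable, e.g. from a command-line option.
    logging.basicConfig(level=logging.WARNING)
    print(find_qual_file("input/libA.v1.fasta.screen.clip.seq"))
```

Because the directory is split off before matching, a 'fasta.' in a directory name can no longer be rewritten, and a sequence file with no matching quality file produces a warning instead of silently yielding a bank without quality.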

Discussion

  • David Eccles (gringer)

    patch to change order of file searches

     
  • David Eccles (gringer)

    Sorry, there should also be a 'my $suffix' declaration in the code (insert at line 302).

     
