Inconsistent handling of FASTA header comments causes a fatal error when mrfast is run against a reference whose FASTA headers contain comments.
Error Message:
35439200 sequences are read in 248.93. (0 discarded) [Mem:15869.31 M]
Error: Cannot Open the file ~/workspace/1 dna:chromosome chromosome:GRCh37:1:1:249250621:130f73bdf2efe564a958f359fb75e3623.output01.tmp
Relevant info:
The first line of the reference in question is:
1 dna:chromosome chromosome:GRCh37:1:1:249250621:1
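The file ~/workspace/130f73bdf2efe564a958f359fb75e3623.output01.tmp exists, so at least part of the program handles FASTA headers correctly: it uses only the sequence name "1" and drops the comment, while the code that reopens the file uses the full header line.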
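The mismatch between the two file names can be illustrated with a small Python sketch. This is not mrfast's code; it only assumes, based on the two paths above, that the temporary file name is built by prepending a sequence identifier to a fixed suffix. The helper names `sequence_name` and `strip_header_comments` are hypothetical. The second helper is a possible workaround that rewrites the reference so each header carries only the sequence name.

```python
def sequence_name(header_line: str) -> str:
    """Return the first whitespace-delimited token of a FASTA header line."""
    body = header_line.rstrip("\r\n")
    if body.startswith(">"):
        body = body[1:]
    parts = body.split(None, 1)
    return parts[0] if parts else ""


def strip_header_comments(lines):
    """Yield FASTA lines with everything after the sequence name removed from headers."""
    for line in lines:
        if line.startswith(">"):
            yield ">" + sequence_name(line) + "\n"
        else:
            yield line


if __name__ == "__main__":
    header = ">1 dna:chromosome chromosome:GRCh37:1:1:249250621:1"
    suffix = "30f73bdf2efe564a958f359fb75e3623.output01.tmp"  # taken from the report above

    # Name only: matches the file that exists on disk.
    print(sequence_name(header) + suffix)
    # Full header: matches the path in the error message.
    print(header[1:] + suffix)
```

Applying `strip_header_comments` to the lines of the reference before running mrfast would leave headers such as `>1`, which should avoid the problem if the diagnosis above is right.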
Steps to reproduce:
1) Download and decompress ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
2) Generate a sample FASTQ file using a read simulator
3) Run mrfast against the simulated reads
Without markdown transformation: