Hello,
I have PE data (files named "lane1.1.fq" and "lane1.2.fq") and have set "MOCAT_paired_end" to "yes" in the config file. After running "MOCAT.pl -sf samples -rtf", the "reads.processed" output folder has all of the reads in the "lane1.single.fq.gz" file, and the paired files are empty. (The sample name is in the "samples" file.) This becomes a problem later when I run the assembly step, because it finds no reads in the PE files and gives an error. The assembly log file for this sample says:
"
Begin Program SOAPaligner/soap2
Thu Jan 4 21:06:36 2018
...
Load Index Table ...
Load Index Table OK
Begin Alignment ...
File Error: unrecognized file
ERROR & EXIT: Insert sizes could not be calculated. Most likely the number of processed reads is too low.
There were not enough reads mapping to the mapping db to estimate the distance between two paired reads.
To solve this, calculate insert sizes differently. at ...../MOCATAssembly_aux.pl line 146.
"
Thank you for your assistance,
Peter Bazeley
It's possible the headers in your FASTQ files aren't in a standard header
format. Look at the Wikipedia page for the FASTQ format and ensure the reads
are named according to one of the formats listed there.
On Fri, Jan 5, 2018 at 18:01 Peter Bazeley theoark@users.sf.net wrote:
Related
Tickets: #76
My headers (one from each PE file) look like this:
@K00136:435:HGL7GBBXX:4:1104:16985:30433 1:N:0
@K00136:435:HGL7GBBXX:4:1104:16985:30433 2:N:0
This looks similar to the Casava 1.8 format described on the Wikipedia page:
"
With Casava 1.8 the format of the '@' line has changed:
@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG
"
except that mine have no index sequence at the end. Should I add a dummy sequence at the end?
Thanks for your help,
Peter
Or should I convert to this format:
@HWUSI-EAS100R:6:73:941:1973#0/1
I'd do the latter, with /1 and /2.
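For anyone hitting the same problem, here is a minimal Python sketch of that conversion. It is not part of MOCAT. The function names are made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequence lines) with Casava 1.8 style headers like the ones shown above. It only appends /1 or /2 to the read ID and does not add the "#0" index field from the older format; whether MOCAT needs that field is not something this thread establishes.

```python
import re

# Casava 1.8+ header: "@<id> <read>:<filtered Y/N>:<control bits>[:<index>]"
CASAVA_18 = re.compile(r"^@(\S+) ([12]):[YN]:\d+(?::\S*)?$")


def convert_header(line):
    """Rewrite a Casava 1.8 header as '@<id>/1' or '@<id>/2'.

    Lines that do not match the Casava 1.8 pattern are returned unchanged
    (apart from stripping the trailing newline).
    """
    line = line.rstrip("\n")
    match = CASAVA_18.match(line)
    if match is None:
        return line
    return "@{}/{}".format(match.group(1), match.group(2))


def convert_fastq(lines):
    """Yield FASTQ lines with the header of every 4-line record converted.

    Only every fourth line (index 0, 4, 8, ...) is treated as a header,
    because quality lines may also start with '@'.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield convert_header(line) if i % 4 == 0 else line
```

To use it, open the input file (for example "lane1.1.fq"), pass the file handle to convert_fastq, and write each yielded line plus a newline to a new file; then repeat for "lane1.2.fq". Keep the original files until the trimming step has been confirmed to produce non-empty paired output.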
On Fri, Jan 5, 2018 at 22:56 Peter Bazeley theoark@users.sf.net wrote:
Thanks, this solved the issue. Appreciate your help.