Error when generating chimeras
Hi, I've tried using Grinder to simulate an amplicon dataset, and have run into problems. When I try to generate chimeras (option -chimera_perc 10), much of the time I get the message:
Error: Could not find another sequence that contains kmer TGGGATGATA
... (the kmer sequence varies), and read generation apparently stops at that point. As a result, the number of reads in the multifasta/multifastq file varies and is often well below the number specified with the -total_reads option. Setting -chimera_kmer to a value >10 seems to make the error more likely; setting it to 0 eliminates the problem.
Any suggestions of what the problem might be about, and how to eliminate it?
Thanks,
Piotr
Anonymous
Hi Piotr,
The problem is that kmer-based chimeras require the reference sequences to share some kmers. In your case, it appears that your sequences don't share enough kmers to produce the desired number of chimeras. I suggest that you provide more sequences, provide sequences that are more similar, or decrease the kmer value.
Best,
Florent
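For anyone hitting the same error, one way to check a reference set before running Grinder is to count which kmers occur in at least two sequences. The snippet below is only a rough diagnostic sketch, not Grinder's own code: the function names and toy sequences are made up for illustration, and it assumes plain forward-strand matching (no reverse complements), which may differ from what Grinder does internally.

```python
from collections import defaultdict


def shared_kmers(seqs, k):
    """Return {kmer: set of sequence names} for kmers found in >= 2 sequences.

    `seqs` is a dict mapping a sequence name to its nucleotide string.
    Hypothetical helper for illustration; not part of Grinder.
    """
    owners = defaultdict(set)
    for name, seq in seqs.items():
        seq = seq.upper()
        # Slide a window of length k along the sequence.
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    return {kmer: names for kmer, names in owners.items() if len(names) >= 2}


def sequences_with_partner(seqs, k):
    """Names of sequences that share at least one kmer with another sequence."""
    partnered = set()
    for names in shared_kmers(seqs, k).values():
        partnered |= names
    return partnered


# Toy reference set (made-up sequences, not real amplicons).
toy_refs = {
    "ref_a": "ACGTACGTAA",
    "ref_b": "TTACGTACGG",
    "ref_c": "GGGGGGGGGG",
}

for k in (6, 10):
    shared = shared_kmers(toy_refs, k)
    partnered = sequences_with_partner(toy_refs, k)
    print(f"k={k}: {len(shared)} shared kmers, "
          f"{len(partnered)}/{len(toy_refs)} sequences have a partner")
```

If few sequences have a partner at the kmer length you are using, that is consistent with the explanation above, and a smaller kmer value or a more similar reference set should help.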