Simulate longer Indels
Hi Heng Li,
I'd like to simulate reads where the mutations include longer insertions/deletions, and rather than write another program I thought it might be better to modify maq. The basic idea is to add a gap extend probability, so that if a gap is to be added we use rand() and Pextend to establish its length.
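To make the idea concrete, here is a minimal sketch of drawing a gap length from rand() and an extend probability. It is not taken from maq or from the attached simulate.c; the function name draw_indel_length and the max_len cap are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: once a gap has been opened, keep extending it
 * with probability p_extend, up to max_len bases. Lengths therefore
 * follow a geometric distribution truncated at max_len. */
static int draw_indel_length(double p_extend, int max_len)
{
    int len = 1; /* an opened gap is at least 1 bp long */
    /* rand() / (RAND_MAX + 1.0) is uniform in [0, 1) */
    while (len < max_len &&
           (double)rand() / ((double)RAND_MAX + 1.0) < p_extend)
        ++len;
    return len;
}
```

With p_extend = 0 every gap is 1 bp, which matches simulating single-base indels only; larger values give longer gaps on average.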
I've looked at the maq code and think it could be done by using a 16-bit version of t_seq, with 4 bits encoding the mutation type: 0 delete, 1-6 for 1 to 6 bp encoded in the last 12 bits, 15 delete, 14 substitution. The remaining 12 bits would be 2 bits for the original character and 10 bits for up to 5 inserted bases.
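A rough sketch of how such a 16-bit value could be packed and unpacked follows. The type name mutseq_t, the helper names, and the choice to put the 4-bit type code in the top bits are all assumptions for illustration. The type codes themselves are treated as opaque numbers here, and only the 2-bit-original plus 10-bit-insert layout is shown.

```c
#include <stdint.h>

/* Hypothetical 16-bit mutation record (layout is an assumption):
 *   bits 15..12  mutation type code (4 bits)
 *   bits 11..10  original base (2 bits)
 *   bits  9..0   up to 5 inserted bases, 2 bits each */
typedef uint16_t mutseq_t;

static mutseq_t mut_pack(unsigned type, unsigned orig_base, unsigned ins_bits)
{
    return (mutseq_t)(((type & 0xFu) << 12) |
                      ((orig_base & 0x3u) << 10) |
                      (ins_bits & 0x3FFu));
}

static unsigned mut_type(mutseq_t m) { return (unsigned)m >> 12; }
static unsigned mut_orig(mutseq_t m) { return ((unsigned)m >> 10) & 0x3u; }

/* i-th inserted base, i in 0..4, with base 0 stored in the lowest bits */
static unsigned mut_ins_base(mutseq_t m, int i)
{
    return ((unsigned)m >> (2 * i)) & 0x3u;
}
```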
I'm happy to do the changes and then submit back to you if you like.
Colin
Modified version of simulate.c
Logged In: YES
user_id=1613561
Originator: YES
Hi Heng Li,
I've gone ahead and made the changes. The code is attached if you'd like to use it.
Cheers, Colin
File Added: simulate.c
Logged In: YES
user_id=1602510
Originator: NO
Hi Colin,
Thank you very much for this. I am at the CSHL meeting, but I promise to have a look when I am back in the UK.
Cheers,
Heng
Logged In: YES
user_id=1613561
Originator: YES
Hi Heng,
There's a bug in my code. I forgot to add X: to the getopt option string for simulate, so the X option is never parsed. I did all my testing with the default value for X.
I've also found that simustat needs to allow more slop in the alignment location when deciding whether a read is placed correctly or incorrectly.
Cheers, Colin
Logged In: NO
Hi Heng,
Did you ever use this code?
Colin