#1 Simulate longer Indels

open
nobody
None
5
2008-05-02
2008-05-02
No

Hi Heng Li,

I'd like to simulate reads where the mutations include longer insertions/deletions, and rather than write another program I thought it might be better to modify maq. The basic idea is to add a gap-extend probability, so that if a gap is to be added we use rand() and Pextend to establish its length.
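
One way to read the gap-extend idea above is as a geometric draw: start at length 1 and keep extending while a uniform random number falls below Pextend. The sketch below is only an illustration under that assumption; `sample_gap_length`, `p_extend` and `max_len` are hypothetical names and are not taken from maq or from the attached simulate.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not maq code): draw an indel length.
 * The gap starts at 1 bp and is extended by one base each time a
 * uniform draw in [0,1) falls below p_extend, up to max_len bases. */
static int sample_gap_length(double p_extend, int max_len)
{
    int len = 1;
    assert(p_extend >= 0.0 && p_extend < 1.0 && max_len >= 1);
    while (len < max_len && rand() / (RAND_MAX + 1.0) < p_extend)
        ++len;
    return len;
}
```

With `p_extend` set to 0 every gap is 1 bp long, which matches the single-base indels the unmodified simulator produces; the cap `max_len` is there because the encoding only has room for a few bases.
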
I've looked at the maq code and think it could be done by using a 16-bit version of t_seq and using 4 bits to encode a mutation type: 0 delete, 1-6 for 1 to 6 bp encoded in the last 12 bits, 15 delete, 14 substitution. The remaining 12 bits would be 2 bits for the original character and 10 bits for up to 5 inserted bases.
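
The 16-bit layout described above could be packed roughly as follows. This is a sketch under assumptions: the post does not say which bits hold which field, so the positions used here (type in the top 4 bits, original base in the next 2, inserted bases in the low 10) are a guess, and `mutcell_t`, `mut_pack` and the accessor names are hypothetical. Only the type codes 14 and 15 come from the post, and the sketch sticks to at most 5 inserted bases because that is what 10 bits can hold.

```c
#include <assert.h>
#include <stdint.h>

/* Type codes 14 and 15 are from the post; everything else is assumed. */
enum { MUT_SUBSTITUTE = 14, MUT_DELETE = 15 };

typedef uint16_t mutcell_t; /* hypothetical 16-bit cell */

/* Assumed layout: type[15:12] | original base[11:10] | inserted bases[9:0] */
static mutcell_t mut_pack(unsigned type, unsigned orig, unsigned ins)
{
    assert(type < 16 && orig < 4 && ins < 1024);
    return (mutcell_t)((type << 12) | (orig << 10) | ins);
}

static unsigned mut_type(mutcell_t c) { return c >> 12; }
static unsigned mut_orig(mutcell_t c) { return (c >> 10) & 0x3; }

/* i-th inserted base (0-based, i in 0..4), 2 bits each, first base lowest */
static unsigned mut_ins_base(mutcell_t c, int i) { return (c >> (2 * i)) & 0x3; }
```

For example, a 3 bp insertion after original base 2 would be stored with type code 3 and the three inserted bases in the low six bits, leaving the remaining four bits of the insert field unused.
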

I'm happy to do the changes and then submit back to you if you like.

Colin

Discussion

  • Colin Hercus

    Colin Hercus - 2008-05-02

    Modified version of simulate.c

     
  • Colin Hercus

    Colin Hercus - 2008-05-02

    Logged In: YES
    user_id=1613561
    Originator: YES

    Hi Heng Li,

    I've gone ahead and made the changes. The code is attached if you'd like to use it.

    Cheers, Colin
    File Added: simulate.c

     
  • lh3

    lh3 - 2008-05-05

    Logged In: YES
    user_id=1602510
    Originator: NO

    Hi Colin,

    Thank you very much for this. I am at the CSHL meeting, but I promise to have a look when I am back in the UK.

    Cheers,

    Heng

     
  • Colin Hercus

    Colin Hercus - 2008-05-06

    Logged In: YES
    user_id=1613561
    Originator: YES

    Hi Heng,

    There's a bug in my code. I forgot to add X: to the option string (the optarg list) for simulate. I did all my testing with the default value for X.

    I've also found that simustat needs to allow more slop in the alignment location for the correct/incorrect decision.

    Cheers, Colin

     
  • Nobody/Anonymous

    Logged In: NO

    Hi Heng,

    Did you ever use this code?

    Colin

     
