Assembly of 454 reads (in SFF format)

  • Nobody/Anonymous


    Is it possible to preprocess a batch of
    454 reads in SFF format (e.g. 80,000 reads
    from a 150 kb BAC) within PREGAP4 and
    then assemble them with PHRAP in GAP4? And
    does it make sense to do so (with respect
    to the error model, etc.)?

    TIA for your feedback,

    • James Bonfield

      James Bonfield - 2006-03-24

      Phrap is astoundingly bad at assembling 454 reads, possibly due to the default alignment penalties and score matrix. Ideally mismatches should be penalised very heavily and the gap penalty should be much lower, since 454 errors are mostly homopolymer over- and under-calls (indels) rather than substitutions. If you do use phrap you may also want to use the latest "shuffle pads" code in gap4, which is based around the ReAligner algorithm. Otherwise you'll see lots of misalignments.
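To illustrate the scoring point, here is a minimal sketch of a global aligner with a linear gap penalty, run under two scoring schemes. It is not phrap's algorithm, and the numeric penalties are illustrative assumptions rather than phrap's actual defaults. The function name `global_align` and the two toy sequences are hypothetical; the read differs from the reference by one homopolymer undercall (AAA to AA) and one overcall (TT to TTT).

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


reference = "AAACGTT"
read = "AACGTTT"  # one A undercalled, one T overcalled

# Illustrative "gap-averse" scheme: cheap mismatches, expensive gaps.
print(global_align(reference, read, match=1, mismatch=-1, gap=-5))

# Illustrative "454-friendly" scheme: expensive mismatches, cheap gaps.
print(global_align(reference, read, match=1, mismatch=-9, gap=-1))
```

With these inputs the gap-averse scheme keeps the sequences ungapped and explains the homopolymer errors as three substitutions, whereas the second scheme places one gap in each sequence and matches every remaining base.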

      454's own assembler looks to do a good job, though, and because it works in flow-space (where there should be zero indels) it's superbly fast.
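As a rough illustration of why flow-space avoids indels, the sketch below converts a base sequence into idealised integer flow values. It assumes a cyclic flow order of TACG and noise-free signals; real flowgrams hold noisy real-valued intensities, and this is not 454's assembler code. The function name `to_flowgram` is hypothetical.

```python
def to_flowgram(seq, flow_order="TACG"):
    """Convert a base sequence into idealised per-flow homopolymer lengths.

    Each flow offers one nucleotide; the value recorded is how many
    consecutive copies of that base are incorporated (0 if none).
    """
    seq = seq.upper()
    unknown = set(seq) - set(flow_order)
    if unknown:
        # Without this check an unexpected base would never be consumed.
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")

    flows = []
    i = 0  # position in seq
    k = 0  # index of the current flow
    while i < len(seq):
        base = flow_order[k % len(flow_order)]
        run = 0
        while i < len(seq) and seq[i] == base:
            run += 1
            i += 1
        flows.append(run)
        k += 1
    return flows


true_seq = "TTTACG"
undercalled = "TTACG"  # one T missing: an indel in base space

print(to_flowgram(true_seq))
print(to_flowgram(undercalled))
```

The two sequences differ in length in base space, but their flowgrams have the same number of flows and differ only in the value of the first flow, so they can be compared position by position without gaps.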

      We've massaged their ACE output into CAF format and then used caftools to convert it to a Gap4 database, but it's not pretty and we sometimes lose bits along the way. Getting a decent interface is still a work in progress.

