454 sequence data

  • Sander Peters

    Sander Peters - 2009-03-31

    Has anyone tried to assemble 454 sequence data using gap4? If so, how many reads did you use, and how did it perform?


    Plant Research International,
    The Netherlands.

    • Sven Klages

      Sven Klages - 2009-04-08

      Use MIRA [1] for assembling 454 data if you would like to work with gap4.
      It produces CAF [2] (among other formats), which can easily be converted to a gap4 database.

      AFAIK gap4 cannot assemble 454 data directly.

      [1] = http://chevreux.org/projects_mira.html
      [2] = http://www.sanger.ac.uk/Software/formats/CAF/

      But gap4 gets slow with many reads (I have assembled some 450,000 ESTs) and
      even slower with many tags.

      That's why we're eagerly waiting for (working) gap5 .. :-)


    • James Bonfield

      James Bonfield - 2009-04-09

      For what it's worth, I'm not intending gap5 to be a sequence assembler. I'll have to add some basic assembly capabilities at some stage to assemble in finishing reads, but writing a full-blown short-read assembler is something that multiple teams of people are toiling over, and there's no way I can compete solo while still writing an editor and viewer at the same time.

      Instead my plan is to be able to import already assembled data in a variety of formats.


      PS. As for assembly of 454 data, I think the Celera Assembler has a mixed assembly mode, as does Newbler. We have 454 (ace) to gap4 conversion tools here, but they're being worked on again and I don't have access to the latest source for those yet.

      Older variants though can be found at

    • Giuseppe D'Auria

      Actually, my starting database, obtained from MIRA (http://chevreux.org/projects_mira.html), is in CAF or ACE format.
      My conversion chain is CAF2GAP --> GAP2BAF --> tg_index --> GAP5.
      Is there a more direct way to do this?
      Thank you.


    • Giuseppe D'Auria

      Sorry for my previous message. I only have the CAF file, but I do not know how to pass it to caf2baf. From the README, it seems to need the output of gap2caf via a pipe:

      gap2caf -project xx -version 0 -ace xxx.gap | caf2baf > xxx.baf

      But if I already have the CAF file, how do I pass it to caf2baf?

      Thank you, and sorry for the confusion.

    • James Bonfield

      James Bonfield - 2009-04-09

      caf2baf < in > out

      or less efficiently: cat in | caf2baf > out

      The reason I produced baf (which probably isn't likely to stay around, but it's a stop-gap for now until a decent caf2sam comes along) is that caf is simply HIDEOUS to parse. Specifically, there are no requirements on data being in a specific order, so the only real way to parse it involves loading the entire file into memory. That is too inefficient for modern assemblies.
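
      To illustrate the ordering problem, here is a minimal Python sketch. It is not how caf2baf works; it assumes a simplified CAF-like layout in which each block starts with a "Type : name" header line and blocks are separated by blank lines. The function name group_caf_blocks and the sample data are made up for this example. Because the DNA, BaseQuality and Sequence blocks for one read may be scattered anywhere in the file, nothing can be emitted for a read until the whole input has been seen, so everything ends up in one in-memory dictionary.

```python
import re
from collections import defaultdict


def group_caf_blocks(text):
    """Group CAF-like blocks by object name, in whatever order they arrive.

    Assumes (for illustration only) that every block begins with a
    "Type : name" header and that blocks are separated by blank lines.
    """
    # name -> {block type -> list of body lines}; holds the ENTIRE file.
    objects = defaultdict(dict)
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if not lines:
            continue
        block_type, _, name = lines[0].partition(":")
        objects[name.strip()][block_type.strip()] = lines[1:]
    return dict(objects)


# Hypothetical input: the pieces of read1 and read2 are interleaved.
sample = """DNA : read1
ACGT

Sequence : read2
Is_read

BaseQuality : read1
30 30 30 30

Sequence : read1
Is_read

DNA : read2
TTGA
"""

reads = group_caf_blocks(sample)
for name in sorted(reads):
    print(name, sorted(reads[name]))
```

      A format that keeps each read's data together allows one record to be processed and discarded at a time, which is the kind of streaming that an unordered file rules out.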


