454 sequence data

2009-03-31
2013-04-18
  • Sander Peters
    2009-03-31

    Has anyone tried to assemble 454 sequence data using gap4? If so, how many reads did you use, and how did it perform?

    Best,

    Sander,
    Plant Research International,
    The Netherlands.

     
    • Sven Klages
      2009-04-08

      If you would like to work with gap4, use MIRA [1] to assemble the 454 data.
      It produces CAF [2] (among other formats), which can easily be converted to gap4 format.

      AFAIK gap4 cannot assemble 454 data directly.

      [1] = http://chevreux.org/projects_mira.html
      [2] = http://www.sanger.ac.uk/Software/formats/CAF/

      But gap4 gets slow with many reads (I have assembled some 450,000 ESTs) and
      even slower with many tags.

      That's why we're eagerly waiting for (working) gap5 .. :-)

      cheers,
      Sven

       
    • James Bonfield
      2009-04-09

      For what it's worth, I'm not intending gap5 to be a sequence assembler. I'll have to add some basic assembly capabilities at some stage so that finishing reads can be assembled in, but writing a full-blown short-read assembler is something that multiple teams of people are toiling over. There's no way I can compete solo while also writing an editor and viewer.

      Instead my plan is to be able to import already assembled data in a variety of formats.

      James

      PS. As for assembly of 454 data, I think the Celera Assembler has a mixed assembly mode, as does Newbler. We have 454 (ACE) to gap4 conversion tools here, but they're being worked on again and I don't have access to the latest source for those yet.

      Older variants though can be found at
      http://genome.imb-jena.de/software/roche454ace2caf/

       
    • Hi,
      Actually, my starting data set, obtained from MIRA (http://chevreux.org/projects_mira.html), is in CAF or ACE format.
      My conversion currently goes from CAF2GAP --> GAP2BAF --> tg_index --> GAP5.
      Is there a more direct way to do this?
      Thank you,

      Giuseppe

       
    • Sorry for my previous mail. I only have the CAF file, but I do not know how to pass it to caf2baf. From the README, it seems that it needs the output of gap2caf via a pipe:

      gap2caf -project xx -version 0 -ace xxx.gap | caf2baf > xxx.baf

      But if I already have the CAF file, how do I pass it to caf2baf?

      Thank you, and sorry for the confusion.

       
    • James Bonfield
      2009-04-09

      caf2baf < in > out

      or less efficiently: cat in | caf2baf > out
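
The same redirection can be scripted. What follows is a minimal Python sketch, assuming caf2baf is on the PATH and acts as a plain stdin-to-stdout filter, as the commands above suggest. The function name `run_filter` and its `command` parameter are illustrative only and are not part of any Staden tool.

```python
import subprocess


def run_filter(in_path, out_path, command=("caf2baf",)):
    """Run a stdin-to-stdout filter, like `command < in_path > out_path`.

    `command` defaults to caf2baf (assumed to be on the PATH); any other
    filter program can be passed instead.
    """
    # Hand the open files straight to the child process, so there is no
    # intermediate `cat` and no data is buffered inside Python.
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        # check=True raises CalledProcessError if the filter exits non-zero.
        subprocess.run(list(command), stdin=src, stdout=dst, check=True)
```

A call such as `run_filter("in.caf", "out.baf")` (hypothetical file names) would then behave like the first command above.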

      The reason I produced baf (which isn't likely to stay around; it's a stop-gap until a decent caf2sam comes along) is that caf is simply HIDEOUS to parse. Specifically, there are no requirements on data being in a specific order, so the only real way to parse it involves loading the entire file into memory. That's too inefficient for modern assemblies.

      James
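
To illustrate the ordering problem described in the post above, here is a toy sketch in Python. It assumes a heavily simplified CAF-like layout (blank-line-separated blocks, each headed `Type : name`), not the full CAF specification. The function `parse_blocks` and the sample data are hypothetical. Because the `DNA` block for a read may appear anywhere relative to its `Sequence` block, no read can be treated as complete until the whole input has been seen.

```python
from collections import defaultdict


def parse_blocks(text):
    """Group blocks of a simplified CAF-like text by object name.

    Returns {name: {block_type: [body lines]}}. Everything is kept in
    memory, because a later block may still add to any earlier object.
    """
    objects = defaultdict(dict)
    for block in text.strip().split("\n\n"):
        lines = block.strip().splitlines()
        if not lines:
            continue  # tolerate runs of extra blank lines
        # The header line looks like "DNA : read1".
        block_type, name = (part.strip() for part in lines[0].split(":", 1))
        objects[name][block_type] = lines[1:]
    return dict(objects)


# Made-up sample: read1's DNA arrives last, after unrelated blocks.
SAMPLE = """\
DNA : read2
GGCC

Sequence : read1
Is_read

Sequence : read2
Is_read

DNA : read1
ACGT
"""

reads = parse_blocks(SAMPLE)
print(reads["read1"]["DNA"])
```

A format that guaranteed each read's blocks were adjacent would let a parser emit one read at a time and discard it, which is the kind of streaming this layout rules out.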