Input MIRA assembly results into Gap4/5

2010-07-12
2013-04-18
  • Hi,

    I'm trying to visualize a MIRA assembly with Gap4 and Gap5 (Staden Package version 2.0.0b6). However I have the following problems:

    - For Gap4, I use the program caf2gap (from CAFTOOLS version 2.0.2) to convert the MIRA CAF assembly into a Gap4 database, but I get the error "!! FATAL ERROR: system error 12 Cannot allocate memory !! Memory allocation failure when requesting -2147483648 bytes Aborted". I have a hybrid 454/Sanger assembly of 4,007,215 reads, and work on a 64-bit Ubuntu server with 64 GiB RAM. The RAM is not fully used when I get the error.

    - For Gap5, I use the ACE file from MIRA to create the Gap5 database using tg_index, but Gap5 doesn't display any MIRA tags :-(  Now I'm trying to convert the CAF file from MIRA to BAF using the script caf2baf.pl from the Staden Package, with the idea of creating the Gap5 database from the BAF file and seeing whether it displays MIRA tags, but it's been running for more than three days now…

    Do you have any idea? I have the following formats for the assembly: ACE (but without tags), CAF, MAF, Fasta, TCS. Any help will be greatly appreciated.

    Thank you very much in advance.

    Best regards,
    Sònia

     
  • James Bonfield
    2010-07-12

    I did think I supported ace tags, but it looks like I'm wrong there.  I'll look into it, but example data would be welcomed.  Do you have a snippet of the ACE file and the tags within it? Nothing huge - just a couple reads. Or is the problem simply that the ACE output by MIRA doesn't include tags?

    As for caf2baf.pl, unfortunately this uses the caftools perl module which is hideously inefficient. It's much the same reason I didn't add native CAF support into tg_index either, as it essentially requires either multiple passes and excessive random accessing or loading the entire caf file into memory (which is how the perl caftools package works).

    The caf2gap crash is because it's trying to allocate -2^31 bytes (-2147483648, the most negative 32-bit signed integer), but I'm not sure why. I'm guessing it's an error code that hasn't been checked and has then shown up in a strange memory allocation. Is it possible to temporarily obtain a copy of your data? (I understand if not.) Do you still get the same error if you shrink the data down to a smaller set? It may be some specific thing in the CAF file which is triggering this rather than the size.
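
    As an illustration only (this is not taken from the caf2gap source, and it is not a diagnosis of the actual bug), the sketch below shows how a value of 2^31 turns into -2147483648 once it is stored in a signed 32-bit integer. The helper name as_int32 is made up for this example.

```python
import ctypes


def as_int32(n: int) -> int:
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(n).value


print(as_int32(2**31 - 1))  # 2147483647, the largest signed 32-bit value
print(as_int32(2**31))      # -2147483648, the number seen in the error message
```

    Whatever the real cause, a number like this in an allocation request usually points at a 32-bit quantity somewhere in the path rather than at the machine running out of RAM.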

     
  • I got the same crash from caf2gap when I used the one on their web page.  When I downloaded and compiled the latest version on their ftp server (ftp.sanger.ac.uk/pub/PRODUCTION_SOFTWARE/src/) it started working.

     
  • TomG
    2011-03-11

    Hello,

    Any answer to this? I'm getting the exact issue that Sonia got when running caf2gap. This is on a 64-bit RHEL 5.5 machine with 24GB memory. The .caf file is 3.2GB. It works fine on smaller files, but with this file I get an error when it tries to allocate 2^31 bytes. I got caf2gap as part of caftools, downloaded from ftp.sanger.ac.uk/pub/PRODUCTION_SOFTWARE/src. I tried caftools 2.0, 2.0.1, and 2.0.2, all with the same result:

    !! FATAL ERROR: system error 12 Cannot allocate memory
    !! Memory allocation failure when requesting -2147483648 bytes
    Aborted

     
  • The latest version of gap5 can just read caf files natively, so use e.g. "tg_index -o my_db foo.caf" and then "gap5 my_db.0 &".

    I'll agree though that CAFtools is hideously inefficient with large files. It's not something I support though and neither was it something written by the developers of the Staden Package. Much like Gap4, it was designed in an era of modest sized Sanger-method sequencing projects.

    James

    PS. That said, if you have a large memory machine it may still be possible to get caftools to work better though. Check to make sure that it's using large file support. I'd hope that it builds using -D_FILE_OFFSET_BITS=64 or something similar. The exact runes needed vary from system to system though.
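
    For anyone scripting the tg_index route mentioned above, here is a minimal sketch. It only wraps the command line quoted in this post ("tg_index -o my_db foo.caf"); the function names are made up, and it assumes tg_index is on your PATH and that the resulting database is opened as my_db.0, as in the example.

```python
import shutil
import subprocess


def tg_index_command(caf_path: str, db_name: str) -> list[str]:
    """Build the argument list for: tg_index -o <db_name> <caf_path>."""
    return ["tg_index", "-o", db_name, caf_path]


def build_gap5_db(caf_path: str, db_name: str) -> str:
    """Run tg_index on a CAF file and return the name to pass to gap5."""
    if shutil.which("tg_index") is None:
        raise FileNotFoundError("tg_index not found on PATH")
    # check=True raises CalledProcessError if tg_index exits non-zero.
    subprocess.run(tg_index_command(caf_path, db_name), check=True)
    return db_name + ".0"
```

    Calling build_gap5_db("foo.caf", "my_db") would then give you "my_db.0" to open with gap5.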

     
  • TomG
    2011-03-11

    Thanks James. I was able to convert my caf for gap5 using tg_index.

     
  • Is this a related issue?

    When I search with the tag box selected, I get the following error (also with gap4). I am using a version downloaded from svn around the 7th of March. My genome is a Salmonella genome, sequenced using 454 with no read pairs (sigh!) and 730K reads, usually assembled with MIRA, but I also assembled it with other assemblers to see if it was an assembler issue.

    It doesn't seem to matter which assembler generated the gap5 database - I have tried it with Mosaik (BAM or ACE file), MIRA (CAF or ACE file) or a Newbler-generated ACE file - all converted appropriately with tg_index. When I select  Search->'tag type' and hit the Tag Type button below it, I get:

    invalid command name "tablelist"
    invalid command name "tablelist"
        while executing
    "tablelist .e1.search.type_frame.r.tag.select.type.list -columns {0 "Index"
       0 "Code"
       0 "Tag name"} -labelcommand tablelist::sortByColumn …"
        ("eval" body line 1)
        invoked from within
    "eval [list tablelist $path.list  -columns {0 "Index"
       0 "Code"
       0 "Tag name"}  -labelcommand tablelist::sortByColumn  -exportselection 0  …"
        (procedure "tag_checklist" line 23)
        invoked from within
    "tag_checklist $w.type $arr -selectmode browse -width 35 -height 10"
        (procedure "tag_editor_select_type" line 34)
        invoked from within
    "tag_editor_select_type .e1.search.type_frame.r.tag   {show_help gap5 Editor-Searching}  {search_set_tag_type .e1.search.option_fra…"
        invoked from within
    ".e1.search.option_frame.tag.button invoke"
        ("uplevel" body line 1)
        invoked from within
    "uplevel #0 "
        (procedure "tk::ButtonUp" line 22)
        invoked from within
    "tk::ButtonUp .e1.search.option_frame.tag.button"
        (command bound to event)
    You are most welcome to an assembly.

    HTH
    John

     
  • That sounds like I forgot something from the binary distribution, or else a failure to find it when it's running.

    The binary install should have it in staden.x86_64/share/tcltk/tklib0.5/tablelist/. If you build from source you need to have the tklib package (note tklib isn't the same as Tk, but a set of packages).
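
    A quick way to check whether the tablelist package is present under an install is sketched below. The function name find_tablelist is made up, and the search is deliberately loose (any directory called tablelist that contains a pkgIndex.tcl) because the exact layout differs between the binary install and a source build. Finding the files does not prove that Tcl will load them; it only rules out the "missing from the distribution" case.

```python
from pathlib import Path


def find_tablelist(staden_root: str) -> list[Path]:
    """Return every tablelist/pkgIndex.tcl found under staden_root."""
    root = Path(staden_root)
    return sorted(
        p for p in root.rglob("pkgIndex.tcl") if p.parent.name == "tablelist"
    )
```

    An empty list from find_tablelist("staden.x86_64") would suggest the package really is missing; a non-empty list means the problem is more likely in how it is being located at run time.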

     
  • John Nash
    2011-03-15

    Yay - I remembered to log in this time :)

    I just checked, and it's all there - I believe.

    $ cd /usr/molbin/staden
    $ ls
    doc  man  staden  tcltk

    $ cd tcltk/
    $ ls
    tcl8.4  tk8.4  tklib0.5
    $ cd tklib0.5/
    $ ls
    autoscroll  crosshair  datefield  history  khim          plotchart  tablelist   widget
    canvas      ctext      diagrams   ico      ntext         style      tkpiechart
    chatwidget  cursor     getstring  ipentry  pkgIndex.tcl  swaplist   tooltip
    $ cd tablelist/
    $ ls
    pkgIndex.tcl  scripts  tablelistPublic.tcl  tablelist.tcl  tablelist_tile.tcl
    $ ls
    pkgIndex.tcl  scripts  tablelistPublic.tcl  tablelist.tcl  tablelist_tile.tcl

    I have itk, itcl and iwidgets as well as Tk and Tcl all installed, and it all configured, compiled and installed nicely.

    J

    PS I'll install the binary from the latest beta and see

     
  • John Nash
    2011-03-15

    Ok… I have good news and bad news, and an error to correct.

    Selecting TAGs as described above crashes with the latest beta but WORKS with the latest svn version downloaded and compiled today.  I got my versions mixed up (I have both the latest  beta and an svn version installed, and I symlink the beta or the svn version's subdirectory to /usr/molbin/staden as required). Sorry for blaming the svn version when I should have blamed the latest beta. It was a senior's moment.

    John

     
  • tgoldman
    2011-08-18

    Hi,

    I have an ace file that I would like to edit with gap4. Is there an ace2gap converter? Or better yet, does gap5 have the ability to run something similar to the gap4 Pick PCR Primers feature?
    I was able to convert the ace to a gap5 db, but I would like to design primers for gap closing and it doesn't look like I can do that yet with gap5.

    Thanks.

     
  • MIRA should provide you with a CAF file. Otherwise you could try miniphrap2gap, available at ftp://ftp.sanger.ac.uk/pub4/resources/software/caf/miniphrap2gap.tgz