
#207 Insert size estimation failed

gatekeeper
open
None
5
2015-02-03
2012-06-26
Satishkumar
No

ERROR: Failed with signal ABRT (6)

runCA failed.

----------------------------------------
Stack trace:

at /usr/local/src/wgs/Linux-amd64/bin/runCA line 1237
main::caFailure('Insert size estimation failed', '/projects/oidproject/rawdata_201244_newdata/Pacbio_stec/newtr...') called at /usr/local/src/wgs/Linux-amd64/bin/runCA line 4550
main::postUnitiggerConsensus() called at /usr/local/src/wgs/Linux-amd64/bin/runCA line 5883

----------------------------------------
Last few lines of the relevant log file (/projects/oidproject/rawdata_201244_newdata/Pacbio_stec/newtry/try_662012/asm/5-consensus-insert-sizes/estimates.out):

tigStore: MultiAlignMatePairAnalysis.C:173: void matePairAnalysis::evaluateTig(MultiAlignT*): Assertion `0' failed.

----------------------------------------
Failure message:

Discussion

  • Satishkumar

    Satishkumar - 2012-06-26

    The library identifier must be unique to each sequencing library. Because you used the same ID for both, the assembler confuses the two: it loads the Illumina library first as paired, then replaces it with the PacBio library, which is unpaired. However, fragments with mate pairs have already been loaded into the Illumina library. That is unexpected, because the library now looks unmated according to the PacBio frg file, and it leads to the error you see (pairs in an unpaired library).

    To fix the issue, you can either regenerate the Illumina frg file, passing a different library name to fastqToCA, or change the acc line in the Illumina frg file.
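
Before re-running gatekeeper, it can help to check which frg files share a library name. The following is a minimal sketch, not part of the assembler: it assumes each library is declared in a `{LIB` message that contains an `acc:NAME` line (as the "acc line" mentioned above suggests), and the file names and sample contents are hypothetical.

```python
from collections import defaultdict


def library_names(frg_text):
    """Return the acc values of the {LIB ...} messages in one frg file's text."""
    names = []
    in_lib = False
    for raw in frg_text.splitlines():
        line = raw.strip()
        if line.startswith("{LIB"):
            in_lib = True
        elif in_lib and line.startswith("acc:"):
            names.append(line[len("acc:"):].strip())
            in_lib = False  # only the first acc line of a LIB message counts
        elif line == "}":
            in_lib = False
    return names


def duplicate_libraries(frg_texts):
    """Map each library name used by more than one file to those files."""
    seen = defaultdict(list)
    for filename, text in frg_texts.items():
        for name in set(library_names(text)):
            seen[name].append(filename)
    return {name: sorted(files) for name, files in seen.items() if len(files) > 1}


if __name__ == "__main__":
    # Hypothetical file contents, reduced to the lines this check looks at.
    texts = {
        "illumina.frg": "{LIB\nact:A\nacc:lib1\nori:I\n}\n",
        "pacbio.frg": "{LIB\nact:A\nacc:lib1\nori:U\n}\n",
    }
    for name, files in duplicate_libraries(texts).items():
        print(f"library '{name}' is declared in: {', '.join(files)}")
```

Any name reported here is a candidate for renaming in one of the listed files.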

     
  • Brian Walenz

    Brian Walenz - 2012-06-26

    Interesting failure, thanks for mentioning it.

    The duplicate library should have been mentioned in the gkpStore err and/or errorLog files. Gatekeeper used to fail on errors, now it reports them, fixes what it can, and continues.

     
  • Brian Walenz

    Brian Walenz - 2012-06-26
    • milestone: --> gatekeeper
    • assigned_to: nobody --> brianwalenz
     
  • JHooge

    JHooge - 2012-11-15

    I had the same problem and resolved it by changing the library names for the unmated and mated reads. The assembly now runs through, although gatekeeper still reports

    # LIB Alert: already exists; can't add it again

    for some of the libraries.

    I have a set of about 70 FASTQ files, which I converted to FRG files using the fastqToCA routine. Do all of these FRG files have to have a unique library name?

     
  • Rui

    Rui - 2013-05-16

    Hi, I am running into this error. However, the fix described by Satishkumar did not work for me. Is there an alternative fix?

     
  • Brian Walenz

    Brian Walenz - 2013-05-16

    The only way I can imagine this happening is if the library is claimed to be 'unmated' yet there are mate pairs included. This can occur if two libraries have the same name, and the first time it appears it is unmated.

    You can edit the library to make it a mate pair library. Set it to 'I' (innie) orientation and give it a mean and std.dev. size.

    https://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Gatekeeper#Library_Edits

    (scroll up a bit from here to get usage on how to apply the edits)

    lib iid # orientation I
    lib iid # distance MEAN STDDEV

    A 'gatekeeper -dumplibraries ASM.gkpStore' will dump all the metadata for all libraries. You can guess which library is causing trouble, or just make all libraries mated. The problem library should currently have an orientation of U (for either unmated or unknown). If you don't know an insert size, use a reasonable guess (mean = 600 or 3000 is good) and a stddev of 10% of that. Except for libraries with few mate pairs, there is little harm in setting the wrong size; it gets recomputed early in the assembly process.
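
As a small illustration of the two edit lines above, this sketch builds them for a given library IID, defaulting the standard deviation to 10% of the mean as suggested. The function name is made up for this example, and how the edit file is passed to gatekeeper is not shown here; see the linked usage notes for that.

```python
def mated_library_edits(lib_iid, mean, stddev=None):
    """Build the two gatekeeper edit lines that mark a library as mated.

    lib_iid: library IID as reported by 'gatekeeper -dumplibraries'
    mean:    guessed insert size (for example 600 or 3000)
    stddev:  defaults to 10% of the mean when not given
    """
    if stddev is None:
        stddev = mean / 10
    return [
        f"lib iid {lib_iid} orientation I",
        f"lib iid {lib_iid} distance {mean:g} {stddev:g}",
    ]


if __name__ == "__main__":
    # Hypothetical example: library 2 with a guessed 3000 bp insert size.
    print("\n".join(mated_library_edits(2, 3000)))
```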

     
  • Brian Walenz

    Brian Walenz - 2015-02-03

    Bug #247 had a different failure mode:

    The FASTQ loader assumes the library to add reads to is the last one added. This is not true if the library name is a duplicate.

    Reproduce:

    Create three FRG files with library names of 'A', 'B' and 'A'. The reads in the last FRG file will be added to library B.
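
The failure mode can be pictured with a toy model. This is not the gatekeeper code, only a sketch of the behaviour described above, under the assumption that a duplicate library name is not added a second time; all names in it are invented for illustration.

```python
def load(frg_files, assume_last_added):
    """Toy loader. frg_files is a list of (library_name, read_ids) pairs."""
    order = []   # library names in the order they were added
    reads = {}   # library name -> reads assigned to it
    for name, read_ids in frg_files:
        if name not in reads:
            # First time this name is seen: add the library.
            order.append(name)
            reads[name] = []
        # The buggy variant targets the most recently added library;
        # the corrected variant looks the library up by its name.
        target = order[-1] if assume_last_added else name
        reads[target].extend(read_ids)
    return reads


if __name__ == "__main__":
    files = [("A", ["a1"]), ("B", ["b1"]), ("A", ["a2"])]
    print(load(files, assume_last_added=True))   # read a2 lands in library B
    print(load(files, assume_last_added=False))  # read a2 lands in library A
```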

     
