From: James A. <j.a...@im...> - 2013-11-20 10:56:40
Hello,

I'm trying to re-assemble a fungal genome using wgs 8, with the addition of some PacBio CCS sequence. I previously assembled it successfully using wgs 7.0, but cgw is now crashing with the following:

    * Considering edges with weight >= 42.00 (maxWeightEdge 56 weightScale 0.7500)
    isQualityScaffoldMergingEdge()-- Merge scaffolds 19916 (1032943.1bp) and 20439 (166551122.0bp): gap -802155.4bp +- 3059.9bp weight 56 AB_BA edge
    terminate called after throwing an instance of 'std::bad_alloc'
      what(): St9bad_alloc

    Failed with 'Aborted'

    Backtrace (mangled):

    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(_Z17AS_UTL_catchCrashiP7siginfoPv+0x31)[0x4391e9]
    /lib64/libpthread.so.0[0x370420eb10]
    /lib64/libc.so.6(gsignal+0x35)[0x3703630265]
    /lib64/libc.so.6(abort+0x110)[0x3703631d10]
    /usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x114)[0x32936bed14]
    /usr/lib64/libstdc++.so.6[0x32936bce16]
    /usr/lib64/libstdc++.so.6[0x32936bce43]
    /usr/lib64/libstdc++.so.6[0x32936bcf2a]
    /usr/lib64/libstdc++.so.6(_Znwm+0x79)[0x32936bd239]
    /usr/lib64/libstdc++.so.6(_Znam+0x9)[0x32936bd2f9]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(_ZN13instrumentSCF7analyzeERSt6vectorI13instrumentLIBSaIS1_EE+0x441)[0x50c223]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw[0x46ec1c]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(_Z28isQualityScaffoldMergingEdgeP9EdgeCGW_TP9NodeCGW_TS2_P20ScaffoldInstrumenterP12VarArrayTypedd+0x1e4)[0x46fbbc]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(_Z36ExamineSEdgeForUsability_InterleavedP9EdgeCGW_TP16InterleavingSpecP9NodeCGW_TS4_+0x12a)[0x475fb0]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw[0x46e317]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw[0x46e524]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(_Z24MergeScaffoldsAggressiveP14ScaffoldGraphTPci+0x2a6)[0x46e820]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(main+0x242d)[0x4376ff]
    /lib64/libc.so.6(__libc_start_main+0xf4)[0x370361d994]
    /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw(__gxx_personality_v0+0x149)[0x435059]

    Backtrace (demangled):

    [0] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::AS_UTL_catchCrash(int, siginfo*, void*) + 0x31 [0x4391e9]
    [1] /lib64/libpthread.so.0 [0x370420eb10]
    [2] /lib64/libc.so.6::(null) + 0x35 [0x3703630265]
    [3] /lib64/libc.so.6::(null) + 0x110 [0x3703631d10]
    [4] /usr/lib64/libstdc++.so.6::__gnu_cxx::__verbose_terminate_handler() + 0x114 [0x32936bed14]
    [5] /usr/lib64/libstdc++.so.6 [0x32936bce16]
    [6] /usr/lib64/libstdc++.so.6 [0x32936bce43]
    [7] /usr/lib64/libstdc++.so.6 [0x32936bcf2a]
    [8] /usr/lib64/libstdc++.so.6::operator new(unsigned long) + 0x79 [0x32936bd239]
    [9] /usr/lib64/libstdc++.so.6::operator new[](unsigned long) + 0x9 [0x32936bd2f9]
    [10] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::instrumentSCF::analyze(std::vector<instrumentLIB, std::allocator<instrumentLIB> >&) + 0x441 [0x50c223]
    [11] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw [0x46ec1c]
    [12] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::isQualityScaffoldMergingEdge(EdgeCGW_T*, NodeCGW_T*, NodeCGW_T*, ScaffoldInstrumenter*, VarArrayType*, double, double) + 0x1e4 [0x46fbbc]
    [13] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::ExamineSEdgeForUsability_Interleaved(EdgeCGW_T*, InterleavingSpec*, NodeCGW_T*, NodeCGW_T*) + 0x12a [0x475fb0]
    [14] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw [0x46e317]
    [15] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw [0x46e524]
    [16] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::MergeScaffoldsAggressive(ScaffoldGraphT*, char*, int) + 0x2a6 [0x46e820]
    [17] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::(null) + 0x242d [0x4376ff]
    [18] /lib64/libc.so.6::(null) + 0xf4 [0x370361d994]
    [19] /scratch/BluGen/wgs-8.0/Linux-amd64/bin/cgw::(null) + 0x149 [0x435059]

I've recompiled with debugging enabled and rerun the job, but the resulting cgw.out is >700 MB; I can make it available via HTTP if that would be useful.
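Since the full cgw.out is too large to post, a short script along the following lines could pull out just the scaffold-merge attempts for inspection. This is only a sketch: the regular expression is inferred from the single isQualityScaffoldMergingEdge() line quoted above, so other lines or other cgw versions may be formatted differently, and the function and field names are hypothetical.

```python
import re
import sys

# Pattern inferred from the one merge line quoted in this message (assumption);
# it is not taken from the cgw source.
MERGE_RE = re.compile(
    r"Merge scaffolds (\d+) \(([\d.]+)bp\) and (\d+) \(([\d.]+)bp\): "
    r"gap (-?[\d.]+)bp \+- ([\d.]+)bp weight (\d+) (\w+) edge"
)


def parse_merge_line(line):
    """Return the fields of a merge-attempt log line, or None if it does not match."""
    m = MERGE_RE.search(line)
    if m is None:
        return None
    return {
        "scaffold_a": int(m.group(1)),
        "length_a": float(m.group(2)),
        "scaffold_b": int(m.group(3)),
        "length_b": float(m.group(4)),
        "gap": float(m.group(5)),
        "gap_stddev": float(m.group(6)),
        "weight": int(m.group(7)),
        "orientation": m.group(8),
    }


if __name__ == "__main__":
    # Usage: python merge_lines.py cgw.out
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for line in handle:
                rec = parse_merge_line(line)
                if rec is not None:
                    print(rec["scaffold_a"], rec["length_a"],
                          rec["scaffold_b"], rec["length_b"],
                          rec["gap"], rec["weight"], sep="\t")
```

Sorting the resulting table by scaffold length or by gap size should make outliers, such as the 166551122.0bp scaffold and the -802155.4bp gap in the line above, easy to spot.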
The assembler was compiled and run under CentOS 5.6 (gcc 4.1.2, glibc 2.5.58), with 32 GB RAM available (we have higher-memory machines available if this may be a factor). The gkpStore.info for the assembly is as follows:

    libIID  bgnIID   endIID   active   deleted  mated   totLen      clrLen      libName
    0       1        2789548  2789548  0        640132  1631468328  1380548976  GLOBAL
    0       0        0        0        0        0       0           0           LegacyUnmatedReads
    1       1        183478   183478   0        176410  179564258   179564258   plasmids
    2       183479   658786   475308   0        463722  391222096   391222096   fosmids
    3       658787   1208799  550013   0        0       146236541   146236541   FLX
    4       1208800  2760713  1551914  0        0       850798225   599878873   XLR
    5       2760714  2789548  28835    0        0       63647208    63647208    PacBio

So there is a rather low proportion of mate-pairs compared to reads from fragment libraries. The PacBio data was intended to address this, but due to DNA quality it has only been possible to generate CCS reads so far.

The genome in question is highly repetitive (~70%), with many nested transposons, and I believe our existing assembly is 'over-scaffolded' due to misplaced mate-pairs in repeat regions. I therefore intend to increase cgwMinMergeWeight, and probably cgwMergeFilterLevel, to increase the stringency of the scaffolding; as a first pass, however, I was running with the default settings.

Any suggestions as to whether this crash is due to a bug or to the nature of the genome/combination of data types?

Best Regards,
James

--
Dr. James Abbott
Lead Bioinformatician
Bioinformatics Support Service
Imperial College, London
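For reference, the mate-pair proportion mentioned above can be checked directly from the library table with a few lines of Python. This is a sketch that assumes the "mated" column counts reads (not pairs) and uses the "active" and "mated" values copied from the table; the function names are hypothetical.

```python
# (libName, active, mated) copied from the per-library rows of the table
# in this message; the GLOBAL row is recomputed below as a cross-check.
LIBRARIES = [
    ("plasmids", 183478, 176410),
    ("fosmids", 475308, 463722),
    ("FLX", 550013, 0),
    ("XLR", 1551914, 0),
    ("PacBio", 28835, 0),
]


def mated_fraction(active, mated):
    """Fraction of active reads that are mated (assumes 'mated' counts reads)."""
    return mated / active if active else 0.0


def summarize(libraries):
    """Return (total active reads, total mated reads, overall mated fraction)."""
    total_active = sum(active for _, active, _ in libraries)
    total_mated = sum(mated for _, _, mated in libraries)
    return total_active, total_mated, mated_fraction(total_active, total_mated)


if __name__ == "__main__":
    for name, active, mated in LIBRARIES:
        print(f"{name:10s} {mated_fraction(active, mated):6.1%}")
    total_active, total_mated, overall = summarize(LIBRARIES)
    print(f"{'overall':10s} {overall:6.1%} ({total_mated} of {total_active} reads)")
```

The per-library totals add up to the GLOBAL row (2789548 active, 640132 mated), so roughly 23% of the reads are mated, all of them from the plasmid and fosmid libraries.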