From: mathog <ma...@ca...> - 2015-04-29 18:19:41
We just received a bunch of Illumina sequence which has been unpacked into a single fastq file of 217610498050 bytes. It has not been cleaned up in any way, so it has lots of N's and many entries which failed the "chastity filter". Tried to count the tuples on it with this:

  time ~/wgs*/wgs-8.1/Linux-amd64/bin/meryl -v -B -m 17 -C \
    -s 15659_all.fastq -threads 40 -o killme.table

and it did this (sorry about the wrap in the backtrace):

  REALLOC len=4194304 from 4194304 to 8388608
  REALLOC len=8388608 from 8388608 to 16777216
  REALLOC len=16777216 from 16777216 to 33554432
  REALLOC len=33554432 from 33554432 to 67108864
  REALLOC len=67108864 from 67108864 to 134217728

  Failed with 'Segmentation fault'

The backtrace it emitted was:

  [0] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x40d477]
  [1] /lib64/libpthread.so.0() [0x353d40f710]
  [2] /lib64/libc.so.6::(null) + 0x15b [0x353cc897cb]
  [3] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::fastqFile::constructIndex() + 0x551 [0x430e61]
  [4] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::fastqFile::fastqFile(char const*) + 0x43 [0x431323]
  [5] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::fastqFile::openFile(char const*) + 0x109 [0x431499]
  [6] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::seqFactory::openFile(char const*) + 0x3c [0x42ca8c]
  [7] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::seqStream::seqStream(char const*) + 0x42 [0x42d892]
  [8] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::prepareBatch(merylArgs*) + 0xe2 [0x41d882]
  [9] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::build(merylArgs*) + 0x67d [0x42110d]
  [10] /home/mathog/wgs_project/wgs-8.1/Linux-amd64/bin/meryl::(null) + 0x158 [0x409578]

This crash happened after 4 minutes and a few seconds had elapsed.

Have tried it with several other versions of meryl, including from a wgs built from trunk just this morning, and it always seems to crash the same way and at about the same place (as judged by run time). The system has 529G of memory which I would have assumed is sufficient.

The instructions for meryl are pretty light, so perhaps I missed some other switch which should be added to the command line? Other suggestions?

Thanks,

David Mathog
ma...@ca...
Manager, Sequence Analysis Facility, Biology Division, Caltech
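
Since the crash is in fastqFile::constructIndex(), it may help to know how many records the file holds and how many of them contain N's or failed the chastity filter. The following is a minimal Python sketch for scanning the file once; it is not part of meryl or wgs. The helper name `summarize_fastq` is hypothetical, and the code assumes strict 4-line FASTQ records and CASAVA 1.8+ style headers, where the second colon-separated field after the space is `Y` for reads that failed the filter.

```python
def summarize_fastq(handle):
    """Return (total records, reads containing N, reads flagged as filtered).

    Assumes 4-line FASTQ records (header, sequence, '+', quality) and,
    for the filter count, CASAVA 1.8+ headers such as
    '@<id> <read>:<Y|N>:<control>:<index>'. Both are assumptions.
    """
    total = with_n = failed = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between or after records
        seq = handle.readline()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored
        total += 1
        if "N" in seq.upper():
            with_n += 1
        # The filter flag lives in the part of the header after the first space.
        parts = header.rstrip("\n").split(" ", 1)
        if len(parts) == 2:
            fields = parts[1].split(":")
            if len(fields) >= 2 and fields[1] == "Y":
                failed += 1
    return total, with_n, failed
```

It could be run as `with open("15659_all.fastq") as fh: print(summarize_fastq(fh))`. On a file of this size a single pass in Python will be slow, so this is only meant as a sanity check on the record count, not as a replacement for meryl.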