#103 Tide core dump from string operation

post v2.0
open
None
2014-03-20
2014-01-22
No

I have been running a script that fails occasionally with the following error:

Read 221326 targets from tide-malaria/tide-malaria/tide-index.peptides.target.txt.
Printed 221242 decoys.
WARNING: Unexpected parameter file option 'nmod-peptide-mods-spec'
INFO: Beginning tide-index.
INFO: Writing results to output directory 'tide-malaria/00/01/decoy0001.idx-out'.
INFO: CPU: n023.grid.gs.washington.edu
INFO: Mon Jan 13 12:59:15 PST 2014
INFO: Running tide-index...
INFO: Writing results to output directory 'tide-malaria/00/01/decoy0001.index'.
INFO: Reading tide-malaria/00/01/decoy0001.fa and computing unmodified peptides...
INFO: Reading proteins
INFO: Wrote 100000 peptides
INFO: Wrote 200000 peptides
INFO: Computing modified peptides...
[libprotobuf FATAL /net/noble/vol1/home/noble/proj/crux/trunk/src/c/tide/peptide_mods3.cc:375] CHECK failed: reader_.OK():
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: reader_.OK():
/net/gs/vol4/sge/n001/noble/spool/n023/job_scripts/8893665: line 14: 26545 Aborted (core dumped)

If I re-run the same job it usually succeeds. Here is the actual command line:

/net/noble/vol1/home/noble/proj/crux/trunk/src/c/crux tide-index --parameter-file tide-malaria/tide-malaria.param.txt --output-dir tide-malaria/00/01/decoy0001.idx-out tide-malaria/00/01/decoy0001.fa tide-malaria/00/01/decoy0001.index

The script that was submitted is attached, along with the log file that was produced.

This is the backtrace I get from the core file:

0 0x00000038ac4328a5 in raise () from /lib64/libc.so.6

1 0x00000038ac434085 in abort () from /lib64/libc.so.6

2 0x00002b366a5eda75 in __gnu_cxx::__verbose_terminate_handler() () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95

3 0x00002b366a5ebbe6 in __cxxabiv1::__terminate(void (*)()) () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38

4 0x00002b366a5ebc13 in std::terminate() () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48

5 0x00002b366a5ebe3e in __cxa_throw () at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:84

6 0x00002b366a640e41 in std::__throw_out_of_range(char const*) () at ../../../../../libstdc++-v3/src/c++11/functexcept.cc:80

7 0x00002b366a64c283 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::substr(unsigned long, unsigned long) const () at /usr/src/gcc-4.8.1/build/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:324

8 0x000000000092e2f4 in TideIndexApplication::main(int, char**) ()

9 0x000000000089b43f in CruxApplicationList::main(int, char**) ()

It seems that the problem is starting from within crux; I think a string type is being assigned an invalid value. It might help to build crux w/ -gstabs+, so that line numbers are available within gdb.

2 Attachments


Discussion

  • William S Noble

    William S Noble - 2014-02-20

    I have just re-created the directory where these files reside. It can be found under /net/noble/vol1/home/noble/proj/crux-projects/2012-fdr-psm/results/2013-12-18-unified/tide-malaria.

    This time I only ran 4 jobs, and none of them failed.

    Bill

     
  • Kaipo

    Kaipo - 2014-02-25

    Valgrind results are attached. There are a few memory leaks, but nothing serious (no invalid reads or writes).
    I guess the error is coming from TideIndexApplication.cpp:308:

    string pep_str = protein->residues().substr(location.pos(), peptide->length());

    The error indicates that location.pos() > protein->residues().length(), though I am not sure how exactly that is possible. If we can reproduce it, we could try putting some print statements right before this line to help debug.
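To illustrate the condition described above: `std::string::substr(pos, len)` throws `std::out_of_range` only when `pos` is greater than the string's size. A `pos` equal to the size returns an empty string, and an oversized `len` is silently clamped. The following is a minimal sketch, not code from crux; the helper name `substr_throws` is hypothetical and the residue strings are made up.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of crux): attempts residues.substr(pos, len)
// and reports whether the call threw std::out_of_range.
bool substr_throws(const std::string& residues, std::size_t pos, std::size_t len) {
  try {
    (void)residues.substr(pos, len);  // throws only if pos > residues.size()
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

With a 7-residue string, `substr_throws("PEPTIDE", 7, 3)` is false, while `substr_throws("PEPTIDE", 8, 3)` is true. So the backtrace is consistent with a start position past the end of the protein sequence, not with a peptide that merely runs off the end.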

     
    • William S Noble

      William S Noble - 2014-02-26

      If you send me a patch with the print statements, I'll see if I can run a
      bunch of jobs and try to re-create the error. I suggest having it just
      check whether location.pos() > protein->residues().length() and then print
      if that's the case.

      Bill
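The check suggested above could look roughly like the sketch below. It is not the patch that was later attached. The function name `checked_peptide` and the message format are assumptions, and plain `std::string` and `std::size_t` arguments stand in for `protein->residues()`, `location.pos()` and `peptide->length()`.

```cpp
#include <cstddef>
#include <iostream>
#include <string>

// Hypothetical guard (not the actual crux patch): extracts the peptide only
// when the start position lies within the protein. Otherwise it prints the
// offending values to stderr and returns false instead of letting substr throw.
bool checked_peptide(const std::string& residues, std::size_t pos,
                     std::size_t len, std::string* out) {
  if (pos > residues.size()) {
    std::cerr << "ERROR: peptide start " << pos
              << " exceeds protein length " << residues.size()
              << " (requested peptide length " << len << ")\n";
    return false;
  }
  *out = residues.substr(pos, len);
  return true;
}
```

Printing only on failure keeps the log quiet for the hundreds of thousands of peptides that index normally, which matters when the error shows up in only a few jobs.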


      Attachment: malaria.valgrind (26.8 kB; application/octet-stream)

      Status: open
      Created: Wed Jan 22, 2014 11:58 PM UTC by William S Noble
      Last Updated: Tue Feb 25, 2014 09:52 PM UTC
      Owner: Kaipo


  • Kaipo

    Kaipo - 2014-02-26

    Here is a patch

     
  • William S Noble

    William S Noble - 2014-03-20

    Kaipo and I went over the latest output from this set of runs, and it now appears that the failure is happening during the indexing stage. I suspect this is an issue related to disk instability. I will re-run the searches and see if the error is gone.

     
  • William S Noble

    William S Noble - 2014-03-20
    • assigned_to: Kaipo --> William S Noble
    • Milestone: Crux v2.0 --> post v2.0
     
