From: karim b. <kar...@gm...> - 2006-09-16 17:03:55

Hi,

Valgrind 3.2.0 + memcheck crashes with a code:

...
==9576==    by 0x42DE696: PyEval_EvalFrame (ceval.c:2163)
==9576==    by 0x42DF128: PyEval_EvalCodeEx (ceval.c:2736)
==9576==    by 0x42E0932: fast_function (ceval.c:3651)
HepMcParticleLink INFO cptr: Using TruthEvent as McEventCollection key for this job
--9576-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--9576-- si_code=80; Faulting address: 0x0; sp: 0x6278DD2C

valgrind: the 'impossible' happened:
   Killed by fatal signal
==9576==    at 0x3802C2CE: vgPlain_use_CF_info (debuginfo.c:830)
==9576==    by 0x3801EFA2: vgPlain_get_StackTrace2 (m_stacktrace.c:158)
==9576==    by 0x3801F075: vgPlain_get_StackTrace (m_stacktrace.c:378)
==9576==    by 0x38011BCA: vgPlain_record_ExeContext (m_execontext.c:202)
==9576==    by 0x380011FC: create_MC_Chunk (mc_malloc_wrappers.c:130)
==9576==    by 0x38001369: vgMemCheck___builtin_new (mc_malloc_wrappers.c:192)
==9576==    by 0x3802E629: do_client_request (scheduler.c:1158)
==9576==    by 0x3802E040: vgPlain_scheduler (scheduler.c:869)
==9576==    by 0x3803BE9C: thread_wrapper (syswrap-linux.c:87)
==9576==    by 0x3803BF58: run_a_thread_NORETURN (syswrap-linux.c:120)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
--9576-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--9576-- si_code=80; Faulting address: 0x0; sp: 0x6278D51C

valgrind: the 'impossible' happened:
   Killed by fatal signal
==9576==    at 0x3802C2CE: vgPlain_use_CF_info (debuginfo.c:830)
==9576==    by 0x3801EFA2: vgPlain_get_StackTrace2 (m_stacktrace.c:158)
==9576==    by 0x3801F075: vgPlain_get_StackTrace (m_stacktrace.c:378)
==9576==    by 0x3801F18A: vgPlain_get_and_pp_StackTrace (m_stacktrace.c:415)
==9576==    by 0x38012B32: pp_sched_status (m_libcassert.c:119)
==9576==    by 0x38012BBB: report_and_quit (m_libcassert.c:149)
==9576==    by 0x38012D53: panic (m_libcassert.c:210)
==9576==    by 0x38012D74: vgPlain_core_panic_at (m_libcassert.c:215)
==9576==    by 0x3801E837: sync_signalhandler (m_signals.c:1774)
==9576==    by 0x3801CDB9: calculate_SKSS_from_SCSS (m_signals.c:459)
==9576==    by 0x3801EFA2: vgPlain_get_StackTrace2 (m_stacktrace.c:158)
==9576==    by 0x3801F075: vgPlain_get_StackTrace (m_stacktrace.c:378)
==9576==    by 0x38011BCA: vgPlain_record_ExeContext (m_execontext.c:202)
==9576==    by 0x380011FC: create_MC_Chunk (mc_malloc_wrappers.c:130)
==9576==    by 0x38001369: vgMemCheck___builtin_new (mc_malloc_wrappers.c:192)
==9576==    by 0x3802E629: do_client_request (scheduler.c:1158)
==9576==    by 0x3802E040: vgPlain_scheduler (scheduler.c:869)
==9576==    by 0x3803BE9C: thread_wrapper (syswrap-linux.c:87)
==9576==    by 0x3803BF58: run_a_thread_NORETURN (syswrap-linux.c:120)
....

This is the first time I have seen this error. Is it a known problem?

Cheers

Karim

From: Julian S. <js...@ac...> - 2006-09-17 10:05:27

On Saturday 16 September 2006 18:03, karim bernardet wrote:
> Hi
>
> Valgrind 3.2.0 + memcheck crashes with a code :
>
> ...
> ==9576== by 0x42DE696: PyEval_EvalFrame (ceval.c:2163)
> ==9576== by 0x42DF128: PyEval_EvalCodeEx (ceval.c:2736)
> ==9576== by 0x42E0932: fast_function (ceval.c:3651)
> HepMcParticleLink INFO cptr: Using
> TruthEvent as McEventCollection key for this job
> --9576-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) -
> exiting
> --9576-- si_code=80; Faulting address: 0x0; sp: 0x6278DD2C

Show us the complete output of valgrind, not just the part where it
crashed.

J

From: Karim B. <kar...@gm...> - 2006-09-18 13:36:34

Julian Seward wrote:

>Please file a bug report as described at
>http://www.valgrind.org/support/bug_reports.html. That ensures
>the bug will not be forgotten about.
>
>There is something strange happening with stack unwinding on amd64
>here. What is needed is a way to reproduce the failure. Can you
>create a small test program which shows the problem, or at least
>make it easy to reproduce the problem somehow?

sorry it is not easy for me to find an easy way to reproduce this
problem (bug framework in HEP Physics)
But I have started the same job on a Xeon to see what happens

>J
>
>On Monday 18 September 2006 09:06, Karim Bernardet wrote:
>
>>Hi
>>
>>You can find the full logfile here :
>>
>>http://atlas-france.in2p3.fr/Activites/Informatique/TMP/ProdBranch/rel_6/RecFull.log.gz
>>
>>Cheers
>>
>>Karim
>>
>>Julian Seward wrote:
>>
>>>On Saturday 16 September 2006 18:03, karim bernardet wrote:
>>>
>>>>Hi
>>>>
>>>>Valgrind 3.2.0 + memcheck crashes with a code :
>>>>
>>>>...
>>>>==9576== by 0x42DE696: PyEval_EvalFrame (ceval.c:2163)
>>>>==9576== by 0x42DF128: PyEval_EvalCodeEx (ceval.c:2736)
>>>>==9576== by 0x42E0932: fast_function (ceval.c:3651)
>>>>HepMcParticleLink INFO cptr: Using
>>>>TruthEvent as McEventCollection key for this job
>>>>--9576-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV)
>>>>- exiting
>>>>--9576-- si_code=80; Faulting address: 0x0; sp: 0x6278DD2C
>>>
>>>Show us the complete output of valgrind, not just the part where it
>>>crashed.
>>>
>>>J
>>>
>>>-------------------------------------------------------------------------
>>>Using Tomcat but need to do more? Need to support web services, security?
>>>Get stuff done quickly with pre-integrated technology to make your job
>>>easier Download IBM WebSphere Application Server v.1.0.1 based on Apache
>>>Geronimo
>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
>>>_______________________________________________
>>>Valgrind-users mailing list
>>>Val...@li...
>>>https://lists.sourceforge.net/lists/listinfo/valgrind-users

From: Karim B. <kar...@gm...> - 2006-09-18 08:10:16

Hi

You can find the full logfile here :

http://atlas-france.in2p3.fr/Activites/Informatique/TMP/ProdBranch/rel_6/RecFull.log.gz

Cheers

Karim

Julian Seward wrote:

>Show us the complete output of valgrind, not just the part where it
>crashed.
>
>J

From: Julian S. <js...@ac...> - 2006-09-18 10:04:44

Please file a bug report as described at
http://www.valgrind.org/support/bug_reports.html. That ensures
the bug will not be forgotten about.

There is something strange happening with stack unwinding on amd64
here. What is needed is a way to reproduce the failure. Can you
create a small test program which shows the problem, or at least
make it easy to reproduce the problem somehow?

J

On Monday 18 September 2006 09:06, Karim Bernardet wrote:
> Hi
>
> You can find the full logfile here :
>
> http://atlas-france.in2p3.fr/Activites/Informatique/TMP/ProdBranch/rel_6/RecFull.log.gz
>
> Cheers
>
> Karim

From: Karim B. <kar...@gm...> - 2006-09-19 12:23:09

Karim Bernardet wrote:

>Julian Seward wrote:
>
>>Please file a bug report as described at
>>http://www.valgrind.org/support/bug_reports.html. That ensures
>>the bug will not be forgotten about.
>>
>>There is something strange happening with stack unwinding on amd64
>>here. What is needed is a way to reproduce the failure. Can you
>>create a small test program which shows the problem, or at least
>>make it easy to reproduce the problem somehow?
>
>sorry it is not easy for me to find an easy way to reproduce this
>problem (bug framework in HEP Physics)
>But I have started the same job on a Xeon to see what happens

Running the same thing on an

Intel(R) Xeon(TM) CPU 2.80GHz

leads to a crash too:

http://atlas-france.in2p3.fr/Activites/Informatique/TMP/ProdBranch/rel_2/RecValgIDCALO.log.gz

Karim

From: Julian S. <js...@ac...> - 2006-09-19 12:36:33

It's hard to figure out what's going on here, since (1) you have
a whole bunch of processes running, and (2) your program is based
on or around Python, which doesn't help.

Looking at the log, it seems most likely to me that this is caused
by some errors that memcheck reports in your program. The difficulty
is that Python's garbage collector tends to confuse memcheck. However,
I'd say that errors like this one

==4380== Invalid read of size 4
==4380==    at 0x42AD6D0: PyObject_Free (obmalloc.c:735)
==4380==    by 0x429D8CA: list_dealloc (listobject.c:264)
==4380==    by 0x42B9E31: mro_internal (typeobject.c:1309)
==4380==    by 0x42BB406: PyType_Ready (typeobject.c:3201)
==4380==    by 0x42BB798: PyType_Ready (typeobject.c:3153)
==4380==    by 0x42ACA6D: _Py_ReadyTypes (object.c:1805)
==4380==    by 0x4309F5F: Py_InitializeEx (pythonrun.c:167)
==4380==    by 0x430A449: Py_Initialize (pythonrun.c:283)
==4380==    by 0x431312D: Py_Main (main.c:418)
==4380==    by 0x80486B9: main (python.c:23)
==4380==  Address 0x63F0010 is 5,576 bytes inside a block of size 8,608 free'd
==4380==    at 0x401B277: free (vg_replace_malloc.c:233)
==4380==    by 0x571EF6E: G__free_struct_upto (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCint.so)
==4380==    by 0x56AE5AD: G__scratch_all (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCint.so)
==4380==    by 0x56DAB4A: G__main (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCint.so)
==4380==    by 0x56DA626: G__init_cint (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCint.so)
==4380==    by 0x505A688: TCint::ResetAll() (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x50599ED: TCint::TCint(char const*, char const*) (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x4FFC1E7: TROOT::TROOT(char const*, char const*, void (**)()) (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x4FFA2C9: ROOT::GetROOT() (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x502217A: TTimer::Reset() (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x5021A3D: TTimer::TTimer(long, bool) (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x5022535: __static_initialization_and_destruction_0(int, int) (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x502273F: _GLOBAL__I_gSingleShotCleaner (in /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x5402324: (within /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x4F7A530: (within /afs/cern.ch/sw/lcg/external/root/5.10.00e/slc3_ia32_gcc323/root/lib/libCore.so)
==4380==    by 0x400CB10: _dl_init (in /lib/ld-2.3.2.so)
==4380==    by 0x4000C94: (within /lib/ld-2.3.2.so)

(that is, reading freed memory) are almost certainly errors in your
code, and you should fix them.

J

On Tuesday 19 September 2006 13:19, Karim Bernardet wrote:
> Running the same thing on a
>
> Intel(R) Xeon(TM) CPU 2.80GHz
>
> leads to a crash too
>
> http://atlas-france.in2p3.fr/Activites/Informatique/TMP/ProdBranch/rel_2/RecValgIDCALO.log.gz
>
> Karim

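For readers unfamiliar with the "Invalid read ... inside a block ... free'd" report discussed in the last message, here is a minimal sketch in C of the general pattern (reading a heap block after it has been freed). It is not taken from the ATLAS, ROOT or Python code in the logs; the function name read_slot and the constant SLOTS are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define SLOTS 8   /* hypothetical block size, chosen only for illustration */

/* Hypothetical helper: fills a heap block with 0, 10, 20, ... and returns
 * the value in slot i. When read_after_free is nonzero the block is freed
 * BEFORE the read, which is undefined behaviour and the kind of access
 * memcheck describes as an invalid read inside a free'd block. */
int read_slot(int i, int read_after_free)
{
    assert(i >= 0 && i < SLOTS);

    int *block = malloc(SLOTS * sizeof *block);
    if (block == NULL)
        return -1;

    for (int k = 0; k < SLOTS; k++)
        block[k] = k * 10;

    int value;
    if (read_after_free) {
        free(block);
        value = block[i];   /* BUG: the block has already been released */
    } else {
        value = block[i];   /* correct order: read first ... */
        free(block);        /* ... then release the block */
    }
    return value;
}
```

To try it, call read_slot(5, 1) from a small main(), build with debug information (for example gcc -g -O0) and run the binary under valgrind. Assuming a 4-byte int, the report should resemble the one quoted above: an invalid read of size 4, followed by the stack that freed the block. Calling read_slot(5, 0) instead should run cleanly.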