From: John R. <jr...@bi...> - 2022-09-09 02:59:02
>> 1. Describe the environment completely.

Also: Any kind of threading (pthreads, or shm_open, or mmap(,,,MAP_SHARED,,))
must be mentioned explicitly. Multiple execution contexts which access
the same address space instance are a significant complicating factor.
If threading is involved, then try using "valgrind --tool=drd ..." or
--tool=helgrind, because those tools specifically target detecting race
conditions and other synchronization errors, much like --tool=memcheck
[the default tool when no --tool= is mentioned] targets errors involving
malloc() and free(), uninitialized variables, etc.

>> 4. Walk before attempting to run.
>> Did you try a simple example? Write a half-page program with 5 subroutines,
>> each of which calls the next one, and the last one sends SIGABRT to the process.
>> Does the .core file when run under valgrind give the correct traceback using gdb?

Specifically: apply valgrind to the small program which causes a deliberate
SIGABRT, and get a core file. Does gdb give the correct traceback for that
core file? If not, then you have an ideal test case for filing a bug report
against valgrind because even the simple core file is bad. If gdb does give
a correct traceback for the simple core file, then you have to keep looking
for the source of the problem on your larger program.

>> 5. (Learn and) Use the built-in tools where possible.
>> Run the process interactively, invoking valgrind with "--vgdb-error=0",
>> and giving the debugger command "(gdb) continue" after establishing
>> connectivity between vgdb and the process.
>> See the valgrind manual, section 3.2.9 "vgdb command line options".
>> When the SIGABRT happens, then vgdb will allow you to use all the ordinary
>> gdb commands to get a backtrace, go up and down the stack, examine
>> variables and other memory, run
>>    (gdb) info proc
>>    (gdb) shell cat /proc/$PID/maps
>> to see exactly the layout of process memory, etc.
>> There are also special commands to access valgrind functionality
>> interactively, such as checking for memory leaks.
>>
>
> I already explained why I don't want / can't use the interactive gdb. I'm
> aware of the option, I've used it before, but in this case it's not very
> practical.

The gdb process does not *have* to be run interactively; it just takes
more work and patience to run non-interactively. Run
"valgrind --vgdb-error=0 ..." and notice the last part of the printed
instructions:

   and then give GDB the following command
==215935==   target remote | /path/to/libexec/valgrind/../../bin/vgdb --pid=215935
==215935== --pid is optional if only one valgrind process is running

So if there is only one valgrind process, then you do not need to know
the pid. Thus you can run gdb with re-directed stdin/stdout/stderr, or
perhaps use the -x command-line option. This allows a static, pre-scripted
list of gdb commands; it may require a few iterations to get a good debug
script. (Try the commands using the trivial SIGABRT case!) Also get the
full gdb manual (more than 800 pages) and look at the
"thread apply all ..." and "frame apply all ..." commands.

It may be possible to perform some interactive "reconnaissance" to suggest
good things for the script to try. Using --vgdb-error=0, put a breakpoint
on a likely location for the error (or shortly before the error), and look
around. In the logged traceback:

   TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 902, PID: 536049)
   (ExceptionalCondition+0x98)[0x8f5cec]
   (+0x57a574)[0x682574]
   (+0x579edc)[0x681edc]
   (ReorderBufferAddNewTupleCids+0x60)[0x6864dc]
   (SnapBuildProcessNewCid+0x94)[0x68b6a4]

any of those named locations, or shortly before them, might be a good spot.
When execution stops at any one of the breakpoints, then look around and
see if you can find clues about "prev_first_lsn < cur_txn->first_lsn"
even though the error has not yet occurred.
Perhaps this will help identify location(s) that might be closer to the
actual error when it does happen. This might suggest commands for the
non-interactive gdb debugging script.