From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-15 16:50:57
On 2017/02/15 23:32, Tom Hughes wrote:
> On 15/02/17 13:34, ISHIKAWA,chiaki wrote:
>
>> When I tried to run mozilla thunderbird mail client, which I create
>> under Debian GNU/Linux 64-bit,
>> under valgrind, valgrind mysteriously crashed and gdb was not much help.
>
> Well valgrind almost never "mysteriously crashes".
>
> In fact it is usually very verbose when anything goes wrong.
>
Hi,
Thank you for your comment.
That is what I thought back in 2015 too, and I exchanged a few e-mails
with Julian Seward about the issue at the time. But we gave up on it,
because the system printed "Segmentation error" without a usable trace
of anything at all (!), which was quite surprising. We traced signals
and everything else we could think of using various options passed to
valgrind, and even traced the system calls valgrind was issuing using
strace.
> So the first thing you should do is to tell us in detail exactly what it
> said when it stopped.
Since gdb and the various traces enabled by the options passed to
valgrind were useless (just as in 2015), I traced the system calls
issued by valgrind.
There was an mmap call before something went wrong; then signal 11 was
raised, I saw SIGSEGV delivered a dozen times or so, and voilà:
"Segmentation error" back at the shell level.
gdb does not print anything useful at all...
>
>> This happened under the latest 4.8.x kernel which Debian distributed as
>> part of its testing repository.
>>
>> I tried a few things but subsequently reverted to kernel 3.19.5.
>> Now thunderbird under valgrind works (!).
>
> So most likely this is just a new system call that valgrind doesn't
> handle or something, in which case valgrind will have reported all the
> details needed to fix it when it stopped.
That was what I (and Julian Seward) hoped back in 2015, but valgrind did
not report anything. From the debugging I did over the last few months,
I concluded that the problem I face now is as perplexing as the one in
2015, so I took the easy course: I decided it would be easier to find
out whether there is ANYBODY who runs a big program under valgrind on
the official Debian GNU/Linux kernel (which I doubt, based on my
experience). Also, Julian Seward mentioned back in 2015 that valgrind
could grok thunderbird under Fedora, so I thought it would be easier to
find someone running 64-bit thunderbird on 64-bit CentOS or Fedora and
compare the configurations to figure out what is causing the problem
under Debian's kernel.
BTW, the following is what I found back in 2015.
------------------------+------------------------------------------------
Kernel version          | valgrind + C-C TB works or not
------------------------+------------------------------------------------
Debian 3.2.0...         | works <--- base Debian version for wheezy
------------------------+------------------------------------------------
self-compiled 3.9.0...  | works
------------------------+------------------------------------------------
self-compiled 3.12.40   | works
------------------------+------------------------------------------------
self-compiled 3.13.11   | works
------------------------+------------------------------------------------
self-compiled 3.14.38   | ??? <--- pristine kernel hit the problem
                        | mentioned in the following patch and panicked.
                        | open source is wonderful when it works, but
                        | when it does not
                        | http://lkml.iu.edu/hypermail/linux/kernel/1407.3/04296.html
------------------------+------------------------------------------------
self-compiled 3.15.9    | ??? <--- vanilla kernel could not bring up X,
                        | probably for the same reason as above. X did
                        | not start within a few minutes, so I gave up.
                        | I did not see a kernel panic, though.
------------------------+------------------------------------------------
Debian backport 3.16 ...| Segmentation fault! [Why? I have no idea.]
------------------------+------------------------------------------------
Vanilla 3.19.5          | works (worked back in 2015 and now I have to
                        | revert to it...)
------------------------+------------------------------------------------
This time around, I tried to see whether I could do something similar
using the latest kernel 4.9.x (vanilla version), hoping that valgrind
might run thunderbird under it without a segmentation error. But the
VirtualBox guest utilities, such as the graphics driver that supports
dynamic resizing, do not support such a recent kernel as a guest at
all, and I had to give it up.
(Yes, I am running Debian GNU/Linux inside VirtualBox.)
Sorry, I was so tired of debugging, and the current issue looked so
much like the mysterious problem back in 2015, that I did not bother to
pursue the issue in valgrind per se, but rather wanted to focus on the
kernel issue for now.
I am currently running the |make mozmill| test of thunderbird, which now
takes about 48 hours. Once it is over, I will switch the kernel, gather
the gdb stack trace when valgrind crashes (useless as it is), and also
show the last part of the strace output (system call trace), which
again is not very revealing.
I am sure you will be perplexed as to why on earth valgrind crashes when
we try to run thunderbird under it on Debian's kernel. [I *DID* notice
that the Debian kernel differs in some ways; it enables stack
protection, for starters. I am not sure whether that affects valgrind's
operation.]
>
> Tom
>
TIA