From: Ivo R. <iv...@iv...> - 2017-02-26 13:53:33
|
2017-02-26 6:04 GMT+01:00 Austin English <aus...@gm...>:
> > Indeed, svn related commands must be replaced with git ones.
> > Can you propose a patch? I assume instead of SVN revisions
> > we need to get git commit ids.
> > I will have a look at the other Valgrind parts.
>
> Yeah, I'll look this week probably.
>
> Had I known a git migration was imminent, I would've added native git
> support to begin with ;).
The SVN->GIT migration was decided at FOSDEM this month:
https://fosdem.org/2017/schedule/track/valgrind/
https://fosdem.org/2017/schedule/event/valgrind_hackaton/ [see
video recording]
I had no idea beforehand that it would happen...
I.
|
|
From: Paul S. <pa...@ma...> - 2017-02-25 14:00:39
|
On Sat, 2017-02-25 at 10:56 +0100, Ivo Raisr wrote:
> Indeed, svn related commands must be replaced with git ones.

FYI I use something like this to generate an ID for my version strings:

    sha=$(git rev-parse --short=10 HEAD)

Or in GNU make:

    SHA := $(shell git rev-parse --short=10 HEAD)

This prints the first 10 characters of the SHA (or more, if that's not
unique), which is quite sufficient. For a repo with far fewer changes (our
repos get up to 10 or commits to master HEAD per day almost every day)
you might be able to get away with a smaller number like 7 or so, if
you'd prefer.
|
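Editorial sketch: the `git rev-parse --short=10 HEAD` call above could feed a version string roughly as follows. This is a minimal Python illustration, not Valgrind's actual build logic. The function names `short_git_sha` and `version_string`, the `GIT-` tag, and the `"unknown"` fallback are all assumptions made for the example.

```python
import subprocess


def short_git_sha(repo_dir=".", length=10):
    """Return an abbreviated commit id for HEAD, or "unknown" if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short={}".format(length), "HEAD"],
            cwd=repo_dir,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, repo_dir does not exist,
        # or repo_dir is not inside a git work tree.
        return "unknown"
    return result.stdout.strip() or "unknown"


def version_string(base, sha):
    """Join a base version and a commit id (hypothetical format)."""
    return "{}.GIT-{}".format(base, sha)
```

For example, `version_string("3.13.0", short_git_sha())` would yield a string carrying the commit id when run inside a checkout, and one ending in `unknown` elsewhere.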
|
From: Ivo R. <iv...@iv...> - 2017-02-25 09:57:04
|
2017-02-25 1:53 GMT+01:00 Austin English <aus...@gm...>:
> Hi Ivo,
>
> I'm very excited for the git move! I tested for
> https://bugs.kde.org/show_bug.cgi?id=352395, which I rely upon to keep
> track of precisely which version of Valgrind I'm testing with
> (especially useful for historical logs).
>
> With git, this is broken:
> austin@austin2:/tmp/valgrind$ ./vg-in-place -v --version
> valgrind-3.13.0.SVN-unknown-vex-unknown
>
> That said, checking out / compiling went great! ;)

Hi Austin,

Thank you for your feedback!
Indeed, svn related commands must be replaced with git ones.
Can you propose a patch? I assume instead of SVN revisions
we need to get git commit ids.
I will have a look at the other Valgrind parts.

I.
|
|
From: Wuweijia <wuw...@hu...> - 2017-02-25 04:33:54
|
Hi all:
I am new to Valgrind. When I use Massif, I get two output files. One file shows the correct function names, but the other does not; it shows ??? in their place.
Can you tell me why, and how to resolve it? I want it to show the correct function names.
Environment:
CPU AARCH64
VERSION 3.12
BR
Owen
|
|
From: Mike L. <mik...@gm...> - 2017-02-25 02:55:53
|
Being unable to search through gmane for now, I turn to the users group.

I'd like some clarification regarding some functions I'm seeing in the
tool interface.

- track_new_mem_startup
- ....
- track_new_mem_mmap
- ...
- track_pre_mem_read
- track_pre_mem_write
- track_post_mem_write
- track_post_reg_write

These callbacks specifically refer to events within the *core*, right?
Does this mean that they bear no relevance to profiling the user
application? Am I correct that a "mem_write" callback only gets called
for a memory write that happens during translation, internal signal
handling, etc., along with the "new_mmap" calls?

If I were to instrument the *application under test* then I'd go through
the callback given in "basic_tool_funcs", correct?

I only ask because I just noticed these functions, and the wording in
the comments wasn't incredibly clear.

Thanks,
Mike
|
|
From: Ivo R. <iv...@iv...> - 2017-02-24 19:21:12
|
Dear Valgrind community,

We are pleased to announce an imminent migration of Valgrind sources
from existing Subversion SCM to modern git SCM, as discussed during
our FOSDEM 2017 Valgrind devroom.

What is going on now?
~~~~~~~~~~~~~~~~~
The migration has just started. We are now in beta testing stage.
We still use the official SVN Valgrind repository for our work until
the final migration step. If you have some patches ready now,
send them for review.
You can contribute to the migration process - read below.

What will be migrated:
~~~~~~~~~~~~~~~~~
Valgrind and VEX sources. Precisely sources available today under
svn://svn.valgrind.org/valgrind and svn://svn.valgrind.org/vex,
including all production release branches and tags.
Valgrind and VEX repos will be merged into one, so no more SVN externals.

Where I will find the new repo:
~~~~~~~~~~~~~~~~~~~~~~~
At sourceware.org. Precisely at:
git://sourceware.org/git/valgrind.git/
http://sourceware.org/git/valgrind.git/
Right now a snapshot of SVN sources as of 2017-02-21 is available
for you to test.

How the test migration was performed:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
See recipes at https://github.com/ivosh/valgrind-git-migration

What is the plan for the migration to go forward:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Test migration has been performed and initial tests were successful.
2. The test repo is now available to test and play for others with -
   see below for details.
3. Prepare www (website) and nightly build script changes and have
   them reviewed.
4. Proceed once 2+3 are successfully done.
5. Announce the final migration.
6. Completely eradicate contents in the GIT repository so the migration
   can start from scratch.
7. Switch SVN valgrind+vex repo readonly.
8. Perform the final migration to sourceware.org.
9. Enable email notifications from new git repo.
10. Push www and nightly script changes to the new repo.

What will not be migrated:
~~~~~~~~~~~~~~~~~~~~
- Valgrind www (website) repo. Not now, but later.
- Non production release branches and tags from old SVN Valgrind+VEX repos.
  If you need to preserve some other branches or tags, let us know:
  https://sourceware.org/git/?p=valgrind.git;a=heads
  https://sourceware.org/git/?p=valgrind.git;a=tags

I have a write access to existing SVN repo. What shall I do for the new one?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please contact Julian Seward. He will point you to specific instructions.

What will be my simple workflow in new git SCM?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Not much will be changed from the way we worked in SVN.
We still prepare patches, send them for review, have someone with
write access to push them.
A minimalistic workflow would be:
    git clone git://sourceware.org/git/valgrind.git/ valgrind
    edit/compile
    git status/add/show
    git pull origin/master
    build + test
    git commit
    [git push - if you have write access]
There are a lot of good tutorials on simple git workflows, so please
have a look. If you are using something more complicated, please share
with us and ideally send us a write up.

I would like to help with the migration.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Yes, please! Send us your positive and negative feedback. For example:
- It worked for me!
- This and that did not work for me...
- How do I do such and such thing now?
The test repository is there for you to play with. The contents will be
deleted before the final migration so no reason to worry about
potential mistakes. It is also quite likely that the contents will be
regenerated during the beta testing, to fix any problems found.

We also need a help documenting possible workflows. Especially when
preparing a release - we need to test and document how to work with
branches and releases.
|
|
From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-21 17:26:04
|
Sorry for top-posting, but thank you for the suggestions.

So far, I figured out that the maps are different under the 3.19.5,
4.7.0.1, and 4.9.6 versions of the linux kernel.

Also, I have figured out that the SIGSEGV problem is timing-related (a
race) under 4.7.0.1 (the worst kind of bug in terms of reproducibility).
I managed to attach to the thunderbird binary executing under valgrind
using --vgdb=y and --vgdb-error=0 and single step (or step over
functions) through thunderbird to figure out what kind of thunderbird
behavior may trigger the valgrind problem.
Then I noticed the SIGSEGV did not occur when I stepped through the code
(over the fork), while if I simply run the thunderbird code by "cont" all
the way, SIGSEGV occurs :-(

To sum up, under 4.7.0.1, the SIGSEGV seems to occur near the fork()
system call. Thunderbird invokes a small glxtest program which checks
for the graphics driver info (for debugging?). And fork() is reached
before SIGSEGV is observed under 4.7.0.1.
[I thought that I was homing in on a possible bug.]

But under 4.9.6, the SIGSEGV seems to occur way before this fork() is
executed and it is very difficult so far to figure out where the SIGSEGV
occurs.

Under 3.19.5, thunderbird runs under valgrind just fine.

From the way it goes, I will be able to post the logs with some results
from additional probes at the beginning of next week.

On 2017/02/20 8:30, John Reiser wrote:
> On 02/18/2017 11:38 PM, ISHIKAWA,chiaki wrote:
>> BTW, DOES ANYONE HAVE A GOOD IDEA ABOUT HOW TO CAPTURE the mapped
>> file, etc WHEN SIGSEGV happens? It is very dynamic and by the time I
>> am ready to type in shell commands, the child binary that experienced
>> it may be gone. Yes, I have not been able to figure out exactly which
>> process under the test
>> suite setup started by thunderbird (under valgrind) is experiencing a
>> difficulty.
>> I guess some clever hacking via gdb gets me started there?
>> BTW, valgrind's --gdb-* options are meant to debug the target under
>> valgrind, NOT the segfault of valgrind itself, correct?
>> [And the whole thing including valgrind works under kernel 3.19.5 and
>> not under later kernel drives me crazy.]
>
> This gdb command will stop execution and print a message when SIGSEGV
> happens:
> (gdb) handle SIGSEGV stop print
> When the SIGSEGV happens then you will have to focus keyboard input to
> that process.
> (The above 'handle' command is the default anyway, so if the automation
> for your
> test harness snatches control, then you still might not get a chance for
> manual input.)
> There is no way to ask of gdb, "Please run these commands upon SIGSEGV."
>
> You can write a script for the entire input to gdb: gdb -batch -x script
> -e executable
> (beware: it is very brittle) but gdb cannot switch its input stream
> (such as back and forth between the script and the terminal)
> while it is running. "gdb -batch -x script -e executable" might be your
> best option,
> but it will take some patience. There is no way for the script to
> check that gdb is waiting for input after SIGSEGV, so you just have to
> assume
> that the SIGSEGV is going to happen after your 'run' command in the input.
>
> Yes, valgrind's --gdb-* options are for debugging the target under
> valgrind,
> and are NOT for debugging valgrind itself.
>
>
> If you run "strace -f -o strace.out -e trace=execve valgrind
> --trace-children=yes ..."
> then the output in strace.out will tell you which process receives the
> SIGSEGV.
> The "-e trace=execve" is a filter which restricts tracing to execve only;
> otherwise the output will be very long because it contains every system
> call
> for every process.
>
|
|
From: John R. <jr...@bi...> - 2017-02-19 23:30:54
|
On 02/18/2017 11:38 PM, ISHIKAWA,chiaki wrote:
> BTW, DOES ANYONE HAVE A GOOD IDEA ABOUT HOW TO CAPTURE the mapped file, etc WHEN SIGSEGV happens? It is very dynamic and by the time I am ready to type in shell commands, the child binary that experienced it may be gone. Yes, I have not been able to figure out exactly which process under the test
> suite setup started by thunderbird (under valgrind) is experiencing a difficulty.
> I guess some clever hacking via gdb gets me started there?
> BTW, valgrind's --gdb-* options are meant to debug the target under valgrind, NOT the segfault of valgrind itself, correct?
> [And the whole thing including valgrind works under kernel 3.19.5 and not under later kernel drives me crazy.]

This gdb command will stop execution and print a message when SIGSEGV happens:
    (gdb) handle SIGSEGV stop print
When the SIGSEGV happens then you will have to focus keyboard input to that process.
(The above 'handle' command is the default anyway, so if the automation for your
test harness snatches control, then you still might not get a chance for manual input.)
There is no way to ask of gdb, "Please run these commands upon SIGSEGV."

You can write a script for the entire input to gdb:
    gdb -batch -x script -e executable
(beware: it is very brittle) but gdb cannot switch its input stream
(such as back and forth between the script and the terminal)
while it is running. "gdb -batch -x script -e executable" might be your best option,
but it will take some patience. There is no way for the script to
check that gdb is waiting for input after SIGSEGV, so you just have to assume
that the SIGSEGV is going to happen after your 'run' command in the input.

Yes, valgrind's --gdb-* options are for debugging the target under valgrind,
and are NOT for debugging valgrind itself.

If you run "strace -f -o strace.out -e trace=execve valgrind --trace-children=yes ..."
then the output in strace.out will tell you which process receives the SIGSEGV.
The "-e trace=execve" is a filter which restricts tracing to execve only;
otherwise the output will be very long because it contains every system call
for every process.

--
|
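Editorial sketch: once strace.out exists, finding the process that received the SIGSEGV can be automated. The Python below is a rough illustration that assumes the usual `strace -f -o` line layout (a pid, then either an `execve(...)` call or a `--- SIGSEGV {...} ---` signal line). The function name `find_segv_processes` is hypothetical, and calls split across "unfinished"/"resumed" lines are not handled.

```python
import re

# Matches signal lines such as: 3760 --- SIGSEGV {si_signo=SIGSEGV, ...} ---
SIGNAL_RE = re.compile(r"^(\d+)\s+--- SIGSEGV ")
# Matches call lines such as: 3760 execve("/usr/bin/foo", ["foo"], ...) = 0
EXECVE_RE = re.compile(r'^(\d+)\s+execve\("([^"]+)"')


def find_segv_processes(strace_lines):
    """Map each pid that received SIGSEGV to the last program it exec'ed.

    A child that forked but never called execve has no entry in the
    execve-only trace, so it is reported as "<unknown>".
    """
    last_exec = {}
    crashed = {}
    for line in strace_lines:
        match = EXECVE_RE.match(line)
        if match:
            # A failed execve (result -1) leaves the old program in place.
            if " = -1 " not in line:
                last_exec[match.group(1)] = match.group(2)
            continue
        match = SIGNAL_RE.match(line)
        if match:
            pid = match.group(1)
            crashed[pid] = last_exec.get(pid, "<unknown>")
    return crashed
```

Typical use would be `find_segv_processes(open("strace.out"))`, which returns a dictionary from pid to program path.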
|
From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-19 07:39:03
|
Hi,
Thank you again.
I will hopefully upload the requested info next week.
Here is what I can write down today.
What would be the appropriate upload service? [The data would be too
large for e-mail to the list.]
On 2017/02/19 7:32, John Reiser wrote:
> How many failures occur in 10 runs of thunderbird under valgrind?
10 times, i.e., all the time under Debian's newer stock kernel.
> How many failures occur in 10 runs if you reboot just before each run?
It never occurred to me to reboot the system before retrying.
I will check this next week (but given the tests I did by SWITCHING
kernel versions, rebooting into a different revision each time over the
last few months, I would say 10 times, i.e. all the time, but again let
me check.)
>
> Thunderbird is a user mail agent that uses interactive graphics.
> How many failures occur before the display window appears, and how many after?
There is one issue: I am seeing a failure of valgrind when I try to run
thunderbird test suite and the complicating factor here is aside from
the available user interaction through GUI under X windows, during the
execution of |make mozmill| test suite, there is a daemon that runs test
scripts and talks to the main TB binary via COM interface. [I stay away
from KB and mouse cursor during tests to avoid interfering with the test
suite run. I do this by invoking virtual X desktop using Xephyr: the
test suite run using Valgrind is done in that virtual desktop. If I
wanted to, I COULD interact with thunderbird's GUI via mouse explicitly.
I did this a few times when a bug in thunderbird or the test scripts made the
execution hang waiting for confirmation of a modal dialog, etc.]
From what I did, the crash occurs before the display window of the
tested thunderbird appears all the time [all the time when valgrind
printed the mysterious Segmentation error under the newer Debian kernel].
> Are the symptoms and frequency the same for a Radeon card as for NVidia?
> On the open-source NVidia driver versus the proprietary driver?
> In "dumb framebuffer" mode ("no" acceleration)?
> Please tell us which cards: "lspci -nn | grep VGA" or similar.
I am using Debian GNU/Linux inside
VirtualBox installed under Windows 10 as a platform to
develop and test thunderbird patches.
Debian GNU/Linux installed as the guest OS inside VirtualBox.
So the video graphics driver relevant here is the VirtualBox video
driver, I think, correct? (But there was a puzzling message in Xorg.0.log.
I will mention it in the answer to your second to last question.)
Under 3.19.5 kernel where the valgrind + thunderbird test suite works:
$ lspci -nn | grep VGA
00:02.0 VGA compatible controller [0300]: InnoTek Systemberatung GmbH
VirtualBox Graphics Adapter [80ee:beef]
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$
(InnoTek is the name of original virtualbox developer.)
I am not sure if I can remove the above virtualbox graphics adaptor and
revert to the plain VGA adaptor emulation done by VirtualBox, but let me
try.
> Are the symptoms and frequency the same for Firefox as for thunderbird?
I am not developing or creating patches for Firefox. Sorry.
> Are the symptoms and frequency the same for Chrome as for thunderbird?
Ditto.
Oh, you mean to ask whether I can run a very simple
valgrind firefox-binary (without any test harness involvement) under the
new kernel and see whether it works?
Then I can test it.
But Chrome. I have not even installed it before.
> Please present a histogram of the {mapped file, pc offset, instruction stream}
> when the SIGSEGV happens. [You should have at least 70 runs by now: 10 each
> for thunderbird plain, with reboot, other graphics card, other NVidia driver,
> dumb framebuffer, Firefox, Chrome.]
OK, I will gather the data (not sure what you mean by "histogram", but I
will gather what I think is relevant.)
10 each
for thunderbird plain,
with reboot [I will certainly reboot before the test run.]
x 10 times with the above InnoTek driver (built-in for VirtualBox).
[I am not sure if SIGSEGV happens under this setup.]
for thunderbird + test suite hookup.
I am quite certain that SIGSEGV happens under this setup.
BTW, DOES ANYONE HAVE A GOOD IDEA ABOUT HOW TO CAPTURE the mapped
file, etc WHEN SIGSEGV happens? It is very dynamic and by the time I am
ready to type in shell commands, the child binary that experienced it
may be gone. Yes, I have not been able to figure out exactly which
process under the test suite setup started by thunderbird (under
valgrind) is experiencing a difficulty.
I guess some clever hacking via gdb gets me started there?
BTW, valgrind's --gdb-* options are meant to debug the target under
valgrind, NOT the segfault of valgrind itself, correct?
[And the whole thing including valgrind works under kernel 3.19.5 and
not under later kernel drives me crazy.]
> other graphics card, other NVidia driver, These won't apply.
for thunderbird plain,
dumb framebuffer [IF THIS SETUP IS FEASIBLE under VirtualBox.]
after reboot
for thunderbird + test suite hookup.
dumb framebuffer [IF THIS SETUP IS FEASIBLE under VirtualBox.]
after reboot
> Firefox,
I think without any test suite hookup, or anything, I can
simply run Firefox ESR now available from Debian GNU/Linux repository.
I suspect without any test suite hookup, it will run.
Anyway, I will try to compare the
mmap status under firefox with stock VirtualBox graphics driver, and
mmap status under firefox with dumb framebuffer [IF THIS IS FEASIBLE.]
after reboot.
> Chrome.
It looks like there is a package of Chrome for Ubuntu.
Maybe I can install it under Debian.
However, this can wait, I think.
At the same time, it would be very instructive to compare the mmap
between the one while chrome is running [AFTER REBOOT]
and the ones when mozilla software {thunderbird, firefox} is running.
> thunderbird is not available from the Debian stable "jessie" repository
> (Debian 8.7.1, 2017-01-20.) Where did you get it?
Sorry I was not clear about it.
I have fetched so-called comm-central thunderbird repository and
have been building it locally [64-bit] for testing purposes to fix some
serious bugs I experienced.
The instructions to build thunderbird locally are at the following URL and
I have basically followed them.
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Simple_Thunderbird_build
"Basically" means that I had to tweak the so-called "mozconfig" in many
ways, especially, to enable valgrind-friendly build.
Very brief explanation is in the following URL:
https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Valgrind
The above refers to test |mochitest| for firefox.
Since thunderbird lives in a different source directory and
uses a very different test suite setup that uses mozmill, there are
quirks and modifications one needs to add to the source files and scripts
in order to run thunderbird under valgrind.
It seems that, at one time, somebody hacked the thunderbird test suite
to run valgrind/memcheck for thunderbird, but it was abandoned and
nobody seems to recall how it was exactly done or how to update the
scripts, etc.
So basically, what I do myself to run thunderbird is
- renaming the original thunderbird binary to something else, and
- in its place, I place a binary that invokes the original
thunderbird binary under valgrind/memcheck with the supplied parameters.
This trick has worked very well and many bugs/issues were found in the
last several years until 2015 when I first experienced the strange
problem of valgrind failure. And back then,
I realized it was related to different kernel versioning.
The locally created kernel 3.19.5 saved the day.
But the world has moved on to 4.x series kernel since then, and when I
updated the kernel last summer this problem reappeared.
I have reverted the kernel to 3.19.5 for the moment, but I am not sure
how long I can stick to the older kernel.
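Editorial sketch: the rename-and-wrap trick described above could look roughly like the Python below. It is only an illustration and not the wrapper actually used in this thread. The `-bin.orig` suffix for the renamed binary and the function names are assumptions; `--tool=memcheck` and `--trace-children=yes` are standard valgrind options.

```python
import os
import sys

# Assumed naming convention: the original binary was renamed to
# "<wrapper path>-bin.orig" and this wrapper took its place.
REAL_BINARY_SUFFIX = "-bin.orig"


def build_command(wrapper_path, args, valgrind="valgrind"):
    """Build the argv that runs the renamed binary under memcheck."""
    real_binary = wrapper_path + REAL_BINARY_SUFFIX
    return [
        valgrind,
        "--tool=memcheck",
        "--trace-children=yes",  # also follow programs started via exec
        real_binary,
    ] + list(args)


def main():
    """Entry point for the wrapper; forwards all supplied parameters."""
    command = build_command(os.path.abspath(sys.argv[0]), sys.argv[1:])
    # Replace this process, so the test harness talks to valgrind directly.
    os.execvp(command[0], command)
```

Installed in place of the original binary, the script would call `main()`; the test harness then starts it exactly as it would start thunderbird.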
If you need a thunderbird binary to test on your end, I can certainly
make it available.
Actually, I occasionally run the test (without valgrind) inside mozilla's
compilation/testing farm. [This makes it possible for me to
compile/test the OSX version and the Windows version. This is a necessary
step before a patch is accepted into mozilla's source tree.]
You can fetch the binary from there. Please let me know if this is the case.
> Which kernel modules have been loaded (lsmod)?
Under 3.19.5
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$ uname -a
Linux ip030 3.19.5 #1 SMP Mon Apr 20 08:50:21 JST 2015 x86_64 GNU/Linux
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$ lsmod
Module Size Used by
fuse 72030 1
btrfs 731518 0
xor 21081 1 btrfs
raid6_pq 95431 1 btrfs
ufs 59011 0
qnx4 13100 0
hfsplus 81692 0
hfs 45988 0
minix 27622 0
ntfs 160179 0
vfat 17270 0
msdos 17077 0
fat 50634 2 vfat,msdos
jfs 137440 0
xfs 667205 0
libcrc32c 12426 1 xfs
ext3 151975 0
jbd 52800 1 ext3
ext2 59160 0
dm_mod 77808 0
vboxsf 37355 1
mptctl 29762 0
mptbase 56835 1 mptctl
binfmt_misc 12846 1
ghash_clmulni_intel 13019 0
aesni_intel 163983 0
ppdev 12724 0
joydev 17107 0
iTCO_wdt 12831 0
iTCO_vendor_support 12704 1 iTCO_wdt
aes_x86_64 16719 1 aesni_intel
ablk_helper 12572 1 aesni_intel
cryptd 14600 3 ghash_clmulni_intel,aesni_intel,ablk_helper
lrw 12871 1 aesni_intel
evdev 17518 14
gf128mul 13047 1 lrw
glue_helper 12773 1 aesni_intel
microcode 30394 0
snd_intel8x0 30885 2
psmouse 83740 0
serio_raw 12894 0
pcspkr 12595 0
snd_ac97_codec 102547 1 snd_intel8x0
snd_pcm 73065 2 snd_ac97_codec,snd_intel8x0
snd_timer 22641 1 snd_pcm
snd 53213 8 snd_ac97_codec,snd_intel8x0,snd_timer,snd_pcm
soundcore 13031 1 snd
sg 29968 0
ac97_bus 12510 1 snd_ac97_codec
processor 28021 0
lpc_ich 20905 0
mfd_core 12601 1 lpc_ich
video 18144 0
rng_core 12880 0
vboxvideo 36417 2
vboxguest 181315 6 vboxsf,vboxvideo
thermal_sys 28310 2 video,processor
ttm 61967 1 vboxvideo
drm_kms_helper 74527 1 vboxvideo
drm 229484 5 ttm,drm_kms_helper,vboxvideo
i2c_piix4 12665 0
i2c_core 38003 3 drm,i2c_piix4,drm_kms_helper
syscopyarea 12350 1 vboxvideo
sysfillrect 12522 1 vboxvideo
sysimgblt 12351 1 vboxvideo
ac 12715 0
battery 13356 0
parport_pc 22422 0
parport 31812 2 ppdev,parport_pc
button 12988 0
sunrpc 192012 1
loop 22596 0
ip_tables 22004 0
x_tables 19034 1 ip_tables
autofs4 27584 2
ext4 403601 15
crc16 12343 1 ext4
jbd2 71809 1 ext4
mbcache 13488 3 ext2,ext3,ext4
sd_mod 39859 26
sr_mod 21993 0
cdrom 27042 1 sr_mod
ata_generic 12490 0
hid_generic 12393 0
usbhid 40671 0
hid 90268 2 hid_generic,usbhid
ohci_pci 12808 0
ehci_pci 12472 0
ohci_hcd 30951 1 ohci_pci
ehci_hcd 40790 1 ehci_pci
crc32c_intel 21850 4
ahci 29245 16
usbcore 151644 5 ohci_hcd,ohci_pci,ehci_hcd,ehci_pci,usbhid
libahci 23158 1 ahci
usb_common 12440 1 usbcore
ata_piix 29671 0
libata 145717 4 ahci,libahci,ata_generic,ata_piix
scsi_mod 172107 5 sg,libata,mptctl,sd_mod,sr_mod
e1000 90595 0
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$
I did not realize there are so many vbox drivers.
> Which version(s) of the low-level X11 and display drivers (DRM: direct
> rendering manager) are in use?
Under 3.19.5
egrep -i "(module|vbox|drm)" /var/log/Xorg.0.log &
printed out
[ 8.651] (==) ModulePath set to "/usr/lib/xorg/modules"
[ 8.651] (II) Module ABI versions:
[ 8.652] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 8.655] (II) LoadModule: "glx"
[ 8.658] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[ 8.716] (II) Module glx: vendor="X.Org Foundation"
[ 8.716] compiled for 1.19.1, module version = 1.0.0
[ 8.716] (==) Matched vboxvideo as autoconfigured driver 0
[ 8.716] (==) Matched vboxvideo as autoconfigured driver 1
[ 8.716] (II) LoadModule: "vboxvideo"
[ 8.716] (WW) Warning, couldn't open module vboxvideo
[ 8.716] (II) UnloadModule: "vboxvideo"
[ 8.716] (II) Unloading vboxvideo
[ 8.716] (EE) Failed to load module "vboxvideo" (module does not exist, 0)
[ 8.716] (II) LoadModule: "modesetting"
[ 8.716] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[ 8.717] (II) Module modesetting: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.1, module version = 1.19.1
[ 8.717] Module class: X.Org Video Driver
[ 8.717] (II) LoadModule: "fbdev"
[ 8.717] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so
[ 8.717] (II) Module fbdev: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.0, module version = 0.4.4
[ 8.717] Module class: X.Org Video Driver
[ 8.717] (II) LoadModule: "vesa"
[ 8.717] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so
[ 8.717] (II) Module vesa: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.0, module version = 2.3.4
[ 8.717] Module class: X.Org Video Driver
[ 8.721] (II) Loading sub module "fbdevhw"
[ 8.721] (II) LoadModule: "fbdevhw"
[ 8.721] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so
[ 8.722] (II) Module fbdevhw: vendor="X.Org Foundation"
[ 8.722] compiled for 1.19.1, module version = 0.0.2
[ 8.722] (II) Loading sub module "glamoregl"
[ 8.722] (II) LoadModule: "glamoregl"
[ 8.722] (II) Loading /usr/lib/xorg/modules/libglamoregl.so
[ 8.733] (II) Module glamoregl: vendor="X.Org Foundation"
[ 8.733] compiled for 1.19.1, module version = 1.0.0
[ 8.838] EGL_MESA_drm_image required.
[ 8.839] (II) modeset(0): Monitor name: VBOX monitor
[ 8.840] (II) Loading sub module "fb"
[ 8.840] (II) LoadModule: "fb"
[ 8.840] (II) Loading /usr/lib/xorg/modules/libfb.so
[ 8.840] (II) Module fb: vendor="X.Org Foundation"
[ 8.840] compiled for 1.19.1, module version = 1.0.0
[ 8.840] (II) UnloadModule: "fbdev"
[ 8.840] (II) UnloadSubModule: "fbdevhw"
[ 8.840] (II) UnloadModule: "vesa"
[ 8.916] (II) LoadModule: "libinput"
[ 8.916] (II) Loading /usr/lib/xorg/modules/input/libinput_drv.so
[ 8.919] (II) Module libinput: vendor="X.Org Foundation"
[ 8.919] compiled for 1.19.0, module version = 0.23.0
[ 8.919] Module class: X.Org XInput Driver
I am a little surprised, but right now I may be using the glx driver, given
that the "vboxvideo" module does not seem to be loaded and the other
well-known modules get unloaded. Yes, I found that glxinfo printed rows of
output including the following lines, and glxgears seems to run fine. I
should have known.
Re: glx:
glxinfo | grep -i1 vmware
Extended renderer info (GLX_MESA_query_renderer):
Vendor: VMware, Inc. (0xffffffff)
Device: llvmpipe (LLVM 3.9, 256 bits) (0xffffffff)
--
Max GLES[23] profile version: 3.0
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (L
I will collect info on the 4.9.0-1 kernel (this is the latest test kernel,
where I could not run the thunderbird test suite since something dies during
execution.)
It may take a little time to gather the data. (Since
compiling/testing thunderbird requires resources, I have only one
instance of the VM running on the PC. So I really have to reboot this VM to
switch the kernel to obtain data.)
I wish someone with 64GB memory could retry and reproduce the issue in
their VirtualBox images on their hardware :-)
It would be very instructive to compare the mmap usage, etc. under
different kernel revisions side by side (!)
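Editorial sketch: one way to put two captures side by side is to save /proc/<pid>/maps under each kernel and diff the sets of mapped paths. The Python below is a minimal illustration that assumes the standard maps layout (address range, permissions, offset, device, inode, optional pathname). The function names are hypothetical, and only path names are compared, not addresses or sizes.

```python
def mapped_paths(maps_text):
    """Return the set of path names found in /proc/<pid>/maps content."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            # Anonymous mappings have no sixth field and are skipped.
            paths.add(fields[5].strip())
    return paths


def compare_maps(old_text, new_text):
    """Report paths mapped under only one of the two captures."""
    old = mapped_paths(old_text)
    new = mapped_paths(new_text)
    return {
        "only_old": sorted(old - new),
        "only_new": sorted(new - old),
    }
```

Pseudo-paths such as `[heap]`, `[stack]` or `[vdso]` are treated like any other name, so a region that appears under only one kernel shows up in the result.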
TIA
PS: Just in case the HOST CPU/OS may have something to do with the issues:
OS: Windows 10 Pro
CPU: Intel Xeon CPU E3-1240 V2
Graphics: Radeon 7700
But I am sure that VirtualBox has shielded the bare metal rather well.
Windows version of VirtualBox : 5.1.14 r112924 (Qt5.6.2)
|
|
From: John R. <jr...@bi...> - 2017-02-18 22:32:21
|
How many failures occur in 10 runs of thunderbird under valgrind?
How many failures occur in 10 runs if you reboot just before each run?
Thunderbird is a user mail agent that uses interactive graphics.
How many failures occur before the display window appears, and how many after?
Are the symptoms and frequency the same for a Radeon card as for NVidia?
On the open-source NVidia driver versus the proprietary driver?
In "dumb framebuffer" mode ("no" acceleration)?
Please tell us which cards: "lspci -nn | grep VGA" or similar.
Are the symptoms and frequency the same for Firefox as for thunderbird?
Are the symptoms and frequency the same for Chrome as for thunderbird?
Please present a histogram of the {mapped file, pc offset, instruction stream}
when the SIGSEGV happens. [You should have at least 70 runs by now: 10 each
for thunderbird plain, with reboot, other graphics card, other NVidia driver,
dumb framebuffer, Firefox, Chrome.]
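A minimal sketch of how the requested histogram could be tallied, assuming each crash has already been reduced by hand to a {mapped file, pc offset, instruction} triple. The record layout, the helper names, and the sample values are hypothetical; none of them are data from this thread.

```python
from collections import Counter


def crash_histogram(records):
    """Count identical (mapped_file, pc_offset, instruction) triples."""
    return Counter(
        (r["mapped_file"], r["pc_offset"], r["instruction"]) for r in records
    )


def format_histogram(hist):
    """Render the histogram, most frequent crash site first."""
    lines = []
    for (mapped_file, pc_offset, insn), count in hist.most_common():
        lines.append(f"{count:4d}  {mapped_file}+{pc_offset:#x}  {insn}")
    return "\n".join(lines)


# Hypothetical sample: three runs, two of which die at the same site.
runs = [
    {"mapped_file": "example-tool", "pc_offset": 0x1234, "instruction": "mov (%rax),%rdx"},
    {"mapped_file": "example-tool", "pc_offset": 0x1234, "instruction": "mov (%rax),%rdx"},
    {"mapped_file": "libexample.so", "pc_offset": 0x40, "instruction": "push %rbp"},
]
hist = crash_histogram(runs)
print(format_histogram(hist))
```

If nearly all runs land in one bucket, the crash is deterministic and that single site is worth disassembling; a wide spread would instead point at timing or environment.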
thunderbird is not available from the Debian stable "jessie" repository
(Debian 8.7.1, 2017-01-20.) Where did you get it?
Which kernel modules have been loaded (lsmod)?
Which version(s) of the low-level X11 and display drivers (DRM: direct
rendering manager) are in use?
|
|
From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-18 15:56:40
|
On 2017/02/18 0:57, John Reiser wrote:
> Hint #1. Fix the first complaint. Do not pass GO, do not collect $200. FIX the first complaint.
> You will get more sympathy and attention if the *first* significant event
> is the bug/error/mystery that is the focus of your inquiry.
>
>> ==3755== Mismatched free() / delete / delete []
>> ==3755== at 0x4C2CD3A: free (vg_replace_malloc.c:530)
>> ==3755== by 0x13EE71B3: bool
>> google::protobuf::InsertIfNotPresent...

Thank you. I thought of investigating this myself.
(But my previous brief analysis came to a dead end, since the allocation was
done inside libstdc++ AND the mozilla code seemed to honor the proper
free/malloc, delete/new, delete []/new arrayobject pairing at the superficial
source code level :-( )

By running the latest thunderbird code under valgrind/memcheck under
linux kernel 3.19.5 (this is the latest kernel under which I could make
memcheck + thunderbird work on Debian GNU/Linux), I collected as many of the
mismatched warnings as possible and tried to analyze them.

According to https://bugzilla.mozilla.org/show_bug.cgi?id=1340576, the
prospect is grim. See comment 5 there.
https://bugzilla.mozilla.org/show_bug.cgi?id=1340576#c5

--- begin quote ---
(In reply to ISHIKAWA, Chiaki from comment #4)
> Julian, what course of action should I take from here?

The simple answer is, run with --show-mismatched-frees=no. Most of them
are false positives caused by inconsistent inlining of malloc into new
vs free into delete.

The more complex answer is, we'd have to look at them on an individual
basis. Bug 1325470 is an example which Mike Hommey believes is a real
bug. But those are relatively rare. Mostly Valgrind is reporting false
positives here.
--- end quote ---

So it seems that these are actually FALSE POSITIVEs due to inconsistent
inlining of compiler/header/whatever [I am not a C++ guru].

So if I say --show-mismatched-frees=no, these won't show up, and since
they don't interfere with the operation of valgrind under the kernel
3.19.5, they do seem to be false positives to me.
(That these are reported as false positives is in itself a big problem:
I think it is an issue with GCC6 and the libstdc++ code compiled by GCC6. I am
not sure whether these false positives would also appear if clang were used
for compiling libstdc++ and mozilla thunderbird. But I digress.)

My original question was why the test setup works under the vanilla 3.19.5
linux kernel and not under the 4.8.y Debian GNU/Linux kernel.
Somehow the same setup works under kernel revision 3.19.5.

>
> =====
>
>> ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ gdb /usr/local/bin/valgrind
> [[snip]]
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x000000080470fdf8 in ?? ()
>> (gdb) where
>> #0 0x000000080470fdf8 in ?? ()
>> #1 0x0000000802e8df30 in ?? ()
>> #2 0x000000000010d76b in ?? ()
>> #3 0x0000000802008460 in ?? ()
>> #4 0x0000000802e8df30 in ?? ()
>> #5 0x0000000000001c00 in ?? ()
>> #6 0x0000000038c6bb00 in ?? ()
>> #7 0x0000000000000601 in ?? ()
>> #8 0x0000000000011af3 in ?? ()
>> #9 0x0000000000000000 in ?? ()
>> (gdb) quit
>> A debugging session is active.
>
> Hint #2. Use gdb effectively.
>
> (gdb) info reg ## show all registers
> (gdb) x/5i $pc ## examine instruction stream
> (gdb) x/30i $pc-0x20 ## likely previous instruction stream (heuristic sync for variable-length instructions)
> (gdb) x/32xw $sp ## examine memory at stack pointer
> (gdb) info proc ## display the process ID
> (gdb) shell cat /proc/<PID>/maps ## show memory mapping; <PID> is "process" from "info proc"
>
>
> Hint #3. If child processes are involved, then apply the tool to them, too.
> $ valgrind --trace-children=yes ...

Oh, I thought I passed "--trace-children" to the particular valgrind
session(s) when I captured the latest log. Hmm.

All the valgrind runs logged in the last e-mail had the --trace-children=yes
option (not always at the beginning, though).
Aha, there seems to have been a copy&paste error when I created the
previous e-mail.

case 1.
valgrind --trace-children=yes ...

case 2.
(gdb) run
Starting program: /usr/local/bin/valgrind --verbose --trace-children=yes --smc-check=all-non-file ...

(I am afraid that there could have been a copy&paste error here. I ran
valgrind with the echoed back options. I might have erased the command
line after |run| by mistake.)
You can see that the said option was passed correctly from the
following output from valgrind as well.

--3973-- Valgrind options:
--3973-- --verbose
--3973-- --trace-children=yes <=== here
--3973-- --smc-check=all-non-file
--3973-- --gen-suppressions=all

case 3.
valgrind --vex-iropt-register-updates=allregs-at-mem-access --verbose --trace-children=yes ...

I would check what is in memory at 0xffeffbab8 (reported in the strace output):

> --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffeffbab8} ---
+++ killed by SIGSEGV +++

using

> (gdb) shell cat /proc/<PID>/maps ## show memory mapping; <PID> is "process" from "info proc"

and look at the memory area around that address.

In the meantime, if there is anyone who has run a large program under
valgrind/memcheck under a stock Debian GNU/Linux kernel, please let me know
your kernel version number.

Even if I can work out the mmap/stack/whatever condition by examining the
memory map around the address reported with the SIGSEGV, unless I can
figure out exactly WHAT KERNEL OPTION is the culprit, that won't be of much
help to me as it stands now. :-(

If I know what KERNEL OPTION is the culprit, at least I can try to
rebuild the 4.8.y series kernel and try valgrind under it.
There are enough differences in kernel options between 3.19.5 and 4.8.y
that a fishing trip won't discover the culprit easily.

(I have been using Debian for close to 20 years now, but maybe I should
switch to Fedora/CentOS since it is used by the Mozilla foundation's
compilation/test farm. Oh well, the compilation/test farm uses clang and so
there is another issue, GCC vs clang. I have been using GCC for 30 years,
and was comfortable using Debian GNU/Linux and GCC. Maybe it is time for
a change.)

TIA
|
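The lookup described above (find what, if anything, is mapped at the faulting address) can be sketched as follows. This is only an illustration: the helper names are made up, and the sample maps text is invented, not captured from the failing process. Only the address 0xffeffbab8 comes from the strace output quoted in this thread.

```python
def parse_maps(text):
    """Parse /proc/<PID>/maps text into (start, end, perms, path) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), fields[1], path))
    return regions


def locate(regions, addr):
    """Return the region containing addr, or its nearest neighbours if unmapped."""
    below = above = None
    for region in regions:
        start, end = region[0], region[1]
        if start <= addr < end:
            return {"hit": region, "below": None, "above": None}
        if end <= addr and (below is None or end > below[1]):
            below = region
        if start > addr and (above is None or start < above[0]):
            above = region
    return {"hit": None, "below": below, "above": above}


# Invented sample in the /proc/<PID>/maps format (NOT from the real process).
SAMPLE_MAPS = """\
00108000-0012b000 r-xp 00000000 08:01 1234 /hypothetical/bin/program
ffeffc000-fff000000 rw-p 00000000 00:00 0 [stack]
"""

regions = parse_maps(SAMPLE_MAPS)
result = locate(regions, 0xFFEFFBAB8)
if result["hit"] is None and result["above"] is not None:
    gap = result["above"][0] - 0xFFEFFBAB8
    print(f"unmapped; {gap:#x} bytes below {result['above'][3] or 'anonymous region'}")
```

With SEGV_MAPERR the address is expected to be unmapped, so the interesting output is which regions sit just above and below it, for example whether the fault lies a short distance under a stack-like mapping that failed to grow.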
|
From: John R. <jr...@bi...> - 2017-02-17 15:58:04
|
Hint #1. Fix the first complaint. Do not pass GO, do not collect $200. FIX the first complaint.
You will get more sympathy and attention if the *first* significant event
is the bug/error/mystery that is the focus of your inquiry.

> ==3755== Mismatched free() / delete / delete []
> ==3755== at 0x4C2CD3A: free (vg_replace_malloc.c:530)
> ==3755== by 0x13EE71B3: bool
> google::protobuf::InsertIfNotPresent...

=====

> ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ gdb /usr/local/bin/valgrind
[[snip]]
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000080470fdf8 in ?? ()
> (gdb) where
> #0 0x000000080470fdf8 in ?? ()
> #1 0x0000000802e8df30 in ?? ()
> #2 0x000000000010d76b in ?? ()
> #3 0x0000000802008460 in ?? ()
> #4 0x0000000802e8df30 in ?? ()
> #5 0x0000000000001c00 in ?? ()
> #6 0x0000000038c6bb00 in ?? ()
> #7 0x0000000000000601 in ?? ()
> #8 0x0000000000011af3 in ?? ()
> #9 0x0000000000000000 in ?? ()
> (gdb) quit
> A debugging session is active.

Hint #2. Use gdb effectively.

(gdb) info reg ## show all registers
(gdb) x/5i $pc ## examine instruction stream
(gdb) x/30i $pc-0x20 ## likely previous instruction stream (heuristic sync for variable-length instructions)
(gdb) x/32xw $sp ## examine memory at stack pointer
(gdb) info proc ## display the process ID
(gdb) shell cat /proc/<PID>/maps ## show memory mapping; <PID> is "process" from "info proc"

Hint #3. If child processes are involved, then apply the tool to them, too.
$ valgrind --trace-children=yes ...

--
|
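For repeated runs, the gdb commands from Hint #2 can be replayed non-interactively. The sketch below only builds the command line; it assumes gdb's standard `--batch`, `-ex` and `--args` options, and it swaps in gdb's own `info proc mappings` for the `shell cat /proc/<PID>/maps` step so that no PID has to be filled in by hand. The function name and the program path are hypothetical.

```python
import shlex

# Commands from Hint #2, in order; "run" starts the program first.
GDB_COMMANDS = [
    "run",
    "info reg",
    "x/5i $pc",
    "x/30i $pc-0x20",
    "x/32xw $sp",
    "info proc",
    "info proc mappings",  # stand-in for: shell cat /proc/<PID>/maps
]


def gdb_batch_argv(program_argv, commands=GDB_COMMANDS):
    """Build an argv that runs the program under gdb and replays the commands."""
    argv = ["gdb", "--batch"]
    for cmd in commands:
        argv += ["-ex", cmd]
    argv += ["--args"] + list(program_argv)
    return argv


# Hypothetical target; substitute the real valgrind command line.
argv = gdb_batch_argv(["valgrind", "--trace-children=yes", "./hypothetical-program"])
print(shlex.join(argv))
```

The printed line can be pasted into a shell, or the list passed to `subprocess.run`, once per run. Note that gdb follows only one process by default, so children spawned under `--trace-children=yes` would need separate handling.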
|
From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-17 04:40:56
|
On 2017/02/16 1:50, ISHIKAWA,chiaki wrote:
> On 2017/02/15 23:32, Tom Hughes wrote:
>> On 15/02/17 13:34, ISHIKAWA,chiaki wrote:
>>
>>> When I tried to run mozilla thunderbird mail client, which I create
>>> under Debian GNU/Linux 64-bit,
>>> under valgrind, valgrind mysteriously crashed and gdb was not much help.
>>
>> Well valgrind almost never "mysteriously crashes".
>>
>> In fact it is usually very verbose when anything goes wrong.
>>
>
> Hi,
>
> Thank you for your comment.
>
> The above was what I thought back in 2015 and actually I exchanged a few
> e-mails with Julian Seward about the issue back then. But we gave up on it.
> Because the system printed out "Segmentation error" without a good trace
> of anything at all (!) (which was quite surprising): We traced signals,
> and stuff. Everything we could think of using various options passed to
> valgrind (and even traced the system calls valgrind was issuing using
> strace.).
>
>
>> So the first thing you should do is to tell us in detail exactly what it
>> said when it stopped.
>
> Since gdb and various traces invoked by the options passed to valgrind
> are useless (as in the case back in 2015),
> I traced the system calls issued by valgrind.
>
> There was a MMAP call before something went wrong and signal 11 was
> issued and then
> I saw SIGSEGV passed a dozen times or so, and voila. Segmentation error
> back at the shell level.
> gdb does not print anything useful at all...
>
>>
>>> This happened under the latest 4.8.x kernel which Debian distributed as
>>> part of its testing repository.
>>>
>>> I tried a few things but subsequently reverted to kernel 3.19.5.
>>> Now thunderbird under valgrind works (!).
>>
>> So most likely this is just a new system call that valgrind doesn't
>> handle or something, in which case valgrind will have reported all the
>> details needed to fix it when it stopped.
>
> That was what I (and Julian Seward) hoped back in 2015, but valgrind did
> not. From the debugging I did over the last few months, I figured the
> problem I face is indeed as perplexing as the case back in 2015 and I
> took the easy course now: I decided that trying to find out if there is
> ANYBODY who is using valgrind and running big program under it using
> Debian GNU/Linux official kernel is easier (which I doubt based on my
> experience). Also, Julian Seward back in 2015 mentioned valgrind could
> grok thunderbird under Fedora and thus I thought it would be easier to
> figure out if someone is running 64-bit thunderbird under CentOS or
> Fedora 64-bit and compare the config to figure out what is causing the
> problem under Debian's kernel.
>
> BTW, the following is is what I found back in 2015.
>
>
> ------------------------+----------------
> Kernel version          | valgrind + C-C TB works or not
> ------------------------+----------------
> Debian 3.2.0...         | works <--- base debian version for wheezy
> ------------------------+----------------
> self-compiled 3.9.0...  | works
> ------------------------+----------------
> self-compiled 3.12.40   | works
> ------------------------+----------------
> self-compiled 3.13.11   | works
> ------------------------+----------------
>
> self-compiled 3.14.38   | ??? <--- pristine kernel hit the problem
>         mentioned in the following patch and panicked. open source is
>         wonderful when it works, but when it does not
>         http://lkml.iu.edu/hypermail/linux/kernel/1407.3/04296.html
>
> ------------------------+----------------
> self-compiled 3.15.9    | ??? <--- vanilla kernel could not bring up X
>         probably because the same reason above. X
>         did not start in a few minutes, and so I gave up. I did not see the
>         kernel panic, though.
>
> ------------------------+----------------
> Debian backport 3.16 ...| Segmentation fault! [Why? I have no idea.]
> ------------------------+----------------
>
> ------------------------+------------------
> Vanilla 3.19.5          | works (worked back in 2015 and now I have to
>         revert to it...)
> ------------------------+------------------
>
> This time arouind, I tried to figure out if I could do something similar
> using the latest kernel 4.9.x (vanilla version), hoping it might make
> valgrind run thunderbird under it without segmentation error. But the
> very late kernel caused a problem of VirtualBox utility, such as
> graphics driver that supports dynamic resizing, not supporting the
> latest kernel as guest at all, and I had to give it up.
> (Yes, I am running Debian GNU/Linux inside VirtualBox.)
>
> Sorry, I was so tired of debugging and seeing that the current issue
> looked so much like the mysterious problem back in 2015, that I did not
> bother to pursue the issue in valgrind per se, but rather wanted to
> focus on kernel issue now.
>
> I am running the |make mozmill| test of thunderbird which now takes
> about 48 hours and once it is over, I will switch the kernel and gather
> the gdb stack trace (which is useless) when valgrind crashes, and
> also show the last part of strace (system call trace) which again is not
> very revealing.
>
> I am sure you will be perplexed why on earth valgrind is crashing when
> we try to run thunderbird underneath in Debian's kernel. [I *DID* notice
> that there are differences in Debian kernel that it enables stack
> protection, for starter. Not sure if it affects Valgrind operation.]
>
>
>>
>> Tom
>>
>
> TIA

Here are snippets from the log when valgrind could not run mozilla
thunderbird (which seems to spawn a few binaries during its lifetime
when it is invoked as part of the |make mozmill| test suite).

uname -a Linux ip030 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04) x86_64 GNU/Linux ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ --- run-valgrind (masquerading as thunderbird binary) final command line is: valgrind --trace-children=yes --smc-check=all-non-file --gen-suppressions=all --malloc-fill=0xA5 --free-fill=0xC3 --leak-check=full --num-callers=50 --suppressions=$HOME/Dropbox/myown.sup --suppressions=$HOME/Dropbox/myown32.sup --show-possibly-lost=no /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin -jsbridge 24242 -foreground -profile /NREF-COMM-CENTRAL/objdir-tb3/_tests/mozmill/mozmillprofile ==3755== Memcheck, a memory error detector ==3755== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al. ==3755== Using Valgrind-3.13.0.SVN and LibVEX; rerun with -h for copyright info ==3755== Command: /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin -jsbridge 24242 -foreground -profile /NREF-COMM-CENTRAL/objdir-tb3/_tests/mozmill/mozmillprofile ==3755== ==3755== Mismatched free() / delete / delete [] ==3755== at 0x4C2CD3A: free (vg_replace_malloc.c:530) ==3755== by 0x13EE71B3: bool google::protobuf::InsertIfNotPresent<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > > >(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > >*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, 
std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > >::value_type::first_type const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > >::value_type::second_type const&) (mozalloc.h:218) ==3755== by 0x13EE8827: google::protobuf::SimpleDescriptorDatabase::DescriptorIndex<std::pair<void const*, int> >::AddFile(google::protobuf::FileDescriptorProto const&, std::pair<void const*, int>) (descriptor_database.cc:56) ==3755== by 0x13EE8DDE: google::protobuf::EncodedDescriptorDatabase::Add(void const*, int) (descriptor_database.cc:313) ==3755== by 0x13EE8E4A: google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int) (descriptor.cc:1018) ==3755== by 0x13EE8EEE: google::protobuf::protobuf_AddDesc_google_2fprotobuf_2fdescriptor_2eproto() (descriptor.pb.cc:711) ==3755== by 0x13EEB33A: __static_initialization_and_destruction_0(int, int) (descriptor.pb.cc:762) ==3755== by 0x13EF879F: _GLOBAL__sub_I_Unified_cpp_components_protobuf0.cpp (message.cc:358) ==3755== by 0x400F649: call_init.part.0 (dl-init.c:72) ==3755== by 0x400F75A: call_init (dl-init.c:30) ==3755== by 0x400F75A: _dl_init (dl-init.c:120) ==3755== by 0x4013CD7: dl_open_worker (dl-open.c:575) ==3755== by 0x400F4F3: _dl_catch_error (dl-error.c:187) ==3755== by 0x4013488: _dl_open (dl-open.c:660) ==3755== by 0x5055EE8: dlopen_doit (dlopen.c:66) ==3755== by 0x400F4F3: _dl_catch_error (dl-error.c:187) ==3755== by 0x5056520: _dlerror_run (dlerror.c:163) ==3755== by 0x5055F81: dlopen@@GLIBC_2.2.5 (dlopen.c:87) ==3755== by 
0x123072: GetLibHandle(char const*) (nsXPCOMGlue.cpp:105) ==3755== by 0x1230FA: ReadDependentCB(char const*) (nsXPCOMGlue.cpp:157) ==3755== by 0x123337: XPCOMGlueLoad(char const*) (nsXPCOMGlue.cpp:333) ==3755== by 0x12347B: mozilla::GetBootstrap(char const*) (nsXPCOMGlue.cpp:408) ==3755== by 0x10D406: InitXPCOMGlue(char const*) (nsMailApp.cpp:247) ==3755== by 0x10D7C4: main (nsMailApp.cpp:295) ==3755== Address 0x5f08f90 is 0 bytes inside a block of size 33 alloc'd ==3755== at 0x4C2C1EC: operator new(unsigned long) (vg_replace_malloc.c:334) ==3755== by 0x113B8A: void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) (basic_string.tcc:219) ==3755== by 0x13EE7185: bool google::protobuf::InsertIfNotPresent<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > > >(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > >*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > >::value_type::first_type const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::pair<void const*, int>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::pair<void const*, int> > > >::value_type::second_type const&) (basic_string.h:196) ==3755== by 0x13EE8827: google::protobuf::SimpleDescriptorDatabase::DescriptorIndex<std::pair<void const*, int> >::AddFile(google::protobuf::FileDescriptorProto const&, std::pair<void const*, int>) (descriptor_database.cc:56) ... flurry of mismatched free/delete, etc. ==3755== by 0x123337: XPCOMGlueLoad(char const*) (nsXPCOMGlue.cpp:333) ==3755== by 0x12347B: mozilla::GetBootstrap(char const*) (nsXPCOMGlue.cpp:408) ==3755== by 0x10D406: InitXPCOMGlue(char const*) (nsMailApp.cpp:247) ==3755== by 0x10D7C4: main (nsMailApp.cpp:295) ==3755== { <insert_a_suppression_name_here> Memcheck:Free fun:free fun:_ZN6google8protobuf20OneofDescriptorProto10SharedDtorEv fun:_ZN6google8protobuf20OneofDescriptorProtoD1Ev fun:_ZN6google8protobuf20OneofDescriptorProtoD0Ev fun:_ZN6google8protobuf8internal20RepeatedPtrFieldBase7DestroyINS0_16RepeatedPtrFieldINS0_20OneofDescriptorProtoEE11TypeHandlerEEEvv fun:_ZN6google8protobuf16RepeatedPtrFieldINS0_20OneofDescriptorProtoEED1Ev fun:_ZN6google8protobuf15DescriptorProtoD1Ev fun:_ZN6google8protobuf15DescriptorProtoD0Ev fun:_ZN6google8protobuf8internal20RepeatedPtrFieldBase7DestroyINS0_16RepeatedPtrFieldINS0_15DescriptorProtoEE11TypeHandlerEEEvv fun:_ZN6google8protobuf16RepeatedPtrFieldINS0_15DescriptorProtoEED1Ev fun:_ZN6google8protobuf15DescriptorProtoD1Ev fun:_ZN6google8protobuf15DescriptorProtoD0Ev fun:_ZN6google8protobuf8internal20RepeatedPtrFieldBase7DestroyINS0_16RepeatedPtrFieldINS0_15DescriptorProtoEE11TypeHandlerEEEvv fun:_ZN6google8protobuf16RepeatedPtrFieldINS0_15DescriptorProtoEED1Ev fun:_ZN6google8protobuf19FileDescriptorProtoD1Ev 
fun:_ZN6google8protobuf25EncodedDescriptorDatabase3AddEPKvi
fun:_ZN6google8protobuf14DescriptorPool24InternalAddGeneratedFileEPKvi
fun:_ZN7mozilla8devtools8protobuf33protobuf_AddDesc_CoreDump_2eprotoEv
fun:_Z41__static_initialization_and_destruction_0ii
fun:_GLOBAL__sub_I_CoreDump.pb.cc
fun:call_init.part.0
fun:call_init
fun:_dl_init
fun:dl_open_worker
fun:_dl_catch_error
fun:_dl_open
fun:dlopen_doit
fun:_dl_catch_error
fun:_dlerror_run
fun:dlopen@@GLIBC_2.2.5
fun:_ZL12GetLibHandlePKc
fun:_ZL15ReadDependentCBPKc
fun:_ZL13XPCOMGlueLoadPKc
fun:_ZN7mozilla12GetBootstrapEPKc
fun:_ZL13InitXPCOMGluePKc
fun:main
}
Segmentation fault <===== one of the binaries invoked by the above command fails here.
==3760==
==3760== HEAP SUMMARY:
==3760== in use at exit: 426,021 bytes in 1,928 blocks
==3760== total heap usage: 6,366 allocs, 4,438 frees, 12,676,013 bytes allocated
==3760==
==3760== LEAK SUMMARY:
==3760== definitely lost: 0 bytes in 0 blocks
==3760== indirectly lost: 0 bytes in 0 blocks
==3760== possibly lost: 8,848 bytes in 150 blocks
==3760== still reachable: 416,533 bytes in 1,777 blocks
==3760== of which reachable via heuristic:
==3760== newarray : 1,536 bytes in 16 blocks
==3760== suppressed: 640 bytes in 1 blocks
==3760== Reachable blocks (those to which a pointer was found) are not shown.
==3760== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==3760==
==3760== For counts of detected and suppressed errors, rerun with: -v
==3760== ERROR SUMMARY: 155 errors from 37 contexts (suppressed: 1 from 1)
Traceback (most recent call last):
  File "runtestlist.py", line 107, in <module>
    line = proc.stdout.readline()
KeyboardInterrupt
xfwm4: Fatal IO error 11 (Resource temporarily unavailable) on X server :2.0.
/NREF-COMM-CENTRAL/comm-central/mozilla/../mail/testsuite-targets.mk:30: recipe for target 'mozmill' failed
make: *** [mozmill] Interrupt

[Note the Segmentation error]?

===
So I invoked valgrind under gdb directly.

ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ gdb /usr/local/bin/valgrind GNU gdb (GDB) 7.10.50.20160102-cvs Copyright (C) 2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/local/bin/valgrind...done. (gdb) run Starting program: /usr/local/bin/valgrind --verbose --trace-children=yes --smc-check=all-non-file --gen-suppressions=all --malloc-fill=0xA5 --free-fill=0xC3 --leak-check=full --num-callers=50 --suppressions=$HOME/Dropbox/myown.sup --suppressions=$HOME/Dropbox/myown32.sup --show-possibly-lost=no /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin -jsbridge 24242 -foreground -profile /NREF-COMM-CENTRAL/objdir-tb3/_tests/mozmill/mozmillprofile process 3973 is executing new program: /usr/local/lib/valgrind/memcheck-amd64-linux ==3973== Memcheck, a memory error detector ==3973== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al. 
==3973== Using Valgrind-3.13.0.SVN and LibVEX; rerun with -h for copyright info ==3973== Command: /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin -jsbridge 24242 -foreground -profile /NREF-COMM-CENTRAL/objdir-tb3/_tests/mozmill/mozmillprofile ==3973== --3973-- Valgrind options: --3973-- --verbose --3973-- --trace-children=yes --3973-- --smc-check=all-non-file --3973-- --gen-suppressions=all --3973-- --malloc-fill=0xA5 --3973-- --free-fill=0xC3 --3973-- --leak-check=full --3973-- --num-callers=50 --3973-- --suppressions=/home/ishikawa/Dropbox/myown.sup --3973-- --suppressions=/home/ishikawa/Dropbox/myown32.sup --3973-- --show-possibly-lost=no --3973-- Contents of /proc/version: --3973-- Linux version 4.8.0-2-amd64 (deb...@li...) (gcc version 5.4.1 20161202 (Debian 5.4.1-4) ) #1 SMP Debian 4.8.15-2 (2017-01-04) --3973-- --3973-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-rdtscp-sse3-avx --3973-- Page sizes: currently 4096, max supported 4096 --3973-- Valgrind library directory: /usr/local/lib/valgrind --3973-- Reading syms from /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin --3973-- Reading syms from /lib/x86_64-linux-gnu/ld-2.24.so --3973-- Considering /usr/lib/debug/.build-id/09/5935d2da92389e2991f2b56d14dab9e6978696.debug .. --3973-- .. build-id is valid --3973-- Reading syms from /usr/local/lib/valgrind/memcheck-amd64-linux --3973-- object doesn't have a dynamic symbol table --3973-- Scheduler: using generic scheduler lock implementation. --3973-- Reading suppressions file: /home/ishikawa/Dropbox/myown.sup --3973-- Reading suppressions file: /home/ishikawa/Dropbox/myown32.sup --3973-- Reading suppressions file: /usr/local/lib/valgrind/default.supp ==3973== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-3973-by-ishikawa-on-??? ==3973== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-3973-by-ishikawa-on-??? ==3973== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-3973-by-ishikawa-on-??? 
==3973== ==3973== TO CONTROL THIS PROCESS USING vgdb (which you probably ==3973== don't want to do, unless you know exactly what you're doing, ==3973== or are doing some strange experiment): ==3973== /usr/local/lib/valgrind/../../bin/vgdb --pid=3973 ...command... ==3973== ==3973== TO DEBUG THIS PROCESS USING GDB: start GDB like this ==3973== /path/to/gdb /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin ==3973== and then give GDB the following command ==3973== target remote | /usr/local/lib/valgrind/../../bin/vgdb --pid=3973 ==3973== --pid is optional if only one valgrind process is running ==3973== --3973-- REDIR: 0x401af50 (ld-linux-x86-64.so.2:strlen) redirected to 0x380a80e8 (vgPlain_amd64_linux_REDIR_FOR_strlen) --3973-- REDIR: 0x40198a0 (ld-linux-x86-64.so.2:index) redirected to 0x380a8102 (vgPlain_amd64_linux_REDIR_FOR_index) --3973-- Reading syms from /usr/local/lib/valgrind/vgpreload_core-amd64-linux.so --3973-- Reading syms from /usr/local/lib/valgrind/vgpreload_memcheck-amd64-linux.so ==3973== WARNING: new redirection conflicts with existing -- ignoring it --3973-- old: 0x0401af50 (strlen ) R-> (0000.0) 0x380a80e8 vgPlain_amd64_linux_REDIR_FOR_strlen --3973-- new: 0x0401af50 (strlen ) R-> (2007.0) 0x04c2ec60 strlen --3973-- REDIR: 0x4019ac0 (ld-linux-x86-64.so.2:strcmp) redirected to 0x4c2fd60 (strcmp) --3973-- REDIR: 0x401ba60 (ld-linux-x86-64.so.2:mempcpy) redirected to 0x4c33130 (mempcpy) --3973-- Reading syms from /lib/x86_64-linux-gnu/libpthread-2.24.so --3973-- Considering /usr/lib/debug/.build-id/75/b2a574fa9c03e43b58f53b424b1daec1211862.debug .. --3973-- .. build-id is valid --3973-- Reading syms from /lib/x86_64-linux-gnu/libdl-2.24.so --3973-- Considering /usr/lib/debug/.build-id/e4/8bb27b88670405041a12eefef9ef586f6e1533.debug .. --3973-- .. 
build-id is valid --3973-- Reading syms from /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22 --3973-- object doesn't have a symbol table --3973-- Reading syms from /lib/x86_64-linux-gnu/libm-2.24.so --3973-- Considering /usr/lib/debug/.build-id/d0/4c68ec51462ba3088cf1b19d54e1706463f723.debug .. --3973-- .. build-id is valid --3973-- Reading syms from /lib/x86_64-linux-gnu/libgcc_s.so.1 --3973-- Considering /usr/lib/debug/.build-id/90/f96c5be1c683de41a42ab262411fb7a3876fb2.debug .. --3973-- .. build-id is valid --3973-- Reading syms from /lib/x86_64-linux-gnu/libc-2.24.so --3973-- Considering /usr/lib/debug/.build-id/4b/9cc30ba41f027a0dca6cd877f59f0db38f4025.debug .. --3973-- .. build-id is valid --3973-- REDIR: 0x5b7a510 (libc.so.6:strcasecmp) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) --3973-- REDIR: 0x5b75fc0 (libc.so.6:strcspn) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) --3973-- REDIR: 0x5b7c800 (libc.so.6:strncasecmp) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) --3973-- REDIR: 0x5b78430 (libc.so.6:strpbrk) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) --3973-- REDIR: 0x5b787c0 (libc.so.6:strspn) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) --3973-- REDIR: 0x5b79b90 (libc.so.6:memmove) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) --3973-- REDIR: 0x5b78140 (libc.so.6:rindex) redirected to 0x4c2e5f0 (rindex) --3973-- REDIR: 0x5b70d30 (libc.so.6:malloc) redirected to 0x4c2bb1f (malloc) --3973-- REDIR: 0x5c1bbf0 (libc.so.6:__strcasecmp_avx) redirected to 0x4c2f4a0 (strcasecmp) Program received signal SIGSEGV, Segmentation fault. 0x000000080470fdf8 in ?? () (gdb) where #0 0x000000080470fdf8 in ?? () #1 0x0000000802e8df30 in ?? () #2 0x000000000010d76b in ?? () #3 0x0000000802008460 in ?? () #4 0x0000000802e8df30 in ?? () #5 0x0000000000001c00 in ?? () #6 0x0000000038c6bb00 in ?? () #7 0x0000000000000601 in ?? () #8 0x0000000000011af3 in ?? () #9 0x0000000000000000 in ?? () (gdb) quit A debugging session is active. Inferior 1 [process 3973] will be killed. 
Quit anyway? (y or n) y ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ /usr/local/bin/valgrind --help === Oh by the way, I added --vex-iropt-register... in the option. valgrind --vex-iropt-register-updates=allregs-at-mem-access --verbose --trace-children=yes --smc-check=all-non-file --gen-suppressions=all --malloc-fill=0xA5 --free-fill=0xC3 --leak-check=full --num-callers=50 --suppressions=$HOME/Dropbox/myown.sup --suppressions=$HOME/Dropbox/myown32.sup --show-possibly-lost=no /NREF-COMM-CENTRAL/objdir-tb3/dist/bin/thunderbird-bin -jsbridge 24242 -foreground -profile /NREF-COMM-CENTRAL/objdir-tb3/_tests/mozmill/mozmillprofile But something segfaults anyway. [...] --5688-- object doesn't have a symbol table --5688-- REDIR: 0x5b76820 (libc.so.6:strncat) redirected to 0x4a26742 (_vgnU_ifunc_wrapper) Segmentation fault ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ --5688-- REDIR: 0x5b72dd0 (libc.so.6:posix_memalign) redirected to 0x4c2de3a (posix_memalign) --5688-- Reading syms from /usr/lib/x86_64-linux-gnu/libtxc_dxtn_s2tc.so.0.0.0 --5688-- object doesn't have a symbol table --5688-- REDIR: 0x5c1bad0 (libc.so.6:__strspn_sse42) redirected to 0x4c33530 (strspn) --5688-- REDIR: 0x5b79c90 (libc.so.6:__memcpy_chk_sse2_unaligned) redirected to 0x4c33220 (__memcpy_chk) [...] === The final part of sstrace output. 
getpid() = 4280
gettid() = 4280
write(1029, "F", 1) = 1
rt_sigprocmask(SIG_SETMASK, [], ~[KILL STOP], 8) = 0
open("/tmp/thunderbird_ishikawa/.parentlock", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP], NULL, 8) = 0
gettid() = 4280
read(1028, "F", 1) = 1
fcntl(6, F_GETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0, l_pid=0}) = 0
fcntl(6, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
lstat("/tmp/thunderbird_ishikawa/lock", 0xffeffe100) = -1 ENOENT (No such file or directory)
uname({sysname="Linux", nodename="ip030", ...}) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffeffbab8} ---
+++ killed by SIGSEGV +++

Is it possible that the signal handler was not completely installed
before a signal (SIGSEGV) was generated?
=====
The above is the situation under .8.0-2-amd6 kernel (Debian GNU/Linux )
---
To my consternation, this is kernel-dependent.
Under the vanilla kernel I created, valgrind runs just fine.

Linux ip030 3.19.5 #1 SMP Mon Apr 20 08:50:21 JST 2015 x86_64 GNU/Linux

Any thoughts?
|
|
From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-15 16:50:57
|
On 2017/02/15 23:32, Tom Hughes wrote:
> On 15/02/17 13:34, ISHIKAWA,chiaki wrote:
>
>> When I tried to run mozilla thunderbird mail client, which I create
>> under Debian GNU/Linux 64-bit,
>> under valgrind, valgrind mysteriously crashed and gdb was not much help.
>
> Well valgrind almost never "mysteriously crashes".
>
> In fact it is usually very verbose when anything goes wrong.
>
Hi,
Thank you for your comment.
That was what I thought back in 2015, too, and I exchanged a few
e-mails with Julian Seward about the issue back then. But we gave up on
it, because the system printed "Segmentation fault" without a usable
trace of anything at all (!), which was quite surprising. We traced
signals and everything else we could think of, using various options
passed to valgrind, and even traced the system calls valgrind was
issuing using strace.
> So the first thing you should do is to tell us in detail exactly what it
> said when it stopped.
Since gdb and the various traces enabled by the options passed to
valgrind are useless (as they were back in 2015),
I traced the system calls issued by valgrind.
There was an mmap call before something went wrong and signal 11 was
raised; then I saw SIGSEGV delivered a dozen times or so, and voila:
"Segmentation fault" back at the shell level.
gdb does not print anything useful at all...
>
>> This happened under the latest 4.8.x kernel which Debian distributed as
>> part of its testing repository.
>>
>> I tried a few things but subsequently reverted to kernel 3.19.5.
>> Now thunderbird under valgrind works (!).
>
> So most likely this is just a new system call that valgrind doesn't
> handle or something, in which case valgrind will have reported all the
> details needed to fix it when it stopped.
That was what I (and Julian Seward) hoped back in 2015, but valgrind did
not. From the debugging I did over the last few months, I figured the
problem I face is just as perplexing as the case back in 2015, so I
took the easy course this time: I decided it would be easier to find out
whether ANYBODY is running a big program under valgrind on the official
Debian GNU/Linux kernel (which I doubt, based on my experience). Also,
Julian Seward mentioned back in 2015 that valgrind could grok
thunderbird under Fedora, so I thought it would be easier to find
someone running 64-bit thunderbird under 64-bit CentOS or Fedora and
compare the kernel configs to figure out what is causing the problem
under Debian's kernel.
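A minimal sketch of the config comparison described above, assuming both
kernels' config files are available as plain text in the usual
"CONFIG_X=y" / "# CONFIG_X is not set" form. The function names and the
file paths in the comment are made up for illustration; nothing here is
specific to valgrind.

```python
def parse_kernel_config(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    suffix = " is not set"
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(a, b):
    """Return {name: (value_in_a, value_in_b)} for every option that differs.

    An option missing from one side is reported with None on that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical usage (paths are placeholders, not taken from this thread):
#   with open("config-debian") as f1, open("config-vanilla") as f2:
#       delta = diff_configs(parse_kernel_config(f1.read()),
#                            parse_kernel_config(f2.read()))
#   for name, (debian, vanilla) in delta.items():
#       print(name, debian, vanilla)
```

The resulting list is only a starting point for bisecting options by
hand; it does not say which difference matters.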
BTW, the following is what I found back in 2015.
------------------------+-----------------------------------------------
Kernel version          | valgrind + C-C TB works or not
------------------------+-----------------------------------------------
Debian 3.2.0...         | works <--- base debian version for wheezy
------------------------+-----------------------------------------------
self-compiled 3.9.0...  | works
------------------------+-----------------------------------------------
self-compiled 3.12.40   | works
------------------------+-----------------------------------------------
self-compiled 3.13.11   | works
------------------------+-----------------------------------------------
self-compiled 3.14.38   | ??? <--- pristine kernel hit the problem
                        | mentioned in the following patch and panicked.
                        | open source is wonderful when it works, but
                        | when it does not...
                        | http://lkml.iu.edu/hypermail/linux/kernel/1407.3/04296.html
------------------------+-----------------------------------------------
self-compiled 3.15.9    | ??? <--- vanilla kernel could not bring up X,
                        | probably for the same reason as above. X did
                        | not start in a few minutes, and so I gave up.
                        | I did not see the kernel panic, though.
------------------------+-----------------------------------------------
Debian backport 3.16 ...| Segmentation fault! [Why? I have no idea.]
------------------------+-----------------------------------------------
Vanilla 3.19.5          | works (worked back in 2015 and now I have to
                        | revert to it...)
------------------------+-----------------------------------------------
This time around, I tried to do something similar with the latest
kernel 4.9.x (vanilla version), hoping it might let valgrind run
thunderbird without the segmentation fault. But that very recent kernel
is not supported as a guest by the VirtualBox utilities, such as the
graphics driver that supports dynamic resizing, and I had to give it up.
(Yes, I am running Debian GNU/Linux inside VirtualBox.)
Sorry, I was so tired of debugging, and the current issue looked so
much like the mysterious problem back in 2015, that I did not bother to
pursue the issue in valgrind per se, but rather wanted to focus on the
kernel issue now.
I am running the |make mozmill| test of thunderbird, which now takes
about 48 hours. Once it is over, I will switch the kernel and gather
the gdb stack trace (which is useless) when valgrind crashes, and
also show the last part of the strace (system call trace) output, which
again is not very revealing.
I am sure you will be perplexed why on earth valgrind is crashing when
we try to run thunderbird under it on Debian's kernel. [I *DID* notice
that there are differences in the Debian kernel, in that it enables
stack protection, for starters. Not sure if it affects Valgrind
operation.]
>
> Tom
>
TIA
|
|
From: Tom H. <to...@co...> - 2017-02-15 15:01:10
|
On 15/02/17 13:34, ISHIKAWA,chiaki wrote:

> When I tried to run mozilla thunderbird mail client, which I create
> under Debian GNU/Linux 64-bit,
> under valgrind, valgrind mysteriously crashed and gdb was not much help.

Well valgrind almost never "mysteriously crashes".

In fact it is usually very verbose when anything goes wrong.

So the first thing you should do is to tell us in detail exactly what it
said when it stopped.

> This happened under the latest 4.8.x kernel which Debian distributed as
> part of its testing repository.
>
> I tried a few things but subsequently reverted to kernel 3.19.5.
> Now thunderbird under valgrind works (!).

So most likely this is just a new system call that valgrind doesn't
handle or something, in which case valgrind will have reported all the
details needed to fix it when it stopped.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
|
|
From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-15 13:49:42
|
Hi,

Thank you for sharing the great debugging tool.

When I tried to run mozilla thunderbird mail client, which I create
under Debian GNU/Linux 64-bit,
under valgrind, valgrind mysteriously crashed and gdb was not much help.

This happened under the latest 4.8.x kernel which Debian distributed as
part of its testing repository.

I tried a few things but subsequently reverted to kernel 3.19.5.
Now thunderbird under valgrind works (!).

The following is an excerpt from a post that I sent to the developer
mailing list of mozilla thunderbird.
If the symptom rings a bell, please let me know.

TIA

--- begin quote

Well, the original problem I had: valgrind crashed when I tried to
invoke it as part of the |make mozmill| test of mozilla thunderbird.

It looks like there is a Debian GNU/Linux kernel issue.
(It occurred a couple of years ago, in 2015, too.)

I had been using the 4.x series since late last year and valgrind did
not work any more, so I reverted to kernel version 3.19.5. [Userland has
been upgraded to work with the 4.x series, and so I am a little
uncomfortable doing this. But I need valgrind to work for debugging.]

Now thunderbird under valgrind works again under 3.19.5

$ uname -a
Linux ip030 3.19.5 #1 SMP Mon Apr 20 08:50:21 JST 2015 x86_64 GNU/Linux

There is something about Debian's kernel config that interferes with
correct valgrind operation. I am not sure what.

For those interested: I can send you the config file that Debian used to
create the kernel images that Debian officially distributes.

On the other hand, if someone uses valgrind on, say, CentOS or Fedora
(kernel 4.x series) and can run thunderbird under it successfully, I
would appreciate a look at the config file used to create the kernel
image there, so that I can compare, tinker with the kernel, and see
whether I can make valgrind run on a modified kernel in the Debian
GNU/Linux environment.

TIA

--- end quote
|
|
From: Not Me <iti...@gm...> - 2017-02-03 11:18:43
|
And now I seem to run into this open issue:
https://bugs.kde.org/show_bug.cgi?id=365327

Please fix that one for us :)

Thanks!

On Fri, Feb 3, 2017 at 12:08 PM, Not Me <iti...@gm...> wrote:

> Never mind, sorry, got it to work already using:
>
> ./configure -disable-tls --enable-only64bit --build=amd64-darwin
|
|
From: Not Me <iti...@gm...> - 2017-02-03 11:08:46
|
Never mind, sorry, got it to work already using:

./configure -disable-tls --enable-only64bit --build=amd64-darwin
|
|
From: Not Me <iti...@gm...> - 2017-02-03 10:59:31
|
Hi all,

Because Homebrew installation of valgrind failed on my Mac, I'm trying to
build from the svn repo by following the steps documented at
http://valgrind.org/downloads/repository.html.

While doing so, I'm running into a build issue. Could somebody let me know
whether this is a supported configuration and possibly even how to fix the
problem?

Thanks a lot!


*Homebrew*:

$ brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than El Capitan due to an upstream incompatibility.
Error: An unsatisfied requirement failed this build.
$ brew install --HEAD valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than El Capitan due to an upstream incompatibility.
Error: An unsatisfied requirement failed this build.


*My versions*:
OS X, Sierra, 10.12.3.
Xcode 8.2.1 (8C1002)

*Build error*:
./autogen.sh
./configure

Maximum build arch: amd64
Primary build arch: amd64
Secondary build arch: x86
Build OS: darwin
Primary build target: AMD64_DARWIN
Secondary build target: X86_DARWIN
Platform variant: vanilla
Primary -DVGPV string: -DVGPV_amd64_darwin_vanilla=1
Default supp files: exp-sgcheck.supp xfree-3.supp xfree-4.supp darwin10-drd.supp darwin16.supp

make

../coregrind/link_tool_exe_darwin 0x38000000 gcc -o
memcheck-x86-darwin -arch i386 -O2 -g -std=gnu99 -Wall
-Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes
-Wmissing-declarations -Wcast-align -Wcast-qual -Wwrite-strings
-Wempty-body -Wformat -Wformat-security -Wignored-qualifiers
-fno-stack-protector -fno-strict-aliasing -fno-builtin -Wno-cast-align
-Wno-self-assign -Wno-tautological-compare -mmacosx-version-min=10.6
-fno-stack-protector -fno-pic -fno-PIC -O2 -nodefaultlibs -nostartfiles
-Wl,-u,__start -Wl,-e,__start -arch i386 memcheck_x86_darwin-mc_leakcheck.o
memcheck_x86_darwin-mc_malloc_wrappers.o memcheck_x86_darwin-mc_main.o
memcheck_x86_darwin-mc_translate.o memcheck_x86_darwin-mc_machine.o
memcheck_x86_darwin-mc_errors.o ../coregrind/libcoregrind-x86-darwin.a
../VEX/libvex-x86-darwin.a -lgcc
link_tool_exe_darwin: /usr/bin/ld -static -arch i386 -macosx_version_min
10.6 -o memcheck-x86-darwin -u __start -e __start -image_base 0x38000000
-stack_addr 0x34000000 -stack_size 0x800000 memcheck_x86_darwin-mc_leakcheck.o
memcheck_x86_darwin-mc_malloc_wrappers.o memcheck_x86_darwin-mc_main.o
memcheck_x86_darwin-mc_translate.o memcheck_x86_darwin-mc_machine.o
memcheck_x86_darwin-mc_errors.o ../coregrind/libcoregrind-x86-darwin.a
../VEX/libvex-x86-darwin.a
Undefined symbols for architecture i386:
  "___ctzdi2", referenced from:
      _doRegisterAllocation in libvex-x86-darwin.a(libvex_x86_darwin_a-host_generic_reg_alloc2.o)
ld: symbol(s) not found for architecture i386
make[3]: *** [memcheck-x86-darwin] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
|
|
From: Mark W. <mj...@re...> - 2017-02-02 17:27:04
|
Hi Hackers, users and valgrind enthusiasts,

This is a reminder that our valgrind devroom meeting at Fosdem,
Brussels, is this Saturday. For details, see below.

Also please let me know if you have any issues you want to discuss
during the valgrind hackaton event (even if you cannot come, you
can submit ideas!)

If you are giving a presentation earlier in the day, but have some
bonus slides that you might not have time to present (we will be
strict on time), then feel free to send them to me and/or feel free
to suggest you present the extra bonus material at the hackaton.

On Mon, Dec 12, 2016 at 01:46:39PM +0100, Mark Wielaard wrote:
> Hi Valgrind users, hackers and enthusiasts,
>
> We have a program for the valgrind devroom that will take place during
> Fosdem in Brussels on Saturday 5 February 2017:
> https://fosdem.org/2017/schedule/track/valgrind/
>
> The last session will be the Valgrind BoF and Hackaton!
>
> We need your help for that session! The simplest way to help is to just
> show up and discuss anything you like and then start hacking! But it is
> also fun to have a little structure to the session by letting us know
> beforehand what you think we should discuss and/or hack on.
>
> Here is the current description, please help improve it:
>
> Valgrind BoF and Hackaton
> Open discussion of ideas for Valgrind - and then we hack!
> https://fosdem.org/2017/schedule/event/valgrind_hackaton/
>
> Come and hack on Valgrind together. Open discussion about small (or big)
> ideas to improve or change Valgrind.
>
> Valgrind developers and users are encouraged to participate either by
> submitting ideas/suggestions or by joining the discussion. And of course
> by kindly (or bitterly) complaining about bugs you find important that
> are still Not YET solved for that many years!?@!!!
>
> Afterwards we will sit together and try to fix or implement some of the
> things discussed.
>
>
> Discuss any kind of possible improvement (technical or functional) to
> Valgrind.
>
> If you want to put something on the agenda please send a small
> description (one or two paragraphs) to the moderator Mark Wielaard
> mj...@re... with the subject: "FOSDEM devroom discuss: ..." If you
> want to discuss a somewhat larger topic please do feel free to send two
> or three slides in advance.
>
> Mark will collect ideas/suggestions/... and present these and coordinate
> the discussion (and keep track of the time, so every idea will be
> discussed).
>
> Some discussion topic ideas:
>
> * Release/bugfixing strategy/policy.
> * Can we move to git yet?
> * Valgrind and transactional memory.
> * Making Valgrind really multi-threaded, parallelising Memcheck,
>   parallelising the rest of the framework, and tools.
> * Instant leak detector. Modify memcheck to report the last leaked
>   pointer to a block. Integrate "omega" as a memcheck option or
>   omega as a separate tool.
>   http://www.brainmurders.eclipse.co.uk/omega.html
> * Make Callgrind work sanely on ARM (and PPC). The Callgrind
>   algorithm to track call and return is to be improved to work
>   properly on these platforms. Is there a way to make this better?
>   E.g. by having a fast way working in most cases, and rely on
>   unwind info in the difficult cases. Can we detect at
>   instrumentation time that an instruction is a difficult case?
> * Packaging valgrind for distros, handling patches, suppressions,
>   etc.
> * 32-bit x86 vs modern instruction sets (avx, etc.)
> * VEX is in theory cross-architecture. What would it take to make
>   valgrind cross-arch? How about starting with i686 on x86_64?
> * Which CPUID is it anyway? Valgrind isn't completely consistent
>   in handling host CPU capabilities vs VEX emulation capabilities.
>   What can we do to improve that? Make it user tunable?
> * <YOUR SUGGESTION HERE!>
>
> And now is the time on Sprockets when we hack!
|
|
From: Julian S. <js...@ac...> - 2017-01-19 15:53:49
|
> On 13/01/17 23:03, Philippe Waroquiers wrote:
> So, no developer, no user, no company person telling anything,
> and public information telling the chip will be replaced.

I agree, I think this port should be removed. A whole year with
absolutely no signs of user or maintainer activity is, in my view,
long enough.

J
|
|
From: Nagendra K. G. (न. ग. <nag...@gm...> - 2017-01-17 17:17:13
|
Please see below. sgemm comes from openblas : http://www.openblas.net/

==17348== Memcheck, a memory error detector
==17348== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==17348== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==17348== Command: ./a.out new.conf seg
==17348==
==== SB 0 (evchecks 0) [tid 1] 0x40012d0 UNKNOWN_FUNCTION /lib/x86_64-linux-gnu/ld-2.19.so+0x12d0
==== SB 1 (evchecks 1) [tid 1] 0x4004ac5 _dl_start+85 /lib/x86_64-linux-gnu/ld-2.19.so+0x4ac5
==== SB 2 (evchecks 2) [tid 1] 0x4004b1f _dl_start+175 /lib/x86_64-linux-gnu/ld-2.19.so+0x4b1f
==== SB 3 (evchecks 3) [tid 1] 0x4004b08 _dl_start+152 /lib/x86_64-linux-gnu/ld-2.19.so+0x4b08
==== SB 4 (evchecks 6) [tid 1] 0x4004b25 _dl_start+181 /lib/x86_64-linux-gnu/ld-2.19.so+0x4b25
.
.
.
==== SB 38875 (evchecks 11190178207) [tid 1] 0xda8f94 sgemm_tn+1316 a.out+0xda8f94
==== SB 38876 (evchecks 11190178208) [tid 1] 0xda8c82 sgemm_tn+530 a.out+0xda8c82
==== SB 38877 (evchecks 11190178209) [tid 1] 0xda8cbb sgemm_tn+587 a.out+0xda8cbb
==== SB 38878 (evchecks 11190178210) [tid 1] 0x20c83f0 sgemm_incopy_HASWELL a.out+0x20c83f0
==== SB 38879 (evchecks 11190178211) [tid 1] 0x20c843f sgemm_incopy_HASWELL+79 a.out+0x20c843f
==== SB 38880 (evchecks 11190178212) [tid 1] 0x20c8565 sgemm_incopy_HASWELL+373 a.out+0x20c8565
==== SB 38881 (evchecks 11190178213) [tid 1] 0x20c856b sgemm_incopy_HASWELL+379 a.out+0x20c856b
==== SB 38882 (evchecks 11190178214) [tid 1] 0x20c86cb sgemm_incopy_HASWELL+731 a.out+0x20c86cb
==== SB 38883 (evchecks 11190178215) [tid 1] 0x20c8813 sgemm_incopy_HASWELL+1059 a.out+0x20c8813
==== SB 38884 (evchecks 11190178216) [tid 1] 0x20c8588 sgemm_incopy_HASWELL+408 a.out+0x20c8588
==== SB 38885 (evchecks 11190178217) [tid 1] 0x20c86ee sgemm_incopy_HASWELL+766 a.out+0x20c86ee
==== SB 38886 (evchecks 11190178218) [tid 1] 0x20c8833 sgemm_incopy_HASWELL+1091 a.out+0x20c8833
==== SB 38887 (evchecks 11190178327) [tid 1] 0x20c887f sgemm_incopy_HASWELL+1167 a.out+0x20c887f
==== SB 38888 (evchecks 11190178328) [tid 1] 0x20c8a0d sgemm_incopy_HASWELL+1565 a.out+0x20c8a0d
==== SB 38889 (evchecks 11190178329) [tid 1] 0x20c8a2a sgemm_incopy_HASWELL+1594 a.out+0x20c8a2a
==== SB 38890 (evchecks 11190179650) [tid 1] 0x20c8a40 sgemm_incopy_HASWELL+1616 a.out+0x20c8a40
==== SB 38891 (evchecks 11190179651) [tid 1] 0x20c8a6c sgemm_incopy_HASWELL+1660 a.out+0x20c8a6c
==== SB 38892 (evchecks 11190179652) [tid 1] 0x20c8ac1 sgemm_incopy_HASWELL+1745 a.out+0x20c8ac1
==== SB 38893 (evchecks 11190179653) [tid 1] 0x20c8bf9 sgemm_incopy_HASWELL+2057 a.out+0x20c8bf9
==== SB 38894 (evchecks 11190179654) [tid 1] 0x20c8ad8 sgemm_incopy_HASWELL+1768 a.out+0x20c8ad8
==== SB 38895 (evchecks 11190179763) [tid 1] 0x20c8c05 sgemm_incopy_HASWELL+2069 a.out+0x20c8c05
==== SB 38896 (evchecks 11190179764) [tid 1] 0x20c8c4b sgemm_incopy_HASWELL+2139 a.out+0x20c8c4b
==== SB 38897 (evchecks 11190179765) [tid 1] 0x20c8c5c sgemm_incopy_HASWELL+2156 a.out+0x20c8c5c
==== SB 38898 (evchecks 11190179766) [tid 1] 0x20c8ca1 sgemm_incopy_HASWELL+2225 a.out+0x20c8ca1
==== SB 38899 (evchecks 11190179767) [tid 1] 0x20c8cb8 sgemm_incopy_HASWELL+2248 a.out+0x20c8cb8
==== SB 38900 (evchecks 11190179876) [tid 1] 0x20c8d09 sgemm_incopy_HASWELL+2329 a.out+0x20c8d09
==== SB 38901 (evchecks 11190179877) [tid 1] 0x20c8d43 sgemm_incopy_HASWELL+2387 a.out+0x20c8d43
==== SB 38902 (evchecks 11190179878) [tid 1] 0x20c8dd1 sgemm_incopy_HASWELL+2529 a.out+0x20c8dd1
==== SB 38903 (evchecks 11190179879) [tid 1] 0x20c8e29 sgemm_incopy_HASWELL+2617 a.out+0x20c8e29
==== SB 38904 (evchecks 11190179880) [tid 1] 0xda8cec sgemm_tn+636 a.out+0xda8cec
==== SB 38905 (evchecks 11190179881) [tid 1] 0xda8cfc sgemm_tn+652 a.out+0xda8cfc
==== SB 38906 (evchecks 11190179882) [tid 1] 0xda8d34 sgemm_tn+708 a.out+0xda8d34
==== SB 38907 (evchecks 11190179883) [tid 1] 0x20c9610 sgemm_oncopy_HASWELL a.out+0x20c9610
==== SB 38908 (evchecks 11190179884) [tid 1] 0x20c9650 sgemm_oncopy_HASWELL+64 a.out+0x20c9650
==== SB 38909 (evchecks 11190179885) [tid 1] 0x20c9709 sgemm_oncopy_HASWELL+249 a.out+0x20c9709
==== SB 38910 (evchecks 11190179886) [tid 1] 0x20c9720 sgemm_oncopy_HASWELL+272 a.out+0x20c9720
==== SB 38911 (evchecks 11190179940) [tid 1] 0x20c9814 sgemm_oncopy_HASWELL+516 a.out+0x20c9814
==== SB 38912 (evchecks 11190179941) [tid 1] 0x20c987d sgemm_oncopy_HASWELL+621 a.out+0x20c987d
==== SB 38913 (evchecks 11190179942) [tid 1] 0x20c9897 sgemm_oncopy_HASWELL+647 a.out+0x20c9897
==== SB 38914 (evchecks 11190180053) [tid 1] 0x20c98a8 sgemm_oncopy_HASWELL+664 a.out+0x20c98a8
==== SB 38915 (evchecks 11190180054) [tid 1] 0x20c998a sgemm_oncopy_HASWELL+890 a.out+0x20c998a
==== SB 38916 (evchecks 11190180055) [tid 1] 0x20c9a0b sgemm_oncopy_HASWELL+1019 a.out+0x20c9a0b
==== SB 38917 (evchecks 11190180056) [tid 1] 0xda8d7c sgemm_tn+780 a.out+0xda8d7c
==== SB 38918 (evchecks 11190180057) [tid 1] 0x20bfa00 sgemm_kernel_HASWELL a.out+0x20bfa00
==== SB 38919 (evchecks 11190180058) [tid 1] 0x20bfa44 sgemm_kernel_HASWELL+68 a.out+0x20bfa44
==== SB 38920 (evchecks 11190180059) [tid 1] 0x20bfa4e sgemm_kernel_HASWELL+78 a.out+0x20bfa4e
==== SB 38921 (evchecks 11190180060) [tid 1] 0x20bfa58 sgemm_kernel_HASWELL+88 a.out+0x20bfa58
==== SB 38922 (evchecks 11190180061) [tid 1] 0x20bfa98 sgemm_kernel_HASWELL+152 a.out+0x20bfa98
==== SB 38923 (evchecks 11190180062) [tid 1] 0x20bfac0 sgemm_kernel_HASWELL+192 a.out+0x20bfac0
==== SB 38924 (evchecks 11190180172) [tid 1] 0x20bfae3 sgemm_kernel_HASWELL+227 a.out+0x20bfae3
==== SB 38925 (evchecks 11190180173) [tid 1] 0x20bfb0a sgemm_kernel_HASWELL+266 a.out+0x20bfb0a
==== SB 38926 (evchecks 11190180174) [tid 1] 0x20bfb2c sgemm_kernel_HASWELL+300 a.out+0x20bfb2c
==== SB 38927 (evchecks 11190180175) [tid 1] 0x20bfc63 sgemm_kernel_HASWELL+611 a.out+0x20bfc63

valgrind: m_translate.c:1772 (vgPlain_translate): Assertion 'tres.status == VexTransOK' failed.

host stacktrace:
==17348== at 0x38089E9A: show_sched_status_wrk (m_libcassert.c:343)
==17348== by 0x38089FB4: report_and_quit (m_libcassert.c:419)
==17348== by 0x3808A14A: vgPlain_assert_fail (m_libcassert.c:485)
==17348== by 0x380AA31C: vgPlain_translate (m_translate.c:1772)
==17348== by 0x380DFBBB: handle_chain_me (scheduler.c:1076)
==17348== by 0x380E1947: vgPlain_scheduler (scheduler.c:1420)
==17348== by 0x380F18F0: thread_wrapper (syswrap-linux.c:103)
==17348== by 0x380F18F0: run_a_thread_NORETURN (syswrap-linux.c:156)

sched status:
running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 17348)
==17348== at 0x20BFC63: sgemm_kernel_HASWELL (in a.out)

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using. Thanks.

On Tue, Jan 17, 2017 at 6:12 AM, Julian Seward <js...@ac...> wrote:

>
> > valgrind: m_translate.c:1772 (vgPlain_translate): Assertion 'tres.status ==
> > VexTransOK' failed.
>
> That is a very strange failure. It might just be believable that the front
> end failed somehow whilst parsing handwritten assembly in this file
>
> > Thread 3: status = VgTs_Runnable (lwpid 17216)
> > ==17214== at 0x1497463: sgemm_kernel (sgemm_kernel_16x4_haswell.S:1284)
>
> Can you re-run with --trace-flags=10000000 and mail the last 50 lines
> before the failure here? I want to see what bit of code it was trying to
> process at the time it failed.
>
> J
>
|
|
From: Julian S. <js...@ac...> - 2017-01-17 11:12:12
|
> valgrind: m_translate.c:1772 (vgPlain_translate): Assertion 'tres.status ==
> VexTransOK' failed.

That is a very strange failure. It might just be believable that the front
end failed somehow whilst parsing handwritten assembly in this file

> Thread 3: status = VgTs_Runnable (lwpid 17216)
> ==17214== at 0x1497463: sgemm_kernel (sgemm_kernel_16x4_haswell.S:1284)

Can you re-run with --trace-flags=10000000 and mail the last 50 lines before
the failure here? I want to see what bit of code it was trying to process
at the time it failed.

J
|
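One way to capture only the tail of such a run is sketched below, under
the assumption that the trace goes to stderr (Valgrind's default log
destination) and that the last lines before the process exits are the
ones of interest. The helper names and the script name in the comment
are invented for this sketch; a plain `2>&1 | tail -n 50` in the shell
does the same job.

```python
import collections
import subprocess
import sys


def tail_lines(lines, n=50):
    """Return the last n items from an iterable of lines."""
    return list(collections.deque(lines, maxlen=n))


def run_and_tail(cmd, n=50):
    """Run cmd with stderr merged into stdout; return its last n output lines."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # Valgrind logs to stderr by default
        text=True,
        errors="replace",
    )
    last = tail_lines(proc.stdout, n)
    proc.wait()
    return last


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation (program arguments as in the log in this thread):
    #   python3 tail_trace.py valgrind --trace-flags=10000000 ./a.out new.conf seg
    sys.stdout.writelines(run_and_tail(sys.argv[1:]))
```

Only a bounded number of lines is held in memory, which matters here
because a superblock trace of a long run can be very large.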
|
From: John R. <jr...@bi...> - 2017-01-16 16:13:21
|
On 01/16/2017 07:14 AM, Nagendra Kumar Goel (नगेन्द्र गोयल) wrote:
> I am trying to run valgrind on a multithreaded websockets based program, to check for memory leaks. Valgrind crashes in the middle of the program with following errors:
>
> --17214-- REDIR: 0x6e44620 (libc.so.6:__memcpy_chk) redirected to 0x4a28770 (_vgnU_ifunc_wrapper)
> --17214-- REDIR: 0x6e7c2a0 (libc.so.6:__memcpy_chk_avx_unaligned) redirected to 0x4c34e10 (__memcpy_chk)
>
> valgrind: m_translate.c:1772 (vgPlain_translate): Assertion 'tres.status == VexTransOK' failed.

There is no workaround. The bug must be fixed.
Please file a bug report. https://valgrind.org > Bug reports
Copy+paste from your message into the bug report, add a title.
PLEASE INCLUDE THE VERSION OF VALGRIND and the operating system (uname -a).
|
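A small sketch of gathering those two pieces of information in one go,
assuming `valgrind` is on the PATH. The function names are invented for
illustration, and `platform.uname()` is used as a rough stand-in for
`uname -a`.

```python
import platform
import subprocess


def valgrind_version():
    """Return the output of `valgrind --version`, or a note if it cannot be run."""
    try:
        result = subprocess.run(
            ["valgrind", "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError as exc:  # e.g. valgrind is not installed
        return "unavailable ({})".format(exc)
    return (result.stdout or result.stderr).strip()


def report_header():
    """Build the two lines a bug report should start with."""
    u = platform.uname()
    system = " ".join([u.system, u.node, u.release, u.version, u.machine])
    return "valgrind: {}\nsystem: {}".format(valgrind_version(), system)


print(report_header())
```

The printed header can be pasted at the top of the bug report, followed
by the assertion message and stack traces.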
|
From: Nagendra K. G. (न. ग. <nag...@gm...> - 2017-01-16 15:14:09
|
I am trying to run valgrind on a multithreaded websockets based program, to check for memory leaks. Valgrind crashes in the middle of the program with following errors:

--17214-- REDIR: 0x6e44620 (libc.so.6:__memcpy_chk) redirected to 0x4a28770 (_vgnU_ifunc_wrapper)
--17214-- REDIR: 0x6e7c2a0 (libc.so.6:__memcpy_chk_avx_unaligned) redirected to 0x4c34e10 (__memcpy_chk)

valgrind: m_translate.c:1772 (vgPlain_translate): Assertion 'tres.status == VexTransOK' failed.

host stacktrace:
==17214==    at 0x38086843: show_sched_status_wrk (m_libcassert.c:378)
==17214==    by 0x38086944: report_and_quit (m_libcassert.c:449)
==17214==    by 0x38086AD1: vgPlain_assert_fail (m_libcassert.c:515)
==17214==    by 0x380A5606: vgPlain_translate (m_translate.c:1772)
==17214==    by 0x380DB83B: handle_chain_me (scheduler.c:1076)
==17214==    by 0x380DD36F: vgPlain_scheduler (scheduler.c:1420)
==17214==    by 0x380EC716: thread_wrapper (syswrap-linux.c:103)
==17214==    by 0x380EC716: run_a_thread_NORETURN (syswrap-linux.c:156)
==17214==    by 0x380EC9AA: vgModuleLocal_start_thread_NORETURN (syswrap-linux.c:329)
==17214==    by 0x3811619D: ??? (in /usr/local/lib/valgrind/memcheck-amd64-linux)
==17214==    by 0xDEADBEEFDEADBEEE: ???
==17214==    by 0xDEADBEEFDEADBEEE: ???
==17214==    by 0xDEADBEEFDEADBEEE: ???

sched status:
running_tid=3

Thread 1: status = VgTs_WaitSys (lwpid 17214)
==17214==    at 0x516C9DD: pthread_join (pthread_join.c:90)
==17214==    by 0x6546B96: std::thread::join() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==17214==    by 0xE6A489: WebSocketServer::StartThreadService() (WebSocket-Server.cc:364)
==17214==    by 0xE6A11D: WebSocketServer::run() (WebSocket-Server.cc:321)
==17214==    by 0xE5EFCA: main (kws-WebSocket-Server.cc:37)

Thread 2: status = VgTs_WaitSys (lwpid 17215)
==17214==    at 0x6E29B5D: ??? (syscall-template.S:84)
==17214==    by 0x1612440: _lws_plat_service_tsi (lws-plat-unix.c:147)
==17214==    by 0x1600AAD: lws_service_tsi (service.c:1159)
==17214==    by 0xE6A14B: WebSocketServer::ThreadService(unsigned int) (WebSocket-Server.cc:325)
==17214==    by 0xE74AC1: void std::_Mem_fn_base<void (WebSocketServer::*)(unsigned int), true>::operator()<unsigned int, void>(WebSocketServer*, unsigned int&&) const (functional:600)
==17214==    by 0xE74A3E: void std::_Bind_simple<std::_Mem_fn<void (WebSocketServer::*)(unsigned int)> (WebSocketServer*, unsigned int)>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (functional:1531)
==17214==    by 0xE748F5: std::_Bind_simple<std::_Mem_fn<void (WebSocketServer::*)(unsigned int)> (WebSocketServer*, unsigned int)>::operator()() (functional:1520)
==17214==    by 0xE74885: std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (WebSocketServer::*)(unsigned int)> (WebSocketServer*, unsigned int)> >::_M_run() (thread:115)
==17214==    by 0x6546C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==17214==    by 0x516B709: start_thread (pthread_create.c:333)
==17214==    by 0x6E3582C: clone (clone.S:109)

Thread 3: status = VgTs_Runnable (lwpid 17216)
==17214==    at 0x1497463: sgemm_kernel (sgemm_kernel_16x4_haswell.S:1284)

I have tried changing the operating system (from fedora to mint and ubuntu). I also checked out the latest valgrind

valgrind --version
valgrind-3.13.0.SVN

to see if that will help me alleviate the issue. It did not help.

Can someone please advise how to get around this issue.
|