From: ISHIKAWA,chiaki <ish...@yk...> - 2017-02-19 07:39:03
Hi,
Thank you again.
I will hopefully upload the requested info next week.
Here is what I can write down today.
What would be the appropriate upload service? [The data would be too
large for e-mail to the list.]
On 2017/02/19 7:32, John Reiser wrote:
> How many failures occur in 10 runs of thunderbird under valgrind?
10 times, i.e., every time under Debian's stock newer kernel.
> How many failures occur in 10 runs if you reboot just before each run?
It never occurred to me to reboot the system before retrying.
I will check this next week. (Given the tests I did over the last few
months by SWITCHING kernel versions, which meant rebooting into a
different revision each time, I would say 10 times, i.e., every time,
but again let me check.)
>
> Thunderbird is a user mail agent that uses interactive graphics.
> How many failures occur before the display window appears, and how many after?
There is one issue: I am seeing the valgrind failure when I run the
thunderbird test suite. The complicating factor is that, aside from the
user interaction available through the GUI under X, during the
execution of the |make mozmill| test suite there is a daemon that runs
test scripts and talks to the main TB binary via a COM interface. [I stay
away from the keyboard and mouse during tests to avoid interfering with
the test suite run. I do this by invoking a virtual X desktop using
Xephyr: the test suite run under valgrind is done in that virtual desktop.
If I wanted to, I COULD interact with thunderbird's GUI via the mouse
explicitly. I did this a few times when a bug in thunderbird or the test
scripts made the execution hang waiting for confirmation of a modal
dialog, etc.]
From what I did, the crash always occurs before the display window of the
tested thunderbird appears [that is, every time valgrind printed the
mysterious segmentation error under the newer Debian kernel].
> Are the symptoms and frequency the same for a Radeon card as for NVidia?
> On the open-source NVidia driver versus the proprietary driver?
> In "dumb framebuffer" mode ("no" acceleration)?
> Please tell us which cards: "lspci -nn | grep VGA" or similar.
I am using Debian GNU/Linux as the guest OS inside VirtualBox,
installed under Windows 10, as a platform to
develop and test thunderbird patches.
So the video graphics driver relevant here is the VirtualBox video
driver, I think, correct? (But there was a puzzling message in Xorg.0.log.
I will mention it in the answer to your second-to-last question.)
Under 3.19.5 kernel where the valgrind + thunderbird test suite works:
$ lspci -nn | grep VGA
00:02.0 VGA compatible controller [0300]: InnoTek Systemberatung GmbH
VirtualBox Graphics Adapter [80ee:beef]
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$
(InnoTek is the name of the original VirtualBox developer.)
I am not sure if I can remove the above virtualbox graphics adaptor and
revert to the plain VGA adaptor emulation done by VirtualBox, but let me
try.
> Are the symptoms and frequency the same for Firefox as for thunderbird?
I am not developing or creating patches for Firefox. Sorry.
> Are the symptoms and frequency the same for Chrome as for thunderbird?
Ditto.
Oh, you mean to ask whether I can run a very simple
valgrind firefox-binary (without any test harness involvement) under the
new kernel and see whether it works?
Then I can test it.
But Chrome: I have never even installed it.
> Please present a histogram of the {mapped file, pc offset, instruction stream}
> when the SIGSEGV happens. [You should have at least 70 runs by now: 10 each
> for thunderbird plain, with reboot, other graphics card, other NVidia driver,
> dumb framebuffer, Firefox, Chrome.]
OK, I will gather the data (not sure what you mean by "histogram", but I
will gather what I think is relevant.)
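One possible reading of "histogram" here is a simple tally of how often
each {mapped file, pc offset, instruction stream} combination shows up
across the runs. A minimal Python sketch of such a tally follows; the
record field names (mapped_file, pc_offset, insn) are made up for
illustration and are not valgrind output fields.

```python
from collections import Counter


def crash_histogram(records):
    """Count how often each (mapped file, pc offset, instruction bytes) occurs.

    Each record is a dict with the hypothetical keys "mapped_file",
    "pc_offset" (an int) and "insn" (a string of instruction bytes).
    """
    return Counter((r["mapped_file"], r["pc_offset"], r["insn"]) for r in records)


def format_histogram(hist):
    """Render the tally as text, most frequent combination first."""
    lines = []
    for (mapped_file, pc_offset, insn), count in hist.most_common():
        lines.append(f"{count:3d}  {mapped_file} +{pc_offset:#x}  {insn}")
    return "\n".join(lines)
```

One record per failed run would be appended by hand or by a script, and
print(format_histogram(crash_histogram(records))) would then show whether
the failures cluster at one address or are spread out.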
10 each
for thunderbird plain,
with reboot [I will certainly reboot before the test run.]
x 10 times with the above InnoTek driver (built-in for VirtualBox).
[I am not sure if SIGSEGV happens under this setup.]
for thunderbird + test suite hookup.
I am quite certain that SIGSEGV happens under this setup.
BTW, DOES ANYONE HAVE A GOOD IDEA ABOUT HOW TO CAPTURE the mapped
file, etc WHEN SIGSEGV happens? It is very dynamic and by the time I am
ready to type in shell commands, the child binary that experienced it
may be gone. Yes, I have not been able to figure out exactly which
process under the test suite setup started by thunderbird (under
valgrind) is experiencing a difficulty.
I guess some clever hacking via gdb gets me started there?
BTW, valgrind's --gdb-* options are meant to debug the target under
valgrind, NOT the segfault of valgrind itself, correct?
[And the whole thing including valgrind works under kernel 3.19.5 and
not under later kernels, which drives me crazy.]
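One way to avoid the race with the shell might be to poll /proc in the
background and keep the latest copy of /proc/<pid>/maps for every
matching process, so that the last snapshot survives after the process
is gone. A rough Python sketch, assuming the failing process has the
chosen keyword in its command line; the function names, the output
directory and the polling interval are arbitrary choices, and a snapshot
taken this way can be slightly older than the moment of the SIGSEGV.

```python
import time
from pathlib import Path


def snapshot_maps(proc_root, out_dir, keyword):
    """Save the maps of every process whose command line contains keyword.

    Returns the sorted list of pids that were saved. Each call overwrites
    the previous snapshot of a pid, so the newest copy is the one kept.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            raw = (entry / "cmdline").read_bytes()
            cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
            if keyword not in cmdline:
                continue
            maps = (entry / "maps").read_text(errors="replace")
        except OSError:
            continue  # the process exited or is not readable
        (out / f"{entry.name}.maps").write_text(f"# {cmdline}\n{maps}")
        saved.append(int(entry.name))
    return sorted(saved)


def watch(keyword, out_dir="maps-snapshots", interval=0.2, proc_root="/proc"):
    """Poll until interrupted (Ctrl-C), refreshing the snapshots each round."""
    while True:
        snapshot_maps(proc_root, out_dir, keyword)
        time.sleep(interval)
```

Running watch("thunderbird") in a second terminal before starting the
test suite would leave one <pid>.maps file per matching process, including
the ones that have already died.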
> other graphics card, other NVidia driver,
These won't apply.
for thunderbird plain,
dumb framebuffer [IF THIS SETUP IS FEASIBLE under VirtualBox.]
after reboot
for thunderbird + test suite hookup.
dumb framebuffer [IF THIS SETUP IS FEASIBLE under VirtualBox.]
after reboot
> Firefox,
I think without any test suite hookup, or anything, I can
simply run Firefox ESR now available from Debian GNU/Linux repository.
I suspect without any test suite hookup, it will run.
Anyway, I will try to compare the
mmap status under firefox with stock VirtualBox graphics driver, and
mmap status under firefox with dumb framebuffer [IF THIS IS FEASIBLE.]
after reboot.
> Chrome.
It looks like there is a package of Chrome for Ubuntu.
Maybe I can install it under Debian.
However, this can wait, I think.
At the same time, it would be very instructive to compare the memory
maps while chrome is running [AFTER REBOOT]
with the ones while mozilla software {thunderbird, firefox} is running.
> thunderbird is not available from the Debian stable "jessie" repository
> (Debian 8.7.1, 2017-01-20.) Where did you get it?
Sorry I was not clear about it.
I have fetched so-called comm-central thunderbird repository and
have been building it locally [64-bit] for testing purposes to fix some
serious bugs I experienced.
The instructions for building thunderbird locally are at the following
URL and I have basically followed them.
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Simple_Thunderbird_build
"Basically" means that I had to tweak the so-called "mozconfig" in many
ways, especially to enable a valgrind-friendly build.
A very brief explanation is at the following URL:
https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Valgrind
The above refers to the |mochitest| test for firefox.
Since thunderbird lives in a different source directory and
uses a very different test suite setup based on mozmill, there are
quirks, and modifications one needs to make to the source files and
scripts, in order to run thunderbird under valgrind.
It seems that, at one time, somebody hacked the thunderbird test suite
to run valgrind/memcheck for thunderbird, but it was abandoned and
nobody seems to recall how it was exactly done or how to update the
scripts, etc.
So basically, what I do myself to run thunderbird is
- rename the original thunderbird binary to something else, and
- in its place, put a binary that invokes the original
thunderbird binary under valgrind/memcheck with the supplied parameters.
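The wrapper idea above could look roughly like the following Python
sketch. It is not the actual wrapper binary: the ".orig" suffix for the
renamed binary and the particular valgrind options (--trace-children=yes,
--log-file) are assumptions chosen for illustration.

```python
import os

# Assumption: the real binary was renamed by appending this suffix.
REAL_BINARY_SUFFIX = ".orig"


def build_command(wrapper_path, args):
    """Build the valgrind command line for the renamed original binary."""
    real_binary = wrapper_path + REAL_BINARY_SUFFIX
    return [
        "valgrind",
        "--tool=memcheck",
        "--trace-children=yes",          # also follow child processes
        "--log-file=valgrind-%p.log",    # one log per pid (%p expands to the pid)
        real_binary,
        *args,                           # pass the caller's parameters through
    ]


def main(argv):
    """Replace this process with valgrind running the original binary."""
    cmd = build_command(os.path.abspath(argv[0]), argv[1:])
    os.execvp(cmd[0], cmd)

# When installed in place of the binary, the file would end with:
#     import sys; main(sys.argv)
```

Because the wrapper uses exec, the test harness still sees a single
process under the expected name and never needs to know valgrind is there.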
This trick has worked very well and many bugs/issues were found in the
last several years until 2015 when I first experienced the strange
problem of valgrind failure. And back then,
I realized it was related to the kernel version.
The locally created kernel 3.19.5 saved the day.
But the world has moved on to 4.x series kernel since then, and when I
updated the kernel last summer this problem reappeared.
I have reverted the kernel to 3.19.5 for the moment, but I am not sure
how long I can stick to the older kernel.
If you need a thunderbird binary to test on your end, I can certainly
make it available.
Actually, I occasionally run the test (without valgrind) inside mozilla's
compilation/testing farm. [This makes it possible for me to
compile/test the OSX and Windows versions. This is a necessary
step before a patch is accepted into mozilla's source tree.]
You can fetch the binary from there. Please let me know if you would like
to do that.
> Which kernel modules have been loaded (lsmod)?
Under 3.19.5
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$ uname -a
Linux ip030 3.19.5 #1 SMP Mon Apr 20 08:50:21 JST 2015 x86_64 GNU/Linux
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$ lsmod
Module Size Used by
fuse 72030 1
btrfs 731518 0
xor 21081 1 btrfs
raid6_pq 95431 1 btrfs
ufs 59011 0
qnx4 13100 0
hfsplus 81692 0
hfs 45988 0
minix 27622 0
ntfs 160179 0
vfat 17270 0
msdos 17077 0
fat 50634 2 vfat,msdos
jfs 137440 0
xfs 667205 0
libcrc32c 12426 1 xfs
ext3 151975 0
jbd 52800 1 ext3
ext2 59160 0
dm_mod 77808 0
vboxsf 37355 1
mptctl 29762 0
mptbase 56835 1 mptctl
binfmt_misc 12846 1
ghash_clmulni_intel 13019 0
aesni_intel 163983 0
ppdev 12724 0
joydev 17107 0
iTCO_wdt 12831 0
iTCO_vendor_support 12704 1 iTCO_wdt
aes_x86_64 16719 1 aesni_intel
ablk_helper 12572 1 aesni_intel
cryptd 14600 3 ghash_clmulni_intel,aesni_intel,ablk_helper
lrw 12871 1 aesni_intel
evdev 17518 14
gf128mul 13047 1 lrw
glue_helper 12773 1 aesni_intel
microcode 30394 0
snd_intel8x0 30885 2
psmouse 83740 0
serio_raw 12894 0
pcspkr 12595 0
snd_ac97_codec 102547 1 snd_intel8x0
snd_pcm 73065 2 snd_ac97_codec,snd_intel8x0
snd_timer 22641 1 snd_pcm
snd 53213 8 snd_ac97_codec,snd_intel8x0,snd_timer,snd_pcm
soundcore 13031 1 snd
sg 29968 0
ac97_bus 12510 1 snd_ac97_codec
processor 28021 0
lpc_ich 20905 0
mfd_core 12601 1 lpc_ich
video 18144 0
rng_core 12880 0
vboxvideo 36417 2
vboxguest 181315 6 vboxsf,vboxvideo
thermal_sys 28310 2 video,processor
ttm 61967 1 vboxvideo
drm_kms_helper 74527 1 vboxvideo
drm 229484 5 ttm,drm_kms_helper,vboxvideo
i2c_piix4 12665 0
i2c_core 38003 3 drm,i2c_piix4,drm_kms_helper
syscopyarea 12350 1 vboxvideo
sysfillrect 12522 1 vboxvideo
sysimgblt 12351 1 vboxvideo
ac 12715 0
battery 13356 0
parport_pc 22422 0
parport 31812 2 ppdev,parport_pc
button 12988 0
sunrpc 192012 1
loop 22596 0
ip_tables 22004 0
x_tables 19034 1 ip_tables
autofs4 27584 2
ext4 403601 15
crc16 12343 1 ext4
jbd2 71809 1 ext4
mbcache 13488 3 ext2,ext3,ext4
sd_mod 39859 26
sr_mod 21993 0
cdrom 27042 1 sr_mod
ata_generic 12490 0
hid_generic 12393 0
usbhid 40671 0
hid 90268 2 hid_generic,usbhid
ohci_pci 12808 0
ehci_pci 12472 0
ohci_hcd 30951 1 ohci_pci
ehci_hcd 40790 1 ehci_pci
crc32c_intel 21850 4
ahci 29245 16
usbcore 151644 5 ohci_hcd,ohci_pci,ehci_hcd,ehci_pci,usbhid
libahci 23158 1 ahci
usb_common 12440 1 usbcore
ata_piix 29671 0
libata 145717 4 ahci,libahci,ata_generic,ata_piix
scsi_mod 172107 5 sg,libata,mptctl,sd_mod,sr_mod
e1000 90595 0
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$
I did not realize there are so many vbox drivers.
> Which version(s) of the low-level X11 and display drivers (DRM: direct
> rendering manager) are in use?
Under 3.19.5
egrep -i "(module|vbox|drm)" /var/log/Xorg.0.log &
printed out
[ 8.651] (==) ModulePath set to "/usr/lib/xorg/modules"
[ 8.651] (II) Module ABI versions:
[ 8.652] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 8.655] (II) LoadModule: "glx"
[ 8.658] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[ 8.716] (II) Module glx: vendor="X.Org Foundation"
[ 8.716] compiled for 1.19.1, module version = 1.0.0
[ 8.716] (==) Matched vboxvideo as autoconfigured driver 0
[ 8.716] (==) Matched vboxvideo as autoconfigured driver 1
[ 8.716] (II) LoadModule: "vboxvideo"
[ 8.716] (WW) Warning, couldn't open module vboxvideo
[ 8.716] (II) UnloadModule: "vboxvideo"
[ 8.716] (II) Unloading vboxvideo
[ 8.716] (EE) Failed to load module "vboxvideo" (module does not exist, 0)
[ 8.716] (II) LoadModule: "modesetting"
[ 8.716] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[ 8.717] (II) Module modesetting: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.1, module version = 1.19.1
[ 8.717] Module class: X.Org Video Driver
[ 8.717] (II) LoadModule: "fbdev"
[ 8.717] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so
[ 8.717] (II) Module fbdev: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.0, module version = 0.4.4
[ 8.717] Module class: X.Org Video Driver
[ 8.717] (II) LoadModule: "vesa"
[ 8.717] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so
[ 8.717] (II) Module vesa: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.0, module version = 2.3.4
[ 8.717] Module class: X.Org Video Driver
[ 8.721] (II) Loading sub module "fbdevhw"
[ 8.721] (II) LoadModule: "fbdevhw"
[ 8.721] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so
[ 8.722] (II) Module fbdevhw: vendor="X.Org Foundation"
[ 8.722] compiled for 1.19.1, module version = 0.0.2
[ 8.722] (II) Loading sub module "glamoregl"
[ 8.722] (II) LoadModule: "glamoregl"
[ 8.722] (II) Loading /usr/lib/xorg/modules/libglamoregl.so
[ 8.733] (II) Module glamoregl: vendor="X.Org Foundation"
[ 8.733] compiled for 1.19.1, module version = 1.0.0
[ 8.838] EGL_MESA_drm_image required.
[ 8.839] (II) modeset(0): Monitor name: VBOX monitor
[ 8.840] (II) Loading sub module "fb"
[ 8.840] (II) LoadModule: "fb"
[ 8.840] (II) Loading /usr/lib/xorg/modules/libfb.so
[ 8.840] (II) Module fb: vendor="X.Org Foundation"
[ 8.840] compiled for 1.19.1, module version = 1.0.0
[ 8.840] (II) UnloadModule: "fbdev"
[ 8.840] (II) UnloadSubModule: "fbdevhw"
[ 8.840] (II) UnloadModule: "vesa"
[ 8.916] (II) LoadModule: "libinput"
[ 8.916] (II) Loading /usr/lib/xorg/modules/input/libinput_drv.so
[ 8.919] (II) Module libinput: vendor="X.Org Foundation"
[ 8.919] compiled for 1.19.0, module version = 0.23.0
[ 8.919] Module class: X.Org XInput Driver
I am a little surprised, but right now I may be using the glx driver,
given that the "vboxvideo" module does not seem to be loaded and other
well-known modules get unloaded. Yes, I found out that glxinfo printed out
rows of output including the following lines, and glxgears seems to run
fine. I should have known.
Re: glx:
glxinfo | grep -i1 vmware
Extended renderer info (GLX_MESA_query_renderer):
Vendor: VMware, Inc. (0xffffffff)
Device: llvmpipe (LLVM 3.9, 256 bits) (0xffffffff)
--
Max GLES[23] profile version: 3.0
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (L
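To double-check which modules stayed loaded according to the Xorg.0.log
excerpt above, the LoadModule and Unload(Sub)Module lines can be replayed
in order. A small Python sketch (the function name is arbitrary, and it
only looks at these three kinds of log lines):

```python
import re

# Matches e.g.:  [ 8.716] (II) LoadModule: "vboxvideo"
EVENT = re.compile(
    r'\((?:II|WW|EE)\) (LoadModule|UnloadModule|UnloadSubModule): "([^"]+)"'
)


def modules_still_loaded(log_text):
    """Replay load/unload events and return modules that were never unloaded."""
    loaded = []
    for line in log_text.splitlines():
        match = EVENT.search(line)
        if not match:
            continue
        action, name = match.groups()
        if action == "LoadModule":
            if name not in loaded:
                loaded.append(name)
        elif name in loaded:
            # UnloadModule or UnloadSubModule
            loaded.remove(name)
    return loaded
```

Fed the excerpt above, this leaves glx, modesetting, glamoregl, fb and
libinput, which would suggest that the generic modesetting driver is the
X video driver in use, with llvmpipe doing the OpenGL rendering in
software.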
I will collect info on the 4.9.0-1 kernel (this is the latest test kernel,
where I could not run the thunderbird test suite since something dies
during execution).
It may take a little time to gather the data. (Since
compiling/testing thunderbird requires resources, I have only one
instance of the VM running on the PC. So I really have to reboot this VM
to switch the kernel to obtain data.)
I wish someone with 64GB memory could retry and reproduce the issue in
their VirtualBox images on their hardware :-)
It would be very instructive to compare the mmap usage, etc. under
different kernel revisions side by side (!)
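A side-by-side comparison of that kind could be sketched in Python as
follows, assuming the contents of /proc/<pid>/maps have been saved as text
under each kernel. The function names and the "[anon]" label for unnamed
mappings are arbitrary choices for this sketch.

```python
def region_starts(maps_text):
    """Map each pathname in /proc/<pid>/maps text to its lowest start address."""
    starts = {}
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a maps line (e.g. a comment or blank line)
        name = fields[5].strip() if len(fields) == 6 else "[anon]"
        start = int(fields[0].split("-")[0], 16)
        starts[name] = min(start, starts.get(name, start))
    return starts


def compare_layout(old_text, new_text):
    """Return (name, old_start, new_start) rows; None marks a missing region."""
    old = region_starts(old_text)
    new = region_starts(new_text)
    return [(name, old.get(name), new.get(name)) for name in sorted(set(old) | set(new))]
```

Rows where the two start addresses differ, or where one side is None,
would be the first places to look for a layout change between kernels.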
TIA
PS: Just in case the HOST CPU/OS may have something to do with the issues:
OS: Windows 10 Pro
CPU: Intel Xeon CPU E3-1240 V2
Graphics: Radeon 7700
But I am sure that VirtualBox has shielded the bare metal rather well.
Windows version of VirtualBox : 5.1.14 r112924 (Qt5.6.2)
> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users