From: Nicholas N. <nj...@ca...> - 2003-10-23 05:34:57
|
On Tue, 21 Oct 2003, Ayodele Thomas wrote:
> I have seen the functions to print UCode instructions, but are there also
> functions to print the x86 instructions that can be used from a skin?
Not as such, unfortunately. If you use --trace-codegen=10001 you will
see the initial x86 and final x86 instructions printed, though.
N
|
|
From: Jeremy F. <je...@go...> - 2003-10-22 21:45:43
|
On Wed, 2003-10-22 at 09:01, Jeremy Fitzhardinge wrote:
> It is, and I've made that change to fix problems with SuSE's kernel - it
> doesn't kill the other threads on exec, so you end up with proxy LWPs
> hanging around marooned in old address spaces. The patch is to replace
> the "tid" with "VG_INVALID_THREADID" in the
> VG_(nuke_all_threads_except)() line.
Hm, this isn't actually safe, but neither was the old code. It makes a
real mess of the process/valgrind state, which is OK if the execve
works. But if it fails, we basically cannot continue, since we've
effectively killed all the threads anyway.
As a workaround, I've changed PRE(execve) to do some preflight tests to
see if the execve can possibly work, and fail out early if not. The
execve may still fail, but the only thing we can do is panic.
J
|
|
From: Jeremy F. <je...@go...> - 2003-10-22 16:12:39
|
On Wed, 2003-10-22 at 04:25, Tom Hughes wrote:
> My money is on the culprit being de_thread() in fs/exec.c in the
> kernel, which does all sorts of mucking about to kill the extra
> threads and unshare the signal handling apparatus. Either the signal
> is pending and is discarded when a new clean signal handling structure
> is created for the thread doing the exec() or the signal is being
> delivered but is only killing the other threads and not the one doing
> the exec().
Yes, that code is all very tricky. Valgrind has triggered a number of
bugs in the 2.6 implementation.
> Either way, patching valgrind to explicitly kill the proxy thread in
> the pre handler for execve seems to work around the problem... Whether
> this is a safe thing to do isn't totally clear to me though.
It is, and I've made that change to fix problems with SuSE's kernel - it
doesn't kill the other threads on exec, so you end up with proxy LWPs
hanging around marooned in old address spaces. The patch is to replace
the "tid" with "VG_INVALID_THREADID" in the
VG_(nuke_all_threads_except)() line.
J
|
|
From: Tom H. <th...@cy...> - 2003-10-22 13:12:46
|
In message <1066783120.4409.10.camel@localhost.localdomain>
Jeremy Fitzhardinge <je...@go...> wrote:
> On Tue, 2003-10-21 at 16:08, Tom Hughes wrote:
>
>> Specifically the wait4 system call exits with ERESTARTSYS but isn't
>> restarted and the waitpid library call returns 114 which is the number of
>> the wait4 system call, as shown here:
>
> Gak. I tested all this, honest. Try this (manually hacked up, so the
> lines will be all off) patch.
That seems to fix my test case, yes.
Naturally enough that meant that it didn't fix my original problem in
our software though ;-)
Several hours chasing around and poring over linux kernel source have
led me to a sort of understanding about my actual problem and a way to
work around it in valgrind.
My original thesis that the SIGKILL wasn't making it to the child is
in fact correct, as can be seen by the fact that when the parent hangs
waiting for the child the child is still running quite happily.
What seems to happen is that there is some sort of race condition such
that if the SIGKILL is sent by the parent after the child process has
called exec() but before the new executable actually starts running
then the signal can disappear. This only seems to happen if the
process that calls exec() has more than one thread, as is the case now
that valgrind has a proxy thread running.
My money is on the culprit being de_thread() in fs/exec.c in the
kernel, which does all sorts of mucking about to kill the extra
threads and unshare the signal handling apparatus. Either the signal
is pending and is discarded when a new clean signal handling structure
is created for the thread doing the exec() or the signal is being
delivered but is only killing the other threads and not the one doing
the exec().
Either way, patching valgrind to explicitly kill the proxy thread in
the pre handler for execve seems to work around the problem... Whether
this is a safe thing to do isn't totally clear to me though.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Tom H. <th...@cy...> - 2003-10-22 07:58:20
|
In message <106...@ix...>
Jeremy Fitzhardinge <je...@go...> wrote:
> On Tue, 2003-10-21 at 04:57, Tom Hughes wrote:
>
> > So the -512 is actually ERESTARTSYS but that shouldn't be making it
> > back to my program - either the system call should be restarted for me
> > or I should get EINTR back.
> >
> > Bizarrely if I install a signal handler for SIGCHLD then although
> > valgrind still shows ERESTARTSYS as the result of the wait4 system
> > call the waitpid library routine gives me a zero result, without it
> > having restarted the system call.
> >
> > Without the signal handler I get ERESTARTSYS back from the waitpid
> > library routine.
>
> Hm, yes, you shouldn't be seeing that. Can you send me your code (or
> ideally, carve out a minimal piece which shows the problem)?
My test program was attached to one of the earlier messages, but I'm
attaching it again here because they've been turning up rather out of
order...
> Which kernel are you using again?
RedHat 9, so 2.4.20 with various extra patches.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Robert W. <rj...@du...> - 2003-10-22 06:04:14
|
> I would like to use client requests to allow the application to
> communicate dynamically with the core. Is this possible?
>
> For instance, if a client is called from within a loop, I would like it to
> be executed multiple times rather than the once it appears to execute now.
This sounds wrong. It should execute multiple times like any other piece
of code. Can you isolate the loop into a small test case?
Regards,
Robert.
--
Robert Walsh
Amalgamated Durables, Inc. - "We don't make the things you buy."
Email: rj...@du...
|
|
From: Ayodele T. <em...@st...> - 2003-10-22 05:54:01
|
I have seen the functions to print UCode instructions, but are there also
functions to print the x86 instructions that can be used from a skin?
Ayo
--
---------------------------------------------------------------
Ayodele Bolaji Thomas "Joy in the Morning"
Ph.D. Candidate
Electrical Engineering
Computer Systems Laboratory
Stanford University
Ayo...@st...
---------------------------------------------------------------
|
|
From: Ayodele T. <em...@st...> - 2003-10-22 04:58:51
|
I would like to use client requests to allow the application to
communicate dynamically with the core. Is this possible?
For instance, if a client is called from within a loop, I would like it to
be executed multiple times rather than the once it appears to execute now.
Ayo
--
---------------------------------------------------------------
Ayodele Bolaji Thomas "Joy in the Morning"
Ph.D. Candidate
Electrical Engineering
Computer Systems Laboratory
Stanford University
Ayo...@st...
Support our troops. Support our country. Pray for Peace. \o/
"We succeed, not because of Affirmative Action, but in spite of the
conditions that make it necessary" (ABE '98)
"War may sometimes be a necessary evil. But no matter how necessary,
it is always an evil, never a good. We will not learn to live together
in peace by killing each other's children." Jimmy Carter '02
---------------------------------------------------------------
|
|
From: Jeremy F. <je...@go...> - 2003-10-22 01:49:35
|
On Tue, 2003-10-21 at 16:08, Tom Hughes wrote:
> In message <1066774658.4409.2.camel@localhost.localdomain>
> Jeremy Fitzhardinge <je...@go...> wrote:
>
> > OK, there seems to be some slight timing difference between 2.4-RH and
> > 2.6, or something. Anyway, I think this is the right fix - can you test
> > it out?
>
> I'm running on a dual processor box which may change things given
> that there are at least three threads of control running here.
No, it's all nicely reproducible on my UP laptop, so I think it's just
something to do with the way the kernels work. I'm not really sure why
I didn't see this under 2.6.
> That patch certainly improves things - it fixes the case where there is
> no handler installed for the signal. The modified test program that is
> attached to this message, and which installs a SIGCHLD handler, exhibits
> a different problem however.
>
> Specifically the wait4 system call exits with ERESTARTSYS but isn't
> restarted and the waitpid library call returns 114 which is the number of
> the wait4 system call, as shown here:
Gak. I tested all this, honest. Try this (manually hacked up, so the
lines will be all off) patch.
Index: vg_proxylwp.c
===================================================================
RCS file: /cvsroot/valgrind/valgrind/coregrind/vg_proxylwp.c,v
retrieving revision 1.1
diff -c -r1.1 vg_proxylwp.c
*** vg_proxylwp.c 13 Oct 2003 22:26:54 -0000 1.1
--- vg_proxylwp.c 22 Oct 2003 00:34:49 -0000
***************
*** 714,720 ****
        case PX_RunSyscall:
           /* Run a syscall for our thread; results will be poked
              back into tst */
!          reply.syscallno = tst->m_eax;
           vg_assert(px->state == PXS_WaitReq ||
                     px->state == PXS_SigACK);
--- 716,722 ----
        case PX_RunSyscall:
           /* Run a syscall for our thread; results will be poked
              back into tst */
!          reply.syscallno = tst->syscallno;
           vg_assert(px->state == PXS_WaitReq ||
                     px->state == PXS_SigACK);
***************
*** 728,734 ****
                           reply.syscallno);
              tst->m_eax = -VKI_ERESTARTSYS;
           } else {
!             Int syscallno = tst->m_eax;
              px->state = PXS_RunSyscall;
              /* If we're interrupted before we get to the syscall
--- 730,736 ----
                           reply.syscallno);
              tst->m_eax = -VKI_ERESTARTSYS;
           } else {
!             Int syscallno = tst->syscallno;
              px->state = PXS_RunSyscall;
              /* If we're interrupted before we get to the syscall
***************
*** 1264,1269 ****
--- 1271,1279 ----
     req.request = PX_RunSyscall;
+    tst->syscallno = tst->m_eax;
+    tst->m_eax = -VKI_ERESTARTSYS;
+
     /* clear the results pipe before we try to write to a proxy to
        prevent a deadlock */
     VG_(proxy_results)();
|
|
From: Jeremy F. <je...@go...> - 2003-10-22 00:26:26
|
On Tue, 2003-10-21 at 04:04, Tom Hughes wrote:
> Unfortunately this is now hanging because the pre-syscall action for
> kill looks like this:
>
> PRE(kill)
> {
>    /* int kill(pid_t pid, int sig); */
>    MAYBE_PRINTF("kill ( %d, %d )\n", arg1,arg2);
>    if (arg2 == VKI_SIGVGINT || arg2 == VKI_SIGVGKILL)
>       res = -VKI_EINVAL;
> }
>
> Because this suppresses SIGKILL the waitpid then hangs...
No, it suppresses SIG*VG*INT and SIG*VG*KILL, which are magic
internal signals I'm using (32 and 33).
I'm still trying to reproduce your ERESTARTSYS problem.
J
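To make the distinction concrete, here is a minimal stand-alone model of the PRE(kill) check quoted above. It is only a sketch: the names `SIGVGINT_ASSUMED`, `SIGVGKILL_ASSUMED` and `pre_kill_check` are hypothetical, and the mapping of the two internal signals to 32 and 33 (in that order) is an assumption based on the numbers given in this message.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

/* Assumed values: the message only says the internal signals are 32 and 33;
   which name maps to which number is a guess made for this sketch. */
enum { SIGVGINT_ASSUMED = 32, SIGVGKILL_ASSUMED = 33 };

/* Model of the PRE(kill) check: returns -EINVAL when the client tries to
   send one of the internal signals, and 0 when the kill would be passed
   through to the kernel unchanged. */
int pre_kill_check(int sig)
{
    if (sig == SIGVGINT_ASSUMED || sig == SIGVGKILL_ASSUMED)
        return -EINVAL;
    return 0;
}
```

Under these assumptions `pre_kill_check(SIGKILL)` and `pre_kill_check(SIGINT)` both return 0, so the ordinary SIGKILL (9) and SIGINT (2) are not affected by the check.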
|
|
From: Tom H. <th...@cy...> - 2003-10-21 23:10:05
|
In message <1066774658.4409.2.camel@localhost.localdomain>
Jeremy Fitzhardinge <je...@go...> wrote:
> OK, there seems to be some slight timing difference between 2.4-RH and
> 2.6, or something. Anyway, I think this is the right fix - can you test
> it out?
I'm running on a dual processor box which may change things given
that there are at least three threads of control running here.
That patch certainly improves things - it fixes the case where there is
no handler installed for the signal. The modified test program that is
attached to this message, and which installs a SIGCHLD handler, exhibits
a different problem however.
Specifically the wait4 system call exits with ERESTARTSYS but isn't
restarted and the waitpid library call returns 114 which is the number of
the wait4 system call, as shown here:
==9444== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==9444== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==9444== Using valgrind-20030725, a program supervision framework for x86-linux.
==9444== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
--9444-- sys_wait_results: got PX_SetSigmask for TID 1
--9444-- sys_wait_results: got PX_SetSigmask for TID 1
==9444== Estimated CPU clock rate is 2005 MHz
==9444== For more details, rerun with: -v
==9444==
SYSCALL[9444,1](174) special:sigaction ( 17, 0xBFFFD420, 0x0 )
SYSCALL[9444,1]( 2):fork ()
fork: process 9444 created child 9451
SYSCALL[9444,1]( 37):kill ( 9451, 9 )
SYSCALL[9444,1](114) blocking:wait4 ( 9451, 0xBFFFD504, 0, 0x0 )
--9444-- sys_wait_results: got PX_Signal for TID 1, signal 17
--9444-- sys_wait_results: got PX_RunSyscall for TID 1: syscall 114 result -512
--9444-- sys_wait_results: got PX_SetSigmask for TID 1
SYSCALL[9444,1](197):fstat64 ( 1, 0xBFFFCD20 )
SYSCALL[9444,1](192):mmap2 ( 0x0, 4096, 3, 34, -1, 0 )
==9444== Use of uninitialised value of size 4
==9444== at 0x4027C94F: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
==9444==
==9444== Conditional jump or move depends on uninitialised value(s)
==9444== at 0x4027C957: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
==9444==
==9444== Conditional jump or move depends on uninitialised value(s)
==9444== at 0x4027C020: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
==9444==
==9444== Conditional jump or move depends on uninitialised value(s)
==9444== at 0x4027C07E: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
==9444==
==9444== Conditional jump or move depends on uninitialised value(s)
==9444== at 0x4027C0AF: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
==9444==
==9444== Conditional jump or move depends on uninitialised value(s)
==9444== at 0x4027C0E0: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
==9444==
==9444== Conditional jump or move depends on uninitialised value(s)
==9444== at 0x4027C10C: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==9444== by 0x40285311: _IO_printf (in /lib/i686/libc-2.3.2.so)
==9444== by 0x8048668: main (in /home/thh/vgtest/sigkill)
==9444== by 0x40248A06: __libc_start_main (in /lib/i686/libc-2.3.2.so)
SYSCALL[9444,1]( 4) blocking:write ( 1, 0x40230000, 40 )
Child 114 exited with status 1073831928
--9444-- sys_wait_results: got PX_RunSyscall for TID 1: syscall 4 result 40
SYSCALL[9444,1]( 91):munmap ( 0x40230000, 4096 )
--9444-- Caught __NR_exit; running __libc_freeres()
SYSCALL[9444,1]( 91):munmap ( 0x0, 0 )
--9444-- __libc_freeres() done; really quitting!
==9444==
==9444== ERROR SUMMARY: 25 errors from 7 contexts (suppressed: 0 from 0)
==9444== malloc/free: in use at exit: 0 bytes in 0 blocks.
==9444== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
==9444== For a detailed leak analysis, rerun with: --leak-check=yes
==9444== For counts of detected errors, rerun with: -v
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Tom H. <th...@cy...> - 2003-10-21 23:07:00
|
In message <1066768769.4354.2.camel@localhost.localdomain>
Jeremy Fitzhardinge <je...@go...> wrote:
> On Tue, 2003-10-21 at 04:04, Tom Hughes wrote:
> > Unfortunately this is now hanging because the pre-syscall action for
> > kill looks like this:
> >
> > PRE(kill)
> > {
> >    /* int kill(pid_t pid, int sig); */
> >    MAYBE_PRINTF("kill ( %d, %d )\n", arg1,arg2);
> >    if (arg2 == VKI_SIGVGINT || arg2 == VKI_SIGVGKILL)
> >       res = -VKI_EINVAL;
> > }
> >
> > Because this suppresses SIGKILL the waitpid then hangs...
>
> No, it suppresses SIG*VG*INT and SIG*VG*KILL, which are magic
> internal signals I'm using (32 and 33).
Yes I realised that shortly after I sent that which is what led to
the later posts pinpointing waitpid as the problem.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Jeremy F. <je...@go...> - 2003-10-21 22:34:35
|
On Tue, 2003-10-21 at 04:30, Tom Hughes wrote:
> My real problem seems to be that the waitpid is sometimes returning a
> nonsense error of -512 at the system call level. I've written a little
> test program, which is attached, and when it is run under valgrind
> this is what I see:
OK, there seems to be some slight timing difference between 2.4-RH and
2.6, or something. Anyway, I think this is the right fix - can you test
it out?
Index: vg_syscalls.c
===================================================================
RCS file: /cvsroot/valgrind/valgrind/coregrind/vg_syscalls.c,v
retrieving revision 1.50
diff -c -r1.50 vg_syscalls.c
*** vg_syscalls.c 19 Oct 2003 16:46:06 -0000 1.50
--- vg_syscalls.c 21 Oct 2003 22:12:12 -0000
***************
*** 4467,4472 ****
--- 4467,4482 ----
        VGP_POPCC(VgpSkinSysWrap);
     }
+    if (tst->m_eax == -VKI_ERESTARTSYS) {
+       /* Applications never expect to see this, so we should actually
+          restart the syscall (it means the signal happened before the
+          syscall made any progress, so we can safely restart it and
+          pretend the signal happened before the syscall even
+          started) */
+       tst->m_eax = tst->syscallno;
+       tst->m_eip -= 2; /* sizeof(int $0x80) */
+    }
+
     tst->status = VgTs_Runnable; /* runnable again */
     tst->syscallno = -1;
J
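The restart step in the patch above can be modelled outside Valgrind as follows. This is a rough sketch, not the Valgrind code: `ThreadModel` and `finish_syscall` are hypothetical names, and the value 512 for ERESTARTSYS is taken from the -512 result seen elsewhere in this thread. The only fields modelled are the ones the patch touches.

```c
#include <assert.h>
#include <stdint.h>

#define ERESTARTSYS_ASSUMED 512  /* matches the -512 result reported in the thread */
#define INT80_LEN 2              /* "int $0x80" encodes as two bytes (CD 80) */

/* Hypothetical cut-down thread state: just the fields the patch uses. */
typedef struct {
    int32_t  m_eax;      /* syscall number on entry, result on exit */
    uint32_t m_eip;      /* address of the instruction after int $0x80 */
    int32_t  syscallno;  /* syscall number saved before the call was made */
} ThreadModel;

/* If the syscall came back with -ERESTARTSYS, put the syscall number back
   in eax and step eip back over the int $0x80 so the call is issued again.
   Otherwise the state is returned unchanged. */
ThreadModel finish_syscall(ThreadModel t)
{
    if (t.m_eax == -ERESTARTSYS_ASSUMED) {
        t.m_eax = t.syscallno;
        t.m_eip -= INT80_LEN;
    }
    return t;
}
```

For example, a wait4 (syscall 114) that finishes with -512 at eip 0x1002 comes out with eax 114 and eip 0x1000, while a write that returned 40 is left alone.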
|
|
From: Tom H. <th...@cy...> - 2003-10-21 18:16:54
|
In message <106...@ix...>
Jeremy Fitzhardinge <je...@go...> wrote:
> This is a large, complex change. I've been testing it pretty
> extensively, but I expect there's still some bugs in there (I found one
> just before checking in). Please sync to CVS HEAD and try it out on
> your favorite programs.
I've another problem now - all attempts to send SIGINT or SIGKILL seem
to be suppressed for some reason. We have some code which does this:
kill( child, SIGKILL );
waitpid( child, &status, 0 );
Unfortunately this is now hanging because the pre-syscall action for
kill looks like this:
PRE(kill)
{
   /* int kill(pid_t pid, int sig); */
   MAYBE_PRINTF("kill ( %d, %d )\n", arg1,arg2);
   if (arg2 == VKI_SIGVGINT || arg2 == VKI_SIGVGKILL)
      res = -VKI_EINVAL;
}
Because this suppresses SIGKILL the waitpid then hangs...
Actually there is something even more bizarre than that happening
as we only actually do the waitpid if the kill succeeds, but we still
seem to be hanging in waitpid, as shown:
SYSCALL[14042,1]( 37):kill ( 14080, 9 )
SYSCALL[14042,1](114) blocking:wait4 ( 14080, 0xBFFF69F0, 0, 0x0 )
Anyway, the point is that it's highly antisocial to mysteriously
break kill for certain signals, and I can't see an obvious reason
for it.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Jeremy F. <je...@go...> - 2003-10-21 17:03:50
|
On Tue, 2003-10-21 at 04:57, Tom Hughes wrote:
> In message <yek...@au...>
> Tom Hughes <th...@cy...> wrote:
>
> > My real problem seems to be that the waitpid is sometimes returning a
> > nonsense error of -512 at the system call level. I've written a little
> > test program, which is attached, and when it is run under valgrind
> > this is what I see:
>
> So the -512 is actually ERESTARTSYS but that shouldn't be making it
> back to my program - either the system call should be restarted for me
> or I should get EINTR back.
>
> Bizarrely if I install a signal handler for SIGCHLD then although
> valgrind still shows ERESTARTSYS as the result of the wait4 system
> call the waitpid library routine gives me a zero result, without it
> having restarted the system call.
>
> Without the signal handler I get ERESTARTSYS back from the waitpid
> library routine.
Hm, yes, you shouldn't be seeing that. Can you send me your code (or
ideally, carve out a minimal piece which shows the problem)?
Which kernel are you using again?
J
|
|
From: Tom H. <th...@cy...> - 2003-10-21 15:31:41
|
In message <yek...@au...>
Tom Hughes <th...@cy...> wrote:
> My real problem seems to be that the waitpid is sometimes returning a
> nonsense error of -512 at the system call level. I've written a little
> test program, which is attached, and when it is run under valgrind
> this is what I see:
So the -512 is actually ERESTARTSYS but that shouldn't be making it
back to my program - either the system call should be restarted for me
or I should get EINTR back.
Bizarrely if I install a signal handler for SIGCHLD then although
valgrind still shows ERESTARTSYS as the result of the wait4 system
call the waitpid library routine gives me a zero result, without it
having restarted the system call.
Without the signal handler I get ERESTARTSYS back from the waitpid
library routine.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
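The attached test program is not preserved in this archive. As a stand-in, the kill-then-waitpid pattern under discussion might look roughly like the sketch below. It is not Tom's program, and `kill_and_reap` is a hypothetical name. It shows what a caller is entitled to expect: waitpid either succeeds or fails with EINTR, and the kernel-internal ERESTARTSYS never reaches user code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that just sleeps, kills it with SIGKILL and reaps it.
   Returns the signal that terminated the child, or -1 on any failure. */
int kill_and_reap(void)
{
    int status = 0;
    pid_t got;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        for (;;)
            pause();            /* child: wait here until killed */
    }

    if (kill(child, SIGKILL) != 0)
        return -1;

    /* Retry on EINTR, which is the only interruption error a program
       should ever see from waitpid. */
    do {
        got = waitpid(child, &status, 0);
    } while (got < 0 && errno == EINTR);

    if (got != child || !WIFSIGNALED(status))
        return -1;
    return WTERMSIG(status);
}
```

On a correctly behaving system the function returns SIGKILL. A return of -1, or a hang inside waitpid, would correspond to the symptoms described in this thread.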
|
|
From: Jeremy F. <je...@go...> - 2003-10-20 18:20:19
|
On Mon, 2003-10-20 at 01:27, Nicholas Nethercote wrote:
> On Tue, 14 Oct 2003, Jeremy Fitzhardinge wrote:
>
> > This is a large, complex change. I've been testing it pretty
> > extensively, but I expect there's still some bugs in there (I found one
> > just before checking in). Please sync to CVS HEAD and try it out on
> > your favorite programs.
>
> --trace-malloc=yes seems to be broken. Running on 'date', I get hundreds
> of these messages:
>
> ==12538== Warning: bad use of file descriptor 821 in syscall write()
> ==12538== Use --logfile-fd=<number> to select an alternative logfile fd.
>
> but no --trace-malloc output. Kernel 2.4.19 again, glibc-2.3.2.
Looks like a bug in --trace-malloc - its printfs seem to be being run in
the virtual machine rather than the real one... I think Robert's
VALGRIND_PRINTF patch might be in order.
J
|
|
From: Nicholas N. <nj...@ca...> - 2003-10-20 10:07:17
|
On Tue, 14 Oct 2003, Jeremy Fitzhardinge wrote:
> This is a large, complex change. I've been testing it pretty
> extensively, but I expect there's still some bugs in there (I found one
> just before checking in). Please sync to CVS HEAD and try it out on
> your favorite programs.
--trace-malloc=yes seems to be broken. Running on 'date', I get hundreds
of these messages:
==12538== Warning: bad use of file descriptor 821 in syscall write()
==12538== Use --logfile-fd=<number> to select an alternative logfile fd.
but no --trace-malloc output. Kernel 2.4.19 again, glibc-2.3.2.
N
|
|
From: Tom H. <th...@cy...> - 2003-10-19 23:29:19
|
In message <106...@ix...>
Jeremy Fitzhardinge <je...@go...> wrote:
> On Sun, 2003-10-19 at 07:38, Tom Hughes wrote:
>
> > Your commit has finally shown up in the anonymous CVS tree and it
> > all seems to be fine so far, except that valgrind fails unless you
> > have a large soft limit on file descriptors, as follows:
>
> Thanks, I was wondering when this would turn up as a problem. I guess
> I'll have to try and raise the soft limit to at least 1024.
The only problem with that is, of course, that it makes it impossible
to use valgrind on any sort of problem where the number of available
file descriptors affects the behaviour of the program.
I think that probably what needs to be done is something along the
following lines:
- Add a switch (--descriptor-limit?) to control the descriptor
limit that valgrind will try and select
- Make valgrind raise the soft limit to the descriptor limit
chosen by the user using the switch, or 1024 if no switch
- Make the safe_fd routine start allocating FDs from some offset
below the chosen soft limit rather than a fixed point
That should make everything work reasonably well by default whilst
still allowing the user to reduce the limit on the rare occasions
when it is necessary to provoke a bug.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
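The proposal above could be sketched along these lines using the standard getrlimit/setrlimit calls. This is only an illustration: `raise_soft_fd_limit` and `safe_fd_base` are hypothetical names, no `--descriptor-limit` switch exists in the code discussed here, and the reserved count of 204 used in the note below is read off the `1024 - (100*2 + 4)` assertion quoted earlier in the thread. The sketch only covers raising the limit; lowering it on request is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/resource.h>

/* Raise the soft RLIMIT_NOFILE to at least `wanted`, capped at the hard
   limit. Returns the resulting soft limit, or 0 if a call failed. */
rlim_t raise_soft_fd_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    if (rl.rlim_cur >= wanted)
        return rl.rlim_cur;               /* already high enough */

    rl.rlim_cur = (wanted < rl.rlim_max) ? wanted : rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    return rl.rlim_cur;
}

/* First descriptor number set aside for internal use: `reserved` slots
   below the soft limit instead of below a fixed 1024. Returns 0 when the
   limit is too small to reserve that many. */
rlim_t safe_fd_base(rlim_t soft_limit, rlim_t reserved)
{
    return (soft_limit > reserved) ? soft_limit - reserved : 0;
}
```

With a soft limit of 1024 and 204 reserved descriptors the base is 820, which is the bound in the failing assertion. With the default limit of 256 mentioned in this thread the base would drop to 52, so the internal descriptors would still fit.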
|
|
From: Jeremy F. <je...@go...> - 2003-10-19 16:43:59
|
On Sun, 2003-10-19 at 07:38, Tom Hughes wrote:
> In message <106...@ix...>
> Jeremy Fitzhardinge <je...@go...> wrote:
>
> > This is a large, complex change. I've been testing it pretty
> > extensively, but I expect there's still some bugs in there (I found one
> > just before checking in). Please sync to CVS HEAD and try it out on
> > your favorite programs.
>
> Your commit has finally shown up in the anonymous CVS tree and it
> all seems to be fine so far, except that valgrind fails unless you
> have a large soft limit on file descriptors, as follows:
Thanks, I was wondering when this would turn up as a problem. I guess
I'll have to try and raise the soft limit to at least 1024.
J
|
|
From: Tom H. <th...@cy...> - 2003-10-19 15:03:54
|
In message <106...@ix...>
Jeremy Fitzhardinge <je...@go...> wrote:
> This is a large, complex change. I've been testing it pretty
> extensively, but I expect there's still some bugs in there (I found one
> just before checking in). Please sync to CVS HEAD and try it out on
> your favorite programs.
Your commit has finally shown up in the anonymous CVS tree and it
all seems to be fine so far, except that valgrind fails unless you
have a large soft limit on file descriptors, as follows:
audi [~] % valgrind ls
==2035== valgrind: failed to move logfile fd into safe range
==2035== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==2035== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==2035== Using valgrind-20030725, a program supervision framework for x86-linux.
==2035== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
valgrind: vg_mylibc.c:1229 (vgPlain_safe_fd): Assertion `newfd > (1024 - (100*2 + 4))' failed.
==2035== at 0x4016E66B: ???
==2035== by 0x4016E66A: ???
==2035== by 0x4016E6C5: ???
==2035== by 0x4016E7B8: ???
Looking at the code this is because the code that sets up the proxy
threads tries to dup descriptors up to a high number but my descriptor
limit was set to 256 by default.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Tom H. <th...@cy...> - 2003-10-15 13:53:47
|
In message <5.2.1.1.0.20031015141506.01f4c4d0@pop3>
Adam Gundy <ar...@cy...> wrote:
> attached is a patch to add support for POP %FS and POP %GS. These are used
> by WINE. Code is a simple cut & paste from the other POP <seg> support.
You could have done push while you were at it Adam ;-) Attached is a
new version that does push as well as pop.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Adam G. <ar...@cy...> - 2003-10-15 13:18:21
|
attached is a patch to add support for POP %FS and POP %GS. These are used
by WINE. Code is a simple cut & paste from the other POP <seg> support.
Seeya,
Adam
--
Real Programmers don't comment their code. If it was hard to write,
it should be hard to read, and even harder to modify.
These are all my own opinions.
|
|
From: Jeremy F. <je...@go...> - 2003-10-14 17:30:11
|
Hi all,
I've checked in all my system calls rework of the last few months onto
CVS HEAD. This patch revamps the way Valgrind handles signals and
syscalls by eliminating the need to turn all blocking syscalls into
non-blocking. It does this by creating a worker task for each thread
(ProxyLWP is the term I use) which executes blocking syscalls on the
behalf of that thread. It also handles signals, since it needs to be
the same thread to get signals interrupting syscalls properly.
My intention with this change was to make Valgrind support the full
range of Posix signal behaviour. It should now handle all the strange
corners of signal handling, including the SA_SIGINFO flag, blocking
signals around a handler, and so on.
There are a couple of minor improvements which make day-to-day use
better as well:
* It now reports fatal signals rather than just quietly dying; if
the signal is a core-dumper, it also shows the stack backtrace.
* When reporting the use of a bad FD, -v also makes it show the
backtrace to the site of the bad usage (making the message
actually useful)
* Lots of other little things
My target for this patch was Linux 2.6. I've tried as hard as possible
to make Linux 2.4 work well, but there are a few deficient corners - I
suspect no real programs will encounter them (especially if they ever
worked with the previous signals/syscalls implementation). RedHat's
version of 2.4 includes a number of 2.6 changes, and so should work more
like 2.6 than 2.4.
This is a large, complex change. I've been testing it pretty
extensively, but I expect there's still some bugs in there (I found one
just before checking in). Please sync to CVS HEAD and try it out on
your favorite programs.
New command-line options:
* --signal-polltime=<time> - on 2.4, Valgrind has to poll for
pending signals to deliver them to the appropriate threads. By
default this is every 50ms.
* --lowlat-signals=no|yes - normally the thread scheduler applies
a strict round-robin policy, but if you have a thread blocking
on signals competing against lots of CPU-bound threads, it could
take a while before it gets run. This option makes the
scheduler start running the signalled thread immediately.
* --lowlat-syscalls=no|yes - this is the same as lowlat-signals,
except that it wakes the thread quickly when a syscall unblocks.
J
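For anyone wanting a quick program to try against the new signal code, something like the following exercises two of the features mentioned above: an SA_SIGINFO handler, and a signal blocked through sa_mask while the handler runs. It is a small sketch, not part of the patch, and the names `on_signal` and `deliver_with_siginfo` are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t seen_signo = 0;       /* si_signo seen by the handler */
static volatile sig_atomic_t usr2_was_blocked = 0; /* 1 if sa_mask was in effect */

/* Three-argument handler, as selected by SA_SIGINFO. */
static void on_signal(int signo, siginfo_t *info, void *context)
{
    sigset_t current;

    (void)signo;
    (void)context;
    seen_signo = info->si_signo;

    /* Query the mask in force inside the handler: SIGUSR2 was listed in
       sa_mask, so it should be blocked here. */
    if (sigprocmask(SIG_BLOCK, NULL, &current) == 0 &&
        sigismember(&current, SIGUSR2) == 1)
        usr2_was_blocked = 1;
}

/* Installs the handler for `signo`, raises the signal, restores the old
   disposition. Returns the si_signo the handler saw, or -1 on failure or
   if SIGUSR2 was not blocked during the handler. */
int deliver_with_siginfo(int signo)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR2);

    seen_signo = 0;
    usr2_was_blocked = 0;
    if (sigaction(signo, &sa, &old) != 0)
        return -1;
    if (raise(signo) != 0)
        return -1;
    sigaction(signo, &old, NULL);

    return usr2_was_blocked ? (int)seen_signo : -1;
}
```

Calling `deliver_with_siginfo(SIGUSR1)` should return SIGUSR1 both natively and under Valgrind, provided SIGUSR1 is not blocked by the caller.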
|
|
From: Josef W. <Jos...@gm...> - 2003-10-10 18:36:38
|
Hi,
in Valgrind CVS, there are still some unsupported SSE2 instructions,
namely conversion functions for 2 packed values (e.g. cvtdq2pd).
Can someone give me a hint how to write support for it?
I looked around and I am not sure if I need to introduce new uops, or
can these packed versions somehow be simulated with the already
existing SSE uops?
Josef
|
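As a reference point for the example instruction, the architectural behaviour of cvtdq2pd can be written in plain C as below. This says nothing about which uops Valgrind should use. It only pins down the result any implementation has to produce, and the names `PackedDouble2` and `cvtdq2pd_ref` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Two packed double-precision values, low element first. */
typedef struct { double lo, hi; } PackedDouble2;

/* Reference semantics of cvtdq2pd: the two signed 32-bit integers in the
   low 64 bits of the source become two doubles. Every int32 value is
   exactly representable as a double, so no rounding is involved. */
PackedDouble2 cvtdq2pd_ref(int32_t lo, int32_t hi)
{
    PackedDouble2 r;
    r.lo = (double)lo;
    r.hi = (double)hi;
    return r;
}
```

A model like this can serve as the expected value when testing whatever translation is eventually chosen.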