From: Nicholas N. <nj...@cs...> - 2006-03-08 22:43:58
|
On Wed, 8 Mar 2006 val...@ce... wrote:
> I am starting to use Valgrind's cachegrind to test my program for cache
> performance. Once I do this, how can I improve the cache performance? That
> is, how can I eliminate or reduce cache misses in my app? Are certain types
> of programming techniques more prone to cache misses? For example, nested
> loops? Certain data type declarations etc? Is there a good source of
> information on the internet?
There is no easy answer. Cachegrind points you in the right direction, but
it can't tell you directly how to fix your program. There is lots of
information about caches on the internet; google for some.
In general, try to make the crucial data structures smaller, and improve
locality of accesses to them.
Nick
|
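A minimal sketch of the locality advice above, not taken from the thread: both functions sum the same matrix, but `sum_row_major` walks memory contiguously while `sum_col_major` jumps a whole row ahead on every access. The function names and the matrix size are illustrative assumptions; how large the difference in cache misses is depends on the cache configuration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 512   /* illustrative size, not from the thread */
#define COLS 512

static int grid[ROWS][COLS];

/* Fill the matrix with a simple, predictable pattern. */
void fill_grid(void) {
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            grid[r][c] = (int)((r + c) % 7);
}

/* Good locality: the inner loop touches consecutive addresses. */
long sum_row_major(void) {
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}

/* Poor locality: each inner-loop step skips COLS * sizeof(int) bytes. */
long sum_col_major(void) {
    long total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += grid[r][c];
    return total;
}
```

Calling `fill_grid()` and then each sum from a small `main`, and running the result under `valgrind --tool=cachegrind`, is one way to compare the data-cache miss counts of the two loops.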
|
From: Ron V. I. <van...@ca...> - 2006-03-08 20:03:52
|
I don't know if this is considered a bug in Valgrind or not, but it is a
feature of Purify that Valgrind does not support. Valgrind appears to
ensure that new[] is matched with delete[], malloc with free, and so on.
However, it does not appear to catch the case where you destroy an
object as the wrong type. For example, the following code generates no
errors from Valgrind and runs to completion with the expected results:
#include <iostream>
class foo {
public:
    virtual ~foo() { std::cout << "In foo's destructor" << std::endl; }
};
class bar {
public:
    virtual ~bar() { std::cout << "In bar's destructor" << std::endl; }
};
int main()
{
    bar* b = new bar;
    foo* f = (foo*)b;
    delete f;
}
This kind of code can often lead to heap corruption on our old MIPS
systems and, I would think, the same is likely true of Xeon EM64T
systems. Can this lead to heap corruption on Linux, or was this simply a
problem in the older system?
--Ron
|
|
From: Ron V. I. <van...@ca...> - 2006-03-08 19:54:52
|
First, let me start off by saying valgrind is one of the most wonderful
pieces of open source software I have used, ranking right up there with
g++, in making my day to day work livable.
I do have one small itch relating to valgrind, and that is the fact that
you cannot use both --db-attach and --trace-children at the same time. On
my project, we currently run on larger SGI MIPS 64-processor systems and
are migrating to multiprocessor Xeon systems. On MIPS, we use Purify to
debug our simulation software, which often requires 40+ processors to run.
Purify supports what they call Just-In-Time (JIT) Debugging, which pops up
a dbx session in another window whenever a bug is encountered. One can
also have this occur only on corrupting errors, and not on non-corrupting
errors such as uninitialized memory reads. The JIT feature of Purify is
very useful since our large simulation often has memory issues that are
much easier to debug if a debugger is provided at the moment the error
occurs.
I searched the mailing list archives and saw this question come up in
August of 2005, and Tom Hughes supposed that the problem with db-attach
and multiple children is a focus issue with the terminal. Purify solved
that by simply popping open a new xterm with dbx for each child that
encountered a problem. This could, of course, be done using ddd or some
other GUI debugger as well. This certainly solves the focus issue and, if
that is the only problem, it would be nice to have this fixed.
With a bit of direction, I could try to implement this myself, but I have
not even begun digging around in the source code yet...
--Ron
|
|
From: Tom S. <to...@pl...> - 2006-03-08 19:09:14
|
When running valgrind-3.1.0 (Debian) on one specific app, I immediately
get:
valgrind: mmap(0x804A000, 39997440) failed in UME.
But I also know why! Here is how to reproduce the problem:
#include <stdio.h>
int main(int argc, char * argv[]) {
    static int idata[10000000];
    printf("Hello world!");
    return(0);
}
If I get rid of the huge static array, the program runs fine under
valgrind.
--
Tom Schutter (mailto:to...@pl...)
Platte River Associates, Inc. (http://www.platte.com)
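One possible workaround, offered as an assumption and not something confirmed in this thread, is to allocate the large buffer at run time so that it no longer sits in the executable's statically mapped data. The sketch below keeps the element count from the reproducer; `alloc_idata` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>

#define N_ITEMS 10000000  /* same element count as the static array above */

/* Allocate the zero-initialised buffer on the heap instead of reserving
   it as a static array. Returns NULL if the allocation fails; the caller
   is responsible for calling free() on the result. */
int *alloc_idata(void) {
    return calloc(N_ITEMS, sizeof(int));
}
```

In `main`, the line `static int idata[10000000];` would become `int *idata = alloc_idata();` followed by a NULL check, with `free(idata)` before returning.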
|
|
From: <val...@ce...> - 2006-03-08 17:33:37
|
I am starting to use Valgrind's cachegrind to test my program for cache
performance. Once I do this, how can I improve the cache performance? That
is, how can I eliminate or reduce cache misses in my app? Are certain
types of programming techniques more prone to cache misses? For example,
nested loops? Certain data type declarations etc? Is there a good source
of information on the internet?
Thanks,
CB
|
|
From: Ashley P. <as...@qu...> - 2006-03-08 10:33:15
|
On Wed, 2006-03-08 at 12:42 +1100, Dave Airlie wrote:
> >
> > Nope. You have the address -- just dereference it!
> >
>
> I knew I was missing something simple, forgot that data was already
> written to the address at that time..
>
> I'd really like to know though what the system was going to write
> before it wrote it as well, as some graphics chips may have write-only
> registers, where readback gives a totally different answer... in that
> case do I need to do that hard work?
This tool is exactly the same as one I was discussing on Monday; we do
something very similar (with a network card rather than a GPU) and have
the problem you describe here: the mmaped memory is write-only and
gives back garbage when read.
I'd be very interested in seeing your tool when it's up and running.
Ashley,
|
|
From: Tom H. <to...@co...> - 2006-03-08 07:13:36
|
In message <200...@we...>
Beorn Johnson <beo...@ya...> wrote:
> Although I didn't get any response the first time I
> inquired (several weeks ago), I thought I would try
> again (original question quoted below).
>
> I am merely looking for some hints/guidance as
> to why the implementation of sys_tkill is sufficiently
> complicated that it's commented out, and maybe even
> what might have to be done to get something that
> works.
I fixed up the tkill wrapper a few weeks ago - so this should
work in the next release.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: John R.
|
Beorn Johnson wrote:
>>So, it seems there are some subtleties here that aren't obvious.
> I am merely looking for some hints/guidance as
> to why the implementation of sys_tkill is sufficiently
> complicated that it's commented out, and maybe even
> what might have to be done to get something that
> works.
Valgrind's internal model of threads is [was] not isomorphic to
the model used by the Linux kernel+glibc. There are control,
scheduling, and "multiplexing" issues. Also, there is the problem
that Linux has [had] two models: (old) linuxthreads and (new) nptl.
A different product, such as Insure++, may work sooner.
--
|
|
From: Beorn J. <beo...@ya...> - 2006-03-08 04:42:13
|
Although I didn't get any response the first time I
inquired (several weeks ago), I thought I would try
again (original question quoted below).
I am merely looking for some hints/guidance as
to why the implementation of sys_tkill is sufficiently
complicated that it's commented out, and maybe even
what might have to be done to get something that
works.
Thank you for any response,
Beorn Johnson
--- Beorn Johnson <beo...@ya...> wrote:
> Date: Sat, 4 Feb 2006 21:09:48 -0800 (PST)
> From: Beorn Johnson <beo...@ya...>
> Subject: sys_tkill / pthread_kill
> To: val...@li...
>
> Hello,
>
> Recently someone added some code to a project which uses
> 'pthread_kill', which in turn uses the 'tkill' system call.
>
> This resulted in an error message pointing to
> "README_MISSING_SYSCALL_OR_IOCTL", which I read, and as I dug into the
> matter further, I found in
> valgrind-3.1.0/coregrind/m_syswrap/syswrap-linux.c (including the '//zz's):
>
> //zz PRE(sys_tkill, Special)
> //zz {
> //zz /* int tkill(pid_t tid, int sig); */
> //zz PRINT("sys_tkill ( %d, %d )", ARG1,ARG2);
> //zz PRE_REG_READ2(long, "tkill", int, tid, int, sig);
> //zz if (!ML_(client_signal_OK)(ARG2)) {
> //zz SET_STATUS_( -VKI_EINVAL );
> //zz return;
> //zz }
> //zz
> //zz /* If we're sending SIGKILL, check to see if the target is one of
> //zz our threads and handle it specially. */
> //zz if (ARG2 == VKI_SIGKILL && ML_(do_sigkill)(ARG1, -1))
> //zz SET_STATUS_(0);
> //zz else
> //zz SET_STATUS_(VG_(do_syscall2)(SYSNO, ARG1, ARG2));
> //zz
> //zz if (VG_(clo_trace_signals))
> //zz VG_(message)(Vg_DebugMsg, "tkill: sent signal %d to pid %d",
> //zz ARG2, ARG1);
> //zz // Check to see if this kill gave us a pending signal
> //zz XXX FIXME VG_(poll_signals)(tid);
> //zz }
>
>
> So, it seems there are some subtleties here that aren't obvious.
> The same thing appears in the svn version.
>
> Anyone have any suggestions as to how to proceed?
>
> Thanks,
>
> Beorn Johnson
|
|
From: Nicholas N. <nj...@cs...> - 2006-03-08 03:52:04
|
On Wed, 8 Mar 2006, Dave Airlie wrote:
> I'd really like to know though what the system was going to write
> before it wrote it as well, as some graphics chips may have write-only
> registers, where readback gives a totally different answer... in that
> case do I need to do that hard work?
In that case you need to pass in the to-be-stored value to trace_store().
Imagine a store, something like:
STORE (t1), t2
which stores the value in register t2 in the address pointed to by t1.
Currently you're passing in t1 to trace_store(). You want to also pass in
t2. You are already almost doing it -- you have three args in the
instrumentation code, but trace_store() only takes two. I don't think
data_expr is what you want, but you're close. Memcheck and/or Cachegrind
must do similar things in places.
Nick
|
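A plain-C sketch of the handler shape described above, with the to-be-stored value passed as a third argument. It deliberately avoids the Valgrind tool API: `uintptr_t`, `size_t` and `uint64_t` stand in for `Addr`, `SizeT` and `ULong`, and the `region` struct and the `trace_store3` name are hypothetical. How the value would be widened to 64 bits on the instrumentation side is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the mmt_gpu_reg bookkeeping struct. */
struct region {
    uintptr_t addr;
    size_t    len;
};
static struct region gpu_reg;

/* Log a store together with the value about to be written.
   Returns 1 if the address falls inside the traced region, else 0. */
int trace_store3(uintptr_t addr, size_t size, uint64_t data) {
    if (addr >= gpu_reg.addr && addr - gpu_reg.addr < gpu_reg.len) {
        printf("store: +0x%lx, %lu bytes, value 0x%llx\n",
               (unsigned long)(addr - gpu_reg.addr),
               (unsigned long)size,
               (unsigned long long)data);
        return 1;
    }
    return 0;
}
```

The range test here uses `>=` on the lower bound so that a store to the very first byte of the region is reported too. Because the value arrives as an argument, nothing is read back from the mapped memory, which is the point for write-only registers.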
|
From: Dave A. <ai...@gm...> - 2006-03-08 01:42:22
|
>
> Nope. You have the address -- just dereference it!
>
I knew I was missing something simple, forgot that data was already
written to the address at that time..
I'd really like to know though what the system was going to write
before it wrote it as well, as some graphics chips may have write-only
registers, where readback gives a totally different answer... in that
case do I need to do that hard work?
Dave.
|
|
From: Olly B. <ol...@su...> - 2006-03-08 01:40:01
|
On 2006-03-08, Nicholas Nethercote <nj...@cs...> wrote:
> ULong data;
> switch (size) {
> case 1: data = *(UChar*)addr;
> case 2: data = *(UShort*)addr;
> case 4: data = *(UInt*)addr;
> case 8: data = *(ULong*)addr;
> default: /*barf*/
> }
>
> I think that'll work.
Not without the addition of "break;" after each case!
Cheers,
Olly
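With the missing `break` statements added, the size dispatch might look like the sketch below. It is written as standalone C so that it compiles outside Valgrind: the `<stdint.h>` types stand in for `UChar`, `UShort`, `UInt` and `ULong`, `read_stored_value` is a hypothetical name, and `memcpy` is used in place of the pointer casts to sidestep alignment concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read `size` bytes at `addr` and widen them to 64 bits.
   Returns 0 on success, -1 for an unsupported size. */
int read_stored_value(const void *addr, size_t size, uint64_t *out) {
    switch (size) {
    case 1: { uint8_t  v; memcpy(&v, addr, sizeof v); *out = v; break; }
    case 2: { uint16_t v; memcpy(&v, addr, sizeof v); *out = v; break; }
    case 4: { uint32_t v; memcpy(&v, addr, sizeof v); *out = v; break; }
    case 8: { uint64_t v; memcpy(&v, addr, sizeof v); *out = v; break; }
    default:
        return -1;  /* the "barf" case: size not handled */
    }
    return 0;
}
```

Without the `break` after each case, a 1-byte store would fall through every wider read and end in the default branch, which is the problem pointed out above.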
|
|
From: Nicholas N. <nj...@cs...> - 2006-03-08 01:36:47
|
On Wed, 8 Mar 2006, Dave Airlie wrote:
> I wrote a trivial store interceptor copying from lackey, and I've set
> it up to intercept mmap (changed new_mem_mmap to see the offset) and
> if I like the offset I want to dump info in that area.
>
> This is fine and I can tell when the app writes to the address, and
> what address it is and how much, however I'm not really sure how I can
> actually extract the data the app is writing.
>
> Do I need to do something along the lines of memcheck with all the
> shadowing stuff in order to get what data is being written into the
> area in my handler function?
Nope. You have the address -- just dereference it!
> static VG_REGPARM(3) void trace_store(Addr addr, SizeT size)
> {
> if ((addr > mmt_gpu_reg.addr) && (addr < mmt_gpu_reg.addr+mmt_gpu_reg.len))
> VG_(printf)("store: %p, %d\n", addr-mmt_gpu_reg.addr, size);
> }
Here you need something like this (warning, untested code):
ULong data;
switch (size) {
case 1: data = *(UChar*)addr;
case 2: data = *(UShort*)addr;
case 4: data = *(UInt*)addr;
case 8: data = *(ULong*)addr;
default: /*barf*/
}
I think that'll work.
Nick
|
|
From: Dave A. <ai...@gm...> - 2006-03-08 01:26:43
|
I want to trace all writes to an mmaped piece of RAM (a GPU) using valgrind,
I wrote a trivial store interceptor copying from lackey, and I've set
it up to intercept mmap (changed new_mem_mmap to see the offset) and
if I like the offset I want to dump info in that area.
This is fine and I can tell when the app writes to the address, and
what address it is and how much, however I'm not really sure how I can
actually extract the data the app is writing.
Do I need to do something along the lines of memcheck with all the
shadowing stuff in order to get what data is being written into the
area in my handler function?
Some of my code is below (unworking.. can't push non-32 or 64-bit words)
Dave.
static struct mmt_memmap {
    Addr addr;
    SizeT len;
} mmt_gpu_reg;
static VG_REGPARM(3) void trace_store(Addr addr, SizeT size)
{
    if ((addr > mmt_gpu_reg.addr) && (addr < mmt_gpu_reg.addr+mmt_gpu_reg.len))
        VG_(printf)("store: %p, %d\n", addr-mmt_gpu_reg.addr, size);
}
static void mmt_post_clo_init(void)
{
}
static
IRBB* mmt_instrument ( VgCallbackClosure* closure,
                       IRBB* bb_in,
                       VexGuestLayout* layout,
                       VexGuestExtents* vge,
                       IRType gWordTy, IRType hWordTy )
{
    IRBB* bb;
    IRStmt* st;
    IRDirty* di;
    int i;
    IRExpr** argv;
    IRExpr* addr_expr;
    IRExpr* size_expr;
    IRExpr* data_expr;
    IRType arg_ty;
    if (gWordTy != hWordTy) {
        /* We don't currently support this case. */
        VG_(tool_panic)("host/guest word size mismatch");
    }
    /* Set up BB */
    bb = emptyIRBB();
    bb->tyenv = dopyIRTypeEnv(bb_in->tyenv);
    bb->next = dopyIRExpr(bb_in->next);
    bb->jumpkind = bb_in->jumpkind;
    // Copy verbatim any IR preamble preceding the first IMark
    for (i = 0; i < bb_in->stmts_used; i++) {
        st = bb_in->stmts[i];
        tl_assert(st);
        switch(st->tag) {
        case Ist_Store:
            arg_ty = typeOfIRExpr(bb->tyenv, st->Ist.Store.data);
            addr_expr = st->Ist.Store.addr;
            size_expr = mkIRExpr_HWord(sizeofIRType(arg_ty));
            data_expr = st->Ist.Store.data;
            switch(arg_ty)
            {
            }
            switch(data_expr->tag) {
            case Iex_Tmp:
            case Iex_Const:
                argv = mkIRExprVec_3( addr_expr, size_expr, data_expr );
                di = unsafeIRDirty_0_N( /*regparms*/2,
                                        "trace_store",
                                        VG_(fnptr_to_fnentry)( trace_store ),
                                        argv );
                addStmtToIRBB( bb, IRStmt_Dirty(di) );
                break;
            default:
                break;
            }
        default:
            break;
        }
        addStmtToIRBB( bb, st );
    }
    return bb;
}
|
|
From: Julian S. <js...@ac...> - 2006-03-07 21:15:12
|
> When the first (have_fp) SIGILL is generated, the OS (or glibc/gcc or
> other) blocks subsequent SIGILLs while the registered handler
> function (handler_sigill) is running, with the intent of unblocking
> signals on return from the function. In the testcase (and m_machine.c)
> the handler never returns, but longjmps out. This has the effect of
> leaving signals blocked.
Yes. I eventually discovered this too. It is fixed in svn rev 5662
(for the trunk) and 5703 (3.1 branch).
> My fix was to unmask signals after return from longjmp, but you could
> also set up the handler to not block signals using SA_NODEFER.
5662/5703 use the SA_NODEFER solution. It would be good if you could
check out and test the 3.1 branch and/or the trunk (preferably both)
to check they work for you. It's easy:
svn co svn://svn.valgrind.org/valgrind/trunk
(for the trunk) or
svn co svn://svn.valgrind.org/valgrind/branches/VALGRIND_3_1_BRANCH
then cd into the directory you get
./autogen.sh
then configure/build in the normal way.
J
|
|
From: Scott M. <ss...@us...> - 2006-03-07 21:04:47
|
I'm attempting to respond (long after the fact) to a mail on this list
with the given subject
(http://sourceforge.net/mailarchive/message.php?msg_id=14689132).
I'm pretty sure that there is a generic logic bug that is causing the
given failure. Attached is a simple test program that emulates the
behavior of m_machine.c in the floating point and vmx detection.
When the first (have_fp) SIGILL is generated, the OS (or glibc/gcc or
other) blocks subsequent SIGILLs while the registered handler function
(handler_sigill) is running, with the intent of unblocking signals on
return from the function. In the testcase (and m_machine.c) the handler
never returns, but longjmps out. This has the effect of leaving signals
blocked.
The next signal is generated (have_vmx), but is not delivered because it
is blocked. Then, when the default action (saved_act) and mask
(saved_set) are restored, the signal is delivered and the default signal
handler kills the app.
The failure only occurs when the system does not have an FPU *and* does
not have vmx. My guess as to why others haven't seen this problem is
that most people with ppcnf systems must be running with kernel FPU
emulation turned on.
The testcase uses setjmp and longjmp, but their __builtin_ siblings
behave the same way.
The test case's usage is:
Usage: sigill-test <enable_workaround> have_fp have_vmx
each argument takes 1 or 0 and defaults to 0
The only time a SIGILL kills the program is with args 0 0 0 (the
defaults). That is, it only fails when you do not enable the fix and do
not have an FPU or vmx.
My fix was to unmask signals after return from longjmp, but you could
also set up the handler to not block signals using SA_NODEFER.
Scott
|
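The SA_NODEFER approach can be sketched in standalone C as below. This is not the attached test case and not the code from m_machine.c: it uses `raise(SIGUSR1)` as a stand-in for an instruction that traps with SIGILL, and `probe`, `on_signal` and `count_caught_probes` are hypothetical names. Because the handler is installed with SA_NODEFER, the signal is not added to the mask while the handler runs, so leaving the handler through `longjmp` does not leave it blocked for the second probe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static jmp_buf env;
static volatile sig_atomic_t hits = 0;

/* The handler never returns normally: it jumps back into probe(). */
static void on_signal(int sig) {
    (void)sig;
    hits++;
    longjmp(env, 1);
}

/* Returns 1 if the raised signal reached the handler, 0 otherwise. */
static int probe(void) {
    if (setjmp(env) == 0) {
        raise(SIGUSR1);   /* stand-in for the faulting instruction */
        return 0;         /* only reached if the signal was not delivered */
    }
    return 1;             /* arrived here via longjmp from the handler */
}

/* Run two probes back to back, as the fp and vmx checks do. */
int count_caught_probes(void) {
    struct sigaction sa, saved;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NODEFER;          /* do not block the signal in the handler */
    sigaction(SIGUSR1, &sa, &saved);
    int caught = probe() + probe();
    sigaction(SIGUSR1, &saved, NULL);  /* restore the previous action */
    return caught;
}
```

Using `sigsetjmp`/`siglongjmp` with a saved mask would be another way to get the same effect; that variant is not shown here.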
|
From: Harry M. <hj...@ta...> - 2006-03-07 15:28:26
|
My apologies - I posted too fast. I missed some error messages in some of
my own stderr. Valgrind wins again.
(slams head against wall)
hjm
Julian Seward wrote:
> On Tuesday 07 March 2006 07:42, Harry Mangalam wrote:
>> I have an odd bug - it's actually a non-bug. My vanilla C/terminal app
>> runs fine when run under valgrind
>
> When you say 'runs fine', you mean V reports no errors at all?
>
> J
>
>
|
|
From: Tom H. <to...@co...> - 2006-03-07 14:28:28
|
In message <114...@po...>
Patrick Ohly <pat...@in...> wrote:
> In gdb, I see:
>
> #0 0x04148034 in __libc_write () from /lib/i686/libc.so.6
> #1 0x040469b8 in __DTOR_END__ () from /lib/i686/libpthread.so.0
> #2 0x0403f37d in __pthread_create_2_1 (thread=0xbefe9e30, attr=0x0,
> start_routine=0x8048520 <threadhandler>, arg=0x0) at pthread.c:673
I wonder how it is doing that...
> I'm not sure about this __DTOR_END__ and how the program ended up there.
> 0x403aa1c contains:
> 0x0403aa1c: jmp *0x18(%ebx)
That's the PLT entry that links the two shared libraries together
at a guess.
> These are the standard libraries as included in RH AS2.1 and I
> am not sure how exactly those are built, but the gcc 2.96 in that
> release does not use DWARF by default. I suppose readelf should
> be able to tell me what is really contained in
> /lib/i686/libpthread-0.9.so but I need some guidance what to
> watch out for - I have never dealt with debug information at
> that level.
>
> Regarding frame pointers, __pthread_create_2_1 starts with the
> usual code for frame pointers:
> #2 0x0403f37d in __pthread_create_2_1 (thread=0xbefe9e30, attr=0x0,
> start_routine=0x8048520 <threadhandler>, arg=0x0) at pthread.c:673
> Line number 673 out of range; pthread.c has 15 lines.
> (gdb) disass
> Dump of assembler code for function __pthread_create_2_1:
> 0x0403f2f0 <__pthread_create_2_1+0>: push %ebp
> 0x0403f2f1 <__pthread_create_2_1+1>: mov %esp,%ebp
> 0x0403f2f3 <__pthread_create_2_1+3>: push %esi
> 0x0403f2f4 <__pthread_create_2_1+4>: push %ebx
> 0x0403f2f5 <__pthread_create_2_1+5>: sub $0xa0,%esp
> ...
>
> libc however apparently is not compiled with frame pointers,
> unless I misinterpret something here:
> (gdb) disass __libc_write
> Dump of assembler code for function __libc_write:
> 0x04148020 <__libc_write+0>: push %ebx
> 0x04148021 <__libc_write+1>: mov 0x10(%esp,1),%edx
> 0x04148025 <__libc_write+5>: mov 0xc(%esp,1),%ecx
> 0x04148029 <__libc_write+9>: mov 0x8(%esp,1),%ebx
> 0x0414802d <__libc_write+13>: mov $0x4,%eax
> 0x04148032 <__libc_write+18>: int $0x80
> 0x04148034 <__libc_write+20>: pop %ebx
> ...
That is the problem then - as __libc_write does not construct a
stack frame, the PC will appear to be in that routine but when the
stack is unwound we will wind up in the caller of __pthread_create_2_1.
When there are no frame pointers unwinding the stack is hard (there
are tricks but I didn't realise gdb used them) unless the debug
info tells you how, and stabs can't do that (DWARF can).
The debugger we use (ups) does have tricks for unwinding the stack
when there is no frame pointer by searching the stack looking for
an address which appears to map to a call instruction targeting
the current routine - my colleague wrote the code in question and
it has been maintained by me more recently. That is really horrible
stuff though.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
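The skipped frame can be illustrated with a small simulation of frame-pointer unwinding. This is a simplified model and not how Valgrind or gdb are implemented: the `frame` struct, `unwind` and `backtrace_depth` are hypothetical, function names stand in for return addresses, and the PLT stub seen in the gdb output is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulated frame record: what a function built with frame pointers
   pushes on entry (saved caller frame pointer plus return address). */
struct frame {
    const struct frame *prev;   /* caller's frame record */
    const char *return_into;    /* function the return address lies in */
};

/* Walk the chain the way a frame-pointer unwinder would. `pc_func` names
   the function containing the current PC, `fp` is the frame-pointer
   register. Writes at most `max` names to `out`, returns the count. */
size_t unwind(const char *pc_func, const struct frame *fp,
              const char **out, size_t max) {
    size_t n = 0;
    if (n < max)
        out[n++] = pc_func;
    while (fp != NULL && fp->return_into != NULL && n < max) {
        out[n++] = fp->return_into;
        fp = fp->prev;
    }
    return n;
}

/* main -> __pthread_create_2_1 -> __libc_write. If the leaf pushes no
   frame record, the frame pointer still refers to its caller's record. */
size_t backtrace_depth(int leaf_has_frame, const char **out, size_t max) {
    const struct frame main_frame   = { NULL, NULL };
    const struct frame create_frame = { &main_frame, "main" };
    const struct frame write_frame  = { &create_frame, "__pthread_create_2_1" };
    const struct frame *fp = leaf_has_frame ? &write_frame : &create_frame;
    return unwind("__libc_write", fp, out, max);
}
```

With `leaf_has_frame` set to 0 the walk yields `__libc_write` followed directly by `main`, so `__pthread_create_2_1` never appears, which matches the missing libpthread frame discussed above.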
|
|
From: Patrick O. <pat...@in...> - 2006-03-07 13:32:12
|
On Tue, 2006-03-07 at 12:42 +0000, Tom Hughes wrote:
> In message <114...@po...>
> Patrick Ohly <pat...@in...> wrote:
>
> > On Tue, 2006-03-07 at 11:17 +0000, Tom Hughes wrote:
> >
> >> Some of them are virtually impossible to suppress as I recall
> >> as the write appears to come directly from the user program with
> >> no sign of libpthread at all...
> >
> > That's exactly the problem here: why is the stack backtrace
> > incomplete and doesn't list the write() as coming from
> > inside libpthread? When I use --db-attach on such a write
> > gdb correctly tells me that, but valgrind skips one
> > level in the callstack.
>
> I was assuming it was a tail call from libpthread to write that
> caused the stack frame to disappear.
In gdb, I see:
#0 0x04148034 in __libc_write () from /lib/i686/libc.so.6
#1 0x040469b8 in __DTOR_END__ () from /lib/i686/libpthread.so.0
#2 0x0403f37d in __pthread_create_2_1 (thread=0xbefe9e30, attr=0x0,
start_routine=0x8048520 <threadhandler>, arg=0x0) at pthread.c:673
According to /proc/*/maps #2 is also in libpthread.so. The 0x0403f37d
is after a normal call instruction, not a jmp:
0x0403f36f <__pthread_create_2_1+127>: mov 0x1f4(%ebx),%eax
0x0403f375 <__pthread_create_2_1+133>: mov (%eax),%ecx
0x0403f377 <__pthread_create_2_1+135>: push %ecx
0x0403f378 <__pthread_create_2_1+136>: call 0x403aa1c
0x0403f37d <__pthread_create_2_1+141>: add $0x10,%esp
0x0403f380 <__pthread_create_2_1+144>: inc %eax
0x0403f381 <__pthread_create_2_1+145>: je 0x403f3b0 <__pthread_create_2_1+192>
...
I'm not sure about this __DTOR_END__ and how the program ended up there.
0x403aa1c contains:
0x0403aa1c: jmp *0x18(%ebx)
gdb shows that it is feasible to detect the __pthread_create_2_1 call
in the stack backtrace; whether it is possible with reasonable effort
in valgrind is of course a different story.
> Do you have DWARF debugging available for glibc and libpthread? or
> have them built with frame pointers?
These are the standard libraries as included in RH AS2.1 and I
am not sure how exactly those are built, but the gcc 2.96 in that
release does not use DWARF by default. I suppose readelf should
be able to tell me what is really contained in
/lib/i686/libpthread-0.9.so but I need some guidance what to
watch out for - I have never dealt with debug information at
that level.
Regarding frame pointers, __pthread_create_2_1 starts with the
usual code for frame pointers:
#2 0x0403f37d in __pthread_create_2_1 (thread=0xbefe9e30, attr=0x0,
start_routine=0x8048520 <threadhandler>, arg=0x0) at pthread.c:673
Line number 673 out of range; pthread.c has 15 lines.
(gdb) disass
Dump of assembler code for function __pthread_create_2_1:
0x0403f2f0 <__pthread_create_2_1+0>: push %ebp
0x0403f2f1 <__pthread_create_2_1+1>: mov %esp,%ebp
0x0403f2f3 <__pthread_create_2_1+3>: push %esi
0x0403f2f4 <__pthread_create_2_1+4>: push %ebx
0x0403f2f5 <__pthread_create_2_1+5>: sub $0xa0,%esp
...
libc however apparently is not compiled with frame pointers,
unless I misinterpret something here:
(gdb) disass __libc_write
Dump of assembler code for function __libc_write:
0x04148020 <__libc_write+0>: push %ebx
0x04148021 <__libc_write+1>: mov 0x10(%esp,1),%edx
0x04148025 <__libc_write+5>: mov 0xc(%esp,1),%ecx
0x04148029 <__libc_write+9>: mov 0x8(%esp,1),%ebx
0x0414802d <__libc_write+13>: mov $0x4,%eax
0x04148032 <__libc_write+18>: int $0x80
0x04148034 <__libc_write+20>: pop %ebx
...
BTW, if I run the simple test program (one with pthread_create/join
in main) directly in gdb with a break point in __libc_write, I never
get such an unusual stack backtrace. Could that be an artifact of the
valgrind engine?
--
Best Regards, Patrick Ohly
The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.
|
|
From: Tom H. <to...@co...> - 2006-03-07 12:42:49
|
In message <114...@po...>
Patrick Ohly <pat...@in...> wrote:
> On Tue, 2006-03-07 at 11:17 +0000, Tom Hughes wrote:
>
>> Some of them are virtually impossible to suppress as I recall
>> as the write appears to come directly from the user program with
>> no sign of libpthread at all...
>
> That's exactly the problem here: why is the stack backtrace
> incomplete and doesn't list the write() as coming from
> inside libpthread? When I use --db-attach on such a write
> gdb correctly tells me that, but valgrind skips one
> level in the callstack.
I was assuming it was a tail call from libpthread to write that
caused the stack frame to disappear.
Do you have DWARF debugging available for glibc and libpthread? or
have them built with frame pointers?
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Patrick O. <pat...@in...> - 2006-03-07 12:33:38
|
On Tue, 2006-03-07 at 11:17 +0000, Tom Hughes wrote:
> In message <114...@po...>
> Patrick Ohly <pat...@in...> wrote:
>
> > I still think the handling of suid binaries called from inside
> > programs running under valgrind control should be investigated
> > further (#119404, listed as "possibly just close") and the
> > failing suppression of uninitialized memory reads inside Linux
> > pthreads has also irritated a few people on this list (#119446,
> > not listed).
>
> Keeping up with all the different builds of glibc that people use
> and which all seem to manage to give different backtraces is a
> major pain as far as the linuxthreads issue goes.
There are suppressions which used to work fine.
> Some of them are virtually impossible to suppress as I recall
> as the write appears to come directly from the user program with
> no sign of libpthread at all...
That's exactly the problem here: why is the stack backtrace
incomplete and doesn't list the write() as coming from
inside libpthread? When I use --db-attach on such a write
gdb correctly tells me that, but valgrind skips one
level in the callstack.
--
Best Regards, Patrick Ohly
The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.
|
|
From: Tom H. <to...@co...> - 2006-03-07 11:17:53
|
In message <114...@po...>
Patrick Ohly <pat...@in...> wrote:
> I still think the handling of suid binaries called from inside
> programs running under valgrind control should be investigated
> further (#119404, listed as "possibly just close") and the
> failing suppression of uninitialized memory reads inside Linux
> pthreads has also irritated a few people on this list (#119446,
> not listed).
Keeping up with all the different builds of glibc that people use
and which all seem to manage to give different backtraces is a
major pain as far as the linuxthreads issue goes.
Some of them are virtually impossible to suppress as I recall
as the write appears to come directly from the user program with
no sign of libpthread at all...
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Patrick O. <pat...@in...> - 2006-03-07 11:09:57
|
On Tue, 2006-03-07 at 02:02 +0000, Julian Seward wrote:
> I'd like to release 3.1.1 some time around Friday, if possible.
> Many bug fixes have been merged in now and I'm hoping to be in a
> code freeze state soon. A good summary of the bugs that have
> been fixed are in trunk/docs/internals/3_1_BUGSTATUS.txt.
I still think the handling of suid binaries called from inside
programs running under valgrind control should be investigated
further (#119404, listed as "possibly just close") and the
failing suppression of uninitialized memory reads inside Linux
pthreads has also irritated a few people on this list (#119446,
not listed). Both problems still occur with current SVN head.
--
Best Regards, Patrick Ohly
The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.
|
|
From: Julian S. <js...@ac...> - 2006-03-07 10:02:37
|
On Tuesday 07 March 2006 07:42, Harry Mangalam wrote:
> I have an odd bug - it's actually a non-bug. My vanilla C/terminal app
> runs fine when run under valgrind
When you say 'runs fine', you mean V reports no errors at all?
J
|
|
From: Harry M. <hj...@ta...> - 2006-03-07 07:46:47
|
I have an odd bug - it's actually a non-bug. My vanilla C/terminal app runs
fine when run under valgrind and also when compiled with the dmalloc memory
debugging libs - it loses some memory but it's acceptable at this point.
However, when recompiled without dmalloc and run normally under bash, it
generates a segfault. Run through ddd/gdb, I can tell where it's crashing,
but it's the oddest thing - I'm calloc'ing into what should be (and is,
according to gdb) a pointer that has been explicitly set to 0x0, and it
crashes. I don't even get a chance to check the returned pointer value. I
get the awful:
Program received signal SIGSEGV, Segmentation fault.
0xb7dc1523 in free () from /lib/tls/i686/cmov/libc.so.6
which seems a bit odd - I wasn't trying to free anything - I was trying to
calloc it. Does a calloc() automatically attempt to free the pointer first?
I found that the referenced /lib/tls.... libc.so.6 can sometimes interfere
with free() just as I describe, but after disabling it, I get the same
error message with the remaining /lib/libc.so.6
Has anyone else seen this bizarre behavior?
hjm
|