|
From: Jonathan G. U. <j.u...@op...> - 2005-08-31 13:54:29
|
Hi,

I have been running the gsl test suite (http://www.gnu.org/software/gsl/) under valgrind to check for memory errors. While it spotted a few valid ones (all trivial), it is also generating errors for code which is generated from templates, i.e. the same template is used to construct code for a number of different data types (int, unsigned int, float, double etc.), but valgrind is *only* throwing up errors when the code generated for the long double type is used. Below I give an example.

This is with valgrind 2.4.0, gcc 4.0.1, glibc 2.3.5 (i.e. fedora core 4).

Are there any known issues with valgrind and long doubles? Could this be a glibc issue? Any help would be fantastic.

Cheers,
Jonathan

make[2]: Entering directory `/home/jgu/gsl-devel/gsl-valgrind/vector'
==15339== Syscall param write(buf) points to uninitialised byte(s)
==15339==    at 0x503B03: __write_nocancel (in /lib/libc-2.3.5.so)
==15339==    by 0x4A6378: new_do_write (in /lib/libc-2.3.5.so)
==15339==    by 0x4A6474: _IO_do_write@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x4A5C6A: _IO_file_close_it@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x49CC55: fclose@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x804C952: test_complex_long_double_file (test_complex_source.c:433)
==15339==    by 0x8063828: main (test.c:205)
==15339==  Address 0x1B90D00A is not stack'd, malloc'd or (recently) free'd
==15339==
==15339== Syscall param write(buf) points to uninitialised byte(s)
==15339==    at 0x503B03: __write_nocancel (in /lib/libc-2.3.5.so)
==15339==    by 0x4A6378: new_do_write (in /lib/libc-2.3.5.so)
==15339==    by 0x4A7A0E: _IO_file_xsputn@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x49DF57: fwrite (in /lib/libc-2.3.5.so)
==15339==    by 0x80755D7: gsl_block_complex_long_double_raw_fwrite (fwrite_source.c:92)
==15339==    by 0x806794B: gsl_vector_complex_long_double_fwrite (file_source.c:36)
==15339==    by 0x804C949: test_complex_long_double_file (test_complex_source.c:431)
==15339==    by 0x8063828: main (test.c:205)
==15339==  Address 0x1BA61712 is 10 bytes inside a block of size 4200 alloc'd
==15339==    at 0x1B909222: malloc (vg_replace_malloc.c:130)
==15339==    by 0x807430D: gsl_block_complex_long_double_alloc (init_source.c:39)
==15339==    by 0x806393F: gsl_vector_complex_long_double_alloc (init_source.c:40)
==15339==    by 0x80639F5: gsl_vector_complex_long_double_calloc (init_source.c:64)
==15339==    by 0x80488E4: create_complex_long_double_vector (test_complex_source.c:33)
==15339==    by 0x804C87F: test_complex_long_double_file (test_complex_source.c:415)
==15339==    by 0x8063828: main (test.c:205)
==15339==
==15339== Syscall param write(buf) points to uninitialised byte(s)
==15339==    at 0x503B03: __write_nocancel (in /lib/libc-2.3.5.so)
==15339==    by 0x4A6378: new_do_write (in /lib/libc-2.3.5.so)
==15339==    by 0x4A6474: _IO_do_write@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x4A6D81: _IO_file_overflow@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x4A7991: _IO_file_xsputn@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==15339==    by 0x49DF57: fwrite (in /lib/libc-2.3.5.so)
==15339==    by 0x80755B8: gsl_block_complex_long_double_raw_fwrite (fwrite_source.c:107)
==15339==    by 0x806794B: gsl_vector_complex_long_double_fwrite (file_source.c:36)
==15339==    by 0x804C949: test_complex_long_double_file (test_complex_source.c:431)
==15339==    by 0x8063828: main (test.c:205)
==15339==  Address 0x1B90D00A is not stack'd, malloc'd or (recently) free'd
|
|
From: Julian S. <js...@ac...> - 2005-08-31 16:35:48
|
Interesting. I've used the gsl test suite quite extensively as
a really good way to shake out FP simulation bugs in Valgrind :-)
It strikes me that what you're seeing is due to gcc treating
arrays of long doubles as if each element had size 12. Try
this:
#include <stdio.h>
long double a[10];
int main ( void ) { printf("%zu\n", sizeof(a)); return 0; }
So the array which is written to the file looks to V as if it
has 10 bytes defined, 2 bytes garbage, 10 bytes defined, etc.
Which is why it complains.
You could "fix" this by zeroing out the array to start with
so that V thinks it is initialised completely. Either using
memset or by allocating it with calloc.
J
> ==15339== Address 0x1BA61712 is 10 bytes inside a block of size 4200
> alloc'd
> ==15339== at 0x1B909222: malloc (vg_replace_malloc.c:130)
> ==15339== by 0x807430D: gsl_block_complex_long_double_alloc
> (init_source.c:39)
> ==15339== by 0x806393F: gsl_vector_complex_long_double_alloc
> (init_source.c:40)
> ==15339== by 0x80639F5: gsl_vector_complex_long_double_calloc
> (init_source.c:64)
> ==15339== by 0x80488E4: create_complex_long_double_vector
> (test_complex_source.c:33)
> ==15339== by 0x804C87F: test_complex_long_double_file
> (test_complex_source.c:415)
> ==15339== by 0x8063828: main (test.c:205)
> ==15339==
> ==15339== Syscall param write(buf) points to uninitialised byte(s)
> ==15339== at 0x503B03: __write_nocancel (in /lib/libc-2.3.5.so)
> ==15339== by 0x4A6378: new_do_write (in /lib/libc-2.3.5.so)
> ==15339== by 0x4A6474: _IO_do_write@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
> ==15339== by 0x4A6D81: _IO_file_overflow@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
> ==15339== by 0x4A7991: _IO_file_xsputn@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
> ==15339== by 0x49DF57: fwrite (in /lib/libc-2.3.5.so)
> ==15339== by 0x80755B8: gsl_block_complex_long_double_raw_fwrite
> (fwrite_source.c:107)
> ==15339== by 0x806794B: gsl_vector_complex_long_double_fwrite
> (file_source.c:36)
> ==15339== by 0x804C949: test_complex_long_double_file
> (test_complex_source.c:431)
> ==15339== by 0x8063828: main (test.c:205)
> ==15339== Address 0x1B90D00A is not stack'd, malloc'd or (recently) free'd
|
|
From: John R.
|
> It strikes me that what you're seeing is due to gcc treating
> arrays of long doubles as if each element had size 12. Try
> this:
>
> #include <stdio.h>
> long double a[10];
> int main ( void ) { printf("%zu\n", sizeof(a)); return 0; }
>
> So the array which is written to the file looks to V as if it
> has 10 bytes defined, 2 bytes garbage, 10 bytes defined, etc.
> Which is why it complains.
Gcc for x86 has done this for many years.
memcheck for x86 probably should have a default option
to pretend that
FSTPT (DB /7) writes 12 bytes (instead of 10)
FBSTP (DF /6) writes 12 bytes (instead of 10)
and probably also
FSTCW (D9 /7) writes 4 bytes (instead of 2)
--
|
|
From: Jonathan G. U. <j.u...@op...> - 2005-08-31 18:43:52
|
Julian Seward wrote:
> Interesting. I've used the gsl test suite quite extensively as
> a really good way to shake out FP simulation bugs in Valgrind :-)
>
> It strikes me that what you're seeing is due to gcc treating
> arrays of long doubles as if each element had size 12. Try
> this:
>
> #include <stdio.h>
> long double a[10];
> int main ( void ) { printf("%zu\n", sizeof(a)); return 0; }
>
> So the array which is written to the file looks to V as if it
> has 10 bytes defined, 2 bytes garbage, 10 bytes defined, etc.
> Which is why it complains.
>
> You could "fix" this by zeroing out the array to start with
> so that V thinks it is initialised completely. Either using
> memset or by allocating it with calloc.
>
> J
Excellent, yes, now I see. Many thanks for the concise explanation.
Can't help wondering if there is any reason for gcc to do this? So, do
you mean that long doubles are stored with a 10 byte representation, and
not 12 bytes as the documentation would lead one to believe? Surely I'm
misunderstanding...
Jonathan.
|
|
From: Nicholas N. <nj...@cs...> - 2005-08-31 19:45:58
|
On Wed, 31 Aug 2005, Jonathan G. Underwood wrote:
> Excellent, yes, now I see. Many thanks for the concise explanation.
> Can't help wondering if there is any reason for gcc to do this? So, do you
> mean that long doubles are stored with a 10 byte representation, and not 12
> bytes as the documentation would lead one to believe? Surely I'm
> misunderstanding...
Long doubles on x86 are 80 bits (10 bytes) in the hardware and in memory.
The extra two bytes of padding added in memory would be there to improve
alignment, which can speed up memory accesses.
Nick
|
|
From: Julian S. <js...@ac...> - 2005-08-31 19:50:50
|
> Excellent, yes, now I see. Many thanks for the concise explanation.
> Can't help wondering if there is any reason for gcc to do this? So, do
> you mean that long doubles are stored with a 10 byte representation, and
> not 12 bytes as the documentation would lead one to believe? Surely I'm
> misunderstanding...
They are 10 bytes long, but when gcc deals with arrays of them, it
places them 12 bytes apart, that is, it puts in a 2-byte spacer
between each. The usual reason for this is that memory loads/
stores which are sufficiently misaligned (for some value of
"sufficiently") are often handled much more slowly by the hardware
than normal, because it means the hardware has to issue two
fetch/store requests to the L1 cache in the worst case, rather
than just one. Although in this case I don't think it'd help
much.
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-08-31 21:02:18
|
On Wed, 31 Aug 2005, Julian Seward wrote:
> They are 10 bytes long, but when gcc deals with arrays of them, it
> places them 12 bytes apart, that is, it puts in a 2-byte spacer
> between each. The usual reason for this is that memory loads/
> stores which are sufficiently misaligned (for some value of
> "sufficiently") are often handled much more slowly by the hardware
> than normal, because it means the hardware has to issue two
> fetch/store requests to the L1 cache in the worst case, rather
> than just one. Although in this case I don't think it'd help
> much.
It might also make address computations simpler? Eg. you could use
shifting and/or masking when powers of 2 are involved that you couldn't
when multiplying by 10.
Nick
|
|
From: Julian S. <js...@ac...> - 2005-08-31 21:34:34
|
> It might also make address computations simpler? Eg. you could use
> shifting and/or masking when powers of 2 are involved that you couldn't
> when multiplying by 10.
Well ... the x86 address mode expressions help somewhat. If %a
holds the array base pointer and %i is an array index, and %tmp is
a spare reg, then you can do
   leal (%i,%i,4), %tmp   // tmp = i + (i << 2) = 5 * i
   fldt (%a,%tmp,2)       // EA = a + (tmp << 1) = a + 10 * i
vs
   leal (%i,%i,2), %tmp   // tmp = i + (i << 1) = 3 * i
   fldt (%a,%tmp,4)       // EA = a + (tmp << 2) = a + 12 * i
so I guess there's nothing in it.
The funny thing is that rounding up to size 12 doesn't avoid the
problem that an element straddles a cache line in a cache with
lines >= 16 bytes long. Only rounding up to size 16 and forcing
the array base to be 16 aligned would help that.
J
|