Re: [SEToolkit-developer] Segmentation Fault during diskinfo.se on new SUN M5000
From: Alex K. <ale...@gm...> - 2007-09-21 13:51:55
On 21/09/2007, Jon Craig <can...@gm...> wrote:
> Wondering if anyone else has a SPARC64 VI based platform and/or
> Sol10-09/07 install. We are putting a new M5000 16-Core 32G machine
> into production this weekend and I've run into a spot of trouble with
> "se - Version 3.4 (03:59 PM 01/05/05) for sparcv9". When I run any
> example that exercises diskinfo.se I get a core dump. I've traced it
> down to the portion of diskinfo.se that is using readdir to get file
> names. I've extracted that code into the script (list_dir.se):
>
> main(int argc, string argv[]) {
>     string dirname;
>     ulong dirh;
>     ulong direnth;
>     int fdcnt;
>     dirent_t dirent;
>
>     dirname = sprintf(".");
>     if((dirh = opendir(dirname)) != NULL) {
>         for(fdcnt = 0; ((direnth = readdir(dirh)) != NULL);fdcnt++) {
>             dirent = *((dirent_t *) direnth);
>             printf("%04d -> %s\n", fdcnt, dirent.d_name);
>         }
>     }
> }
>
> When I run this with debug on I eventually get:
>
> fdcnt++;
> dirent = *((dirent_t *) direnth<18446744071550557872>)
> printf(<%04d -> %s\n>, fdcnt<178>, dirent.d_name<c4t5006048AD52EC107d146s0>)
> 0178 -> c4t5006048AD52EC107d146s0
> fdcnt++;
> dirent = *((dirent_t *) direnth<18446744071550557920>)
> printf(<%04d -> %s\n>, fdcnt<179>, dirent.d_name<c4t5006048AD52EC107d146s1>)
> 0179 -> c4t5006048AD52EC107d146s1
> fdcnt++;
> dirent = *((dirent_t *) direnth<18446744071550557968>)
> ksh: 17047 Segmentation Fault
>
> When I get a core dump on a directory, that directory will always dump
> core at the same place, so it's very reproducible. The only thing I
> can see is that the memory address referenced by direnth is fairly
> close to the 64-bit boundary, but fairly close shouldn't matter. Any
> thoughts or suggestions would be appreciated. As an aside, we've
> begun using Sun drivers and it breaks all the code that assumes the
> target will be an integer number. In fact, the target is now the WWN
> of the device and fairly long. I can send the full debug if anyone wants it.
>
That sounds a lot like the problem that is fixed in the SVN tree by
pulling the readdir code into se itself.
--
Alex Kiernan
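
For anyone reading the archive: the three direnth values in the debug output can be decoded with a few lines of C. This is only a sketch of the arithmetic, not an analysis of se itself. The 8 KB page size is an assumption (it is not stated in the thread), and the function names are made up for illustration. Under that assumption the entries are 48 bytes apart and the last address printed before the fault lies 240 bytes short of a page boundary. Whether se's dirent_t copy is larger than that is not something the thread says, so treat any link to the crash as a hypothesis.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed base page size (8 KB). Not stated in the thread. */
#define PAGE_SIZE_ASSUMED 8192ULL

/* Number of bytes from addr up to the next page boundary. */
uint64_t bytes_to_page_end(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0);
    return page_size - (addr % page_size);
}

/* Print each address in hex, its distance from the previous address,
 * and how many bytes remain before the next (assumed) page boundary. */
void describe_addresses(const uint64_t *addrs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t stride = (i > 0) ? addrs[i] - addrs[i - 1] : 0;
        printf("0x%016" PRIX64 "  stride=%" PRIu64 "  to_page_end=%" PRIu64 "\n",
               addrs[i], stride,
               bytes_to_page_end(addrs[i], PAGE_SIZE_ASSUMED));
    }
}

/* The direnth values copied from the debug output in the quoted message. */
const uint64_t direnth_values[3] = {
    18446744071550557872ULL,
    18446744071550557920ULL,
    18446744071550557968ULL
};
```

Calling describe_addresses(direnth_values, 3) from a main() prints one line per address.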
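
As a cross-check outside se, the same directory walk can be written in plain C. This is a minimal sketch, not the code from the SVN tree (which is not shown in this thread). It mirrors list_dir.se but reads each entry through the pointer that readdir() returns instead of copying the whole structure. If it lists a directory that makes list_dir.se dump core, the directory contents are probably not at fault.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Print every entry of dirname in the same format as list_dir.se.
 * Returns the number of entries printed, or -1 if opendir() fails. */
int list_dir(const char *dirname)
{
    assert(dirname != NULL);

    DIR *dirh = opendir(dirname);
    if (dirh == NULL) {
        perror(dirname);
        return -1;
    }

    int fdcnt = 0;
    struct dirent *direntp;

    /* Read d_name through the returned pointer; the entry is never
     * copied by value, so nothing beyond the record is touched. */
    while ((direntp = readdir(dirh)) != NULL) {
        printf("%04d -> %s\n", fdcnt, direntp->d_name);
        fdcnt++;
    }

    closedir(dirh);
    return fdcnt;
}
```

Call list_dir(".") from a main() to reproduce the original test, or pass the directory that triggers the core dump.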