From: Carl L. <ce...@li...> - 2024-04-23 15:52:32
Paul:
I have been digging some more with gdb. I have also put in some print statements to try to figure out when, and on which syscall, the issue occurs. The issue occurs after a system call has been processed and we have returned to running the user code. We hit the seg fault while running the user code.
Valgrind calls VG_(client_syscall) in syswrap-main.c to process a system call. The function calls putSyscallStatusIntoGuestState twice as part of processing the system call. The number of calls to putSyscallStatusIntoGuestState before the seg fault varies from run to run, even when just running valgrind. I see either 2474 or 2478 calls to the function while processing system call 90, or 3604 or 3608 calls for system call 6, before we hit the seg fault. I am puzzled by the inconsistency in the number of calls and in the system call number before the segmentation fault. It doesn't "feel" like the system call processing per se is the issue, but maybe some other issue; I am just guessing. Also, when the failure occurs, it isn't the first time that system call has been handled. It looks like that system call has been processed 10 to 20 times previously without a failure. I don't know if that helps or not.
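For reference, the counting described above can be modelled roughly as follows. This is only a stand-alone sketch, not the actual change made to syswrap-main.c: the names note_status_write, report, and MAX_SYSNO are hypothetical, and it uses stdio so that it compiles outside of Valgrind (inside Valgrind the output would have to go through VG_(printf) instead).

```c
#include <stdio.h>

/* Hypothetical stand-alone model of the counting instrumentation.
   None of these names exist in Valgrind. */
#define MAX_SYSNO 512

static unsigned long total_calls;           /* status writes seen so far */
static unsigned long per_sysno[MAX_SYSNO];  /* status writes per syscall number */
static int last_sysno = -1;                 /* syscall handled most recently */

/* Call once per status write, passing the syscall number being handled. */
static void note_status_write(int sysno)
{
    total_calls++;
    last_sysno = sysno;
    if (sysno >= 0 && sysno < MAX_SYSNO)
        per_sysno[sysno]++;
}

/* Print the running totals; the last line logged before a fault then
   shows the call count and the syscall number that preceded it. */
static void report(FILE *out)
{
    fprintf(out, "calls=%lu last_sysno=%d", total_calls, last_sysno);
    if (last_sysno >= 0 && last_sysno < MAX_SYSNO)
        fprintf(out, " status_writes_for_sysno=%lu", per_sysno[last_sysno]);
    fputc('\n', out);
}
```

Calling note_status_write(sysno) followed by report(stderr) at each status write would give one log line per call. The per-syscall counter is what shows whether the failing system call had already been handled earlier in the run.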
If you have any thoughts as to a possible root cause, please let me know and I will look into it. Thanks.
Carl
On 4/22/24 15:08, Carl Love via Valgrind-developers wrote:
>
> Paul:
>
> I have isolated the issue to the statement
>
> import numpy as np
>
> in the file bench_hnsw-short.py. I can reproduce the failure by deleting all of the lines in bench_hnsw-short.py except for "import numpy as np". I tried to reproduce the issue on X86-64. It seems to work just fine on X86. I tried -v with valgrind to see if that gives any hints as to the issue. Not seeing anything that is helpful. Not sure if there is some other valgrind debug mode for callgrind that would be insightful.
>
> Carl
>
>
> On 4/22/24 08:16, Carl Love via Valgrind-developers wrote:
>> Paul:
>>
>> I do not get the error with --tool=none or with memcheck. I only seem to see it with callgrind.
>>
>> Carl
>>
>> On 4/19/24 22:14, Paul Floyd via Valgrind-developers wrote:
>>>
>>>
>>> On 19-04-24 15:33, Carl Love via Valgrind-developers wrote:
>>>>
>>>> Valgrind developers:
>>>>
>>>> I ran into an issue with the valgrind callgrind tool running a python AI benchmark:
>>>
>>> Hi Carl,
>>>
>>> Do you get the same error with --tool=none (or memcheck)?
>>>
>>> A+
>>> Paul
>>>
>>>
>>> _______________________________________________
>>> Valgrind-developers mailing list
>>> Val...@li...
>>> https://lists.sourceforge.net/lists/listinfo/valgrind-developers