|
From: faustina S. <fau...@gm...> - 2017-08-18 14:36:02
|
I have written a lengthy C function and call it from my R script inside a repeat loop. The loop runs properly through the 5th iteration; in the 6th iteration, RStudio crashes. The last line executed before the crash is the third line after the function call, where I assign a list returned from the C function to an R object.

I tried debugging it with valgrind (this is the first time I am using valgrind), but I am unable to interpret the output. I have pasted a few lines of the output below. I want to know whether this is a memory leak in the C code or a problem in R. Please let me know if additional information is required, and guide me through how to debug this.

==8061== Invalid read of size 1
==8061==    at 0x4EC0F63: ??? (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4EC26B1: ??? (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F75B7C: ??? (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F76105: ??? (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F34D2A: ??? (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F40F2F: Rf_eval (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F4296D: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F3CE92: ??? (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F40F2F: Rf_eval (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F4296D: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F410CC: Rf_eval (in /usr/lib/R/lib/libR.so)
==8061==    by 0x4F45CCE: ??? (in /usr/lib/R/lib/libR.so)
==8061==  Address 0x4231f7a3c8360000 is not stack'd, malloc'd or (recently) free'd
==8061==
 *** caught segfault ***
address (nil), cause 'unknown'
aborting ...
==8061==
==8061== Process terminating with default action of signal 11 (SIGSEGV)
==8061==    at 0x53FD269: raise (pt-raise.c:35)
==8061==    by 0x53FD38F: ??? (in /lib/x86_64-linux-gnu/libpthread-2.23.so)
==8061==    by 0x4EC0F62: ??? (in /usr/lib/R/lib/libR.so)
==8061==
==8061== HEAP SUMMARY:
==8061==     in use at exit: 110,953,432 bytes in 73,516 blocks
==8061==   total heap usage: 125,391 allocs, 51,875 frees, 228,027,816 bytes allocated
==8061== LEAK SUMMARY:
==8061==    definitely lost: 1,183,860 bytes in 1,355 blocks
==8061==    indirectly lost: 1,220,908 bytes in 16,474 blocks
==8061==      possibly lost: 2,388 bytes in 3 blocks
==8061==    still reachable: 108,546,276 bytes in 55,684 blocks
==8061==         suppressed: 0 bytes in 0 blocks
==8061==
==8061== For counts of detected and suppressed errors, rerun with: -v
==8061== ERROR SUMMARY: 38 errors from 38 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

Thanks,
Faustina
|
From: John R. <jr...@bi...> - 2017-08-18 16:24:14
|
> ==8061== Invalid read of size 1
> ==8061== at 0x4EC0F63: ??? (in /usr/lib/R/lib/libR.so)
...
> ==8061== Address 0x4231f7a3c8360000 is not stack'd, malloc'd or (recently) free'd
Is that the first complaint from valgrind while running the program?
It's better to work on the first complaint. In general, anything after
the first complaint from valgrind (memcheck) is less trustworthy.
The error that memcheck detected might taint anything that is derived from it.
Anyway, the libR.so code at 0x4EC0F63 tried to fetch one byte from address 0x4231f7a3c8360000
which was not a valid address. Valgrind complained, then let the program
fetch from that address, so the program got SIGSEGV because there was nothing
in the address space at that address. The Rstudio code caught the SIGSEGV
and aborted execution.
One way to get more information is to invoke valgrind with:
    valgrind --vgdb-error=0 other_valgrind_args /path/to/Rstudio Rstudio_args
and follow the directions. Open another terminal window, run gdb on /path/to/Rstudio,
then copy+paste the "target remote ..." command into the gdb session. Then enter:
    (gdb) continue
When valgrind complains, the gdb session will gain control. Look around:
    (gdb) info proc                  ## get the PID
    (gdb) shell cat /proc/PID/maps   ## look at the address space mappings
and so on. If this does not give good clues then it may be necessary
to insert debugging printf() into your C code in order to see what is happening.
Please tell us the valgrind version ("valgrind --version") and the hardware architecture.
--
|
|
From: John R. <jr...@bi...> - 2017-08-21 12:12:17
|
> ==8061== Invalid read of size 1
> ==8061== at 0x4EC0F63: ??? (in /usr/lib/R/lib/libR.so)
>
> ...
>> ==8061== Address 0x4231f7a3c8360000 is not stack'd, malloc'd or (recently) free'd
>
> yes, this is the first complaint from valgrind.
Running on valgrind-3.11.0 on x86_64 under Linux under Xen [as shown
elsewhere, which I have not quoted.]
The latest version of valgrind is valgrind-3.13.0 (two versions newer
than 3.11.0), so you should upgrade; although I would guess that
it might not make any difference in this particular case.
>
> When the code crashes, the gdb terminal takes control and gives the following message:
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0x0000000004ec0f63 in ?? () from /usr/lib/R/lib/libR.so
>
> Executing the command "(gdb) shell cat /proc/1668/maps" gives the following information (I've taken an excerpt from the message):
>
> 00400000-00401000 r-xp 00000000 ca:01 108910 /usr/lib/R/bin/exec/R
[[snip]]
> 052f4000-052ff000 rw-p 002ba000 ca:01 108931 /usr/lib/R/lib/libR.so
The hope was that "cat /proc/PID/maps" might show something with an address
that is related to 0x4231f7a3c8360000; but I don't see anything like that
in what you report. When interpreted as a 64-bit integer,
the bits 0x4231f7a3c8360000 are approximately 4.76987e+18,
which looks "random" to me. However, the least-significant bits
are 16 zero bits (0x...0000) which does look less-than-random.
When interpreted as an IEEE 754 double precision number, the same bits
are exactly 77169150006.0, a whole number, which would be consistent with
a floating-point value ending up where a pointer was expected.
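A quick way to check both readings of that bit pattern is a small C helper. This is only a sketch: the function names are made up for illustration, and it assumes a platform where double is a 64-bit IEEE 754 value stored with the same byte order as uint64_t (true on x86_64). Read as an unsigned integer, the pattern is about 4.76987e+18; read as a double, it comes out as exactly 77169150006.0.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit pattern as a double.
 * memcpy is used instead of a pointer cast to avoid aliasing problems.
 * Assumes double is a 64-bit IEEE 754 type. */
double bits_as_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: convert the same pattern numerically,
 * i.e. treat it as a plain unsigned integer. */
double bits_as_integer_value(uint64_t bits)
{
    return (double)bits;
}

/* Hypothetical helper: returns 1 if the 16 least-significant bits are zero. */
int low16_bits_are_zero(uint64_t bits)
{
    return (bits & 0xFFFFu) == 0;
}
```

For example, printing bits_as_double(0x4231f7a3c8360000ULL) with printf("%.1f\n", ...) shows the floating-point reading, and printing bits_as_integer_value() of the same pattern with "%g" shows the integer reading. A bad address that turns out to be a "round" double is a hint, not proof, that a numeric value was used as a pointer somewhere.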
In your first message of this thread:
> I have written a lengthy C function and called it in my R script
> within a repeat loop, up till 5th iteration, the loop executes properly.
> In the 6th iteration, my R studio crashes. The last line that is executed
> before the crash is third line after the function call. In this line
> I am assigning a list from the C function to an R object.
"... assigning a list from the C function to an R object" might require
special handling. What does the documentation for Rstudio say
about such conversions involving non-atomic data (a list, as opposed
to a floating-point value)? Your function might have to build the list
one element at a time, or call a special function inside Rstudio
in order to allocate space for the list. A plain "malloc(...)"
in your C function might not be good enough to interoperate with Rstudio.
So at this point, it seems to me that you should find an example
where someone else has a plugin C function that creates a list of
floating-point values [possibly using malloc() in C], and returns
that list to Rstudio. Adapt your code based on that example.
In particular, construct the list using the same method. The
numerical values of the elements will be different, of course.
--
|