I am getting repeated crashes on my three OpenSIPS 1.8.2 servers, and some crashes on my 1.7.1 server as well.
The problem appears to be more noticeable when profile_get_values is called frequently (every few seconds), but I can't confirm that this command on its own causes the crash.
Please see the attached backtrace:
(gdb) bt full
Hi!
The core you attached does not contain any indication that the dialog profiles are the cause of the crash; it points instead to some statistics you are fetching through MI. Can you provide us with the exact MI command you are running?
Best regards,
Răzvan
The only parameter I have programmed in for get_statistics is "all"; the command is run through xmlrpc.
Hi,
1) Do you see any error message in the logs before the crash (from the same process)?
2) in gdb, frame 0, please print : brother, *brother, name, value, *brother->next .
Thanks and regards,
Bogdan
Hi Bogdan,
There was nothing relevant in the log files; it just crashes without any message.
I am not sure if I have run the commands correctly, but I get the following:
(gdb) f 0 brother
#0 0x7663723a65726f63 in ?? ()
(gdb) f 0 *brother
Cannot access memory at address 0x7663723a65726f63
(gdb) f 0 name
value has been optimised out
(gdb) f 0 value
value has been optimised out
(gdb) f 0 *brother->next
Cannot access memory at address 0x7663723a65726f93
Sorry if this is not correct.
Kind Regards
Jonathan
Jonathan,
Once you are in gdb, run the commands:
f 0
p brother
p *brother
p name
p value
p *brother->name
Regards,
Bogdan
Hi Bogdan,
Thanks for the explanation. Please see the following output:
(gdb) f 0
#0 add_next (flags=2, value_len=26, value=<optimised out>, name_len=<optimised out>, name=<optimised out>, brother=0x7663723a65726f63) at mi/tree.c:184
184 brother->last->next = new;
(gdb) p brother
$1 = (struct mi_node *) 0x7663723a65726f63
(gdb) p *brother
Cannot access memory at address 0x7663723a65726f63
(gdb) p name
$2 = <optimised out>
(gdb) p value
$3 = <optimised out>
(gdb) p *brother->name
Structure has no component named operator*.
Kind Regards
Jonathan.
Ok, it seems the "brother" pointer has been overwritten: it is not a valid memory address, but its bytes decoded as ASCII read "core:rcv", which is the start of a statistic name.
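The decoding can be checked by hand. Here is a minimal sketch in Python, assuming a little-endian x86-64 host, so that the lowest byte of the pointer value is the first byte in memory. The helper name pointer_as_ascii is made up for illustration and is not part of OpenSIPS or gdb:

```python
def pointer_as_ascii(value: int, width: int = 8) -> str:
    """Return the bytes a pointer-sized integer occupies in little-endian
    memory as text, with non-printable bytes shown as '.'."""
    raw = value.to_bytes(width, byteorder="little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# The corrupted 'brother' pointer reported by gdb in frame 0.
print(pointer_as_ascii(0x7663723a65726f63))  # prints: core:rcv
```

A pointer whose bytes are all printable ASCII is a strong hint that a string was written over the structure holding it.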
Please try more:
f 1
p parent
p *parent
f 4
p *rpl
Also, is there any way to get access to the core file to speed up the debugging?
Regards,
Hi Bogdan,
The results of the commands you requested are as follows:
(gdb) f 1
#1 add_mi_node_child (value_len=26, value=<optimised out>, name_len=<optimised out>, name=<optimised out>, flags=2, parent=<optimised out>) at mi/tree.c:219
219 return add_next(parent->kids, name, name_len, value, value_len, flags);
(gdb) p parent
$1 = <optimised out>
(gdb) p *parent
value has been optimised out
(gdb) f 4
#4 mi_get_stats (cmd=<optimised out>, param=<optimised out>) at statistics.c:535
535 if (mi_add_module_stats( rpl, &collector->amodules[i] )!=0)
(gdb) p *rpl
$2 = {value = {s = 0x2 <Address 0x2 out of bounds>, len = 0}, name = {s = 0x0, len = -840856056}, flags = 0, kids = 0x7663723a65726f63, next = 0x7365696c7065725f,
last = 0x3730393134203d20, attributes = 0x33}
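The pointer fields in this output can be decoded in the same way as the "brother" pointer. Below is a rough sketch in Python, assuming a little-endian x86-64 host and assuming that kids, next, last and attributes sit next to each other in memory in the order gdb prints them. The function name overwritten_text is hypothetical:

```python
# Field values copied from the 'p *rpl' output, in the order gdb prints them.
fields = {
    "kids": 0x7663723a65726f63,
    "next": 0x7365696c7065725f,
    "last": 0x3730393134203d20,
    "attributes": 0x33,
}


def overwritten_text(pointers) -> str:
    """Concatenate 8-byte little-endian pointer values and read them as text."""
    raw = b"".join(p.to_bytes(8, byteorder="little") for p in pointers)
    # Trailing NUL bytes are padding after the end of the string.
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


text = overwritten_text(fields.values())
print(text)  # prints: core:rcv_replies = 419073
```

Under those assumptions the node appears to have been overwritten with a formatted statistic line (name, " = ", value), which fits the earlier observation that the bad pointer reads as "core:rcv".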
The dump is 2.1GB, I will get this somewhere you can access it.
Kind Regards
Jonathan
As an update here, for everyone.
The problem seems to be caused by the newer versions of the libxmlrpc-c3 library (later than 1.06.42). These versions do not work properly with opensips because of the way they use threads, and this leads to memory corruption in opensips.
We will need to rework the mi_xmlrpc module to also support the new versions, or at least to refuse to compile / link against them.
Regards,
Bogdan