Performing contact dumps from memory with commands such as:
# openserctl ul show
or
# openserctl online
produces the following error:
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: DBG:mi_fifo:mi_parse_node: end of input tree
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: DBG:mi_fifo:mi_fifo_server: done parsing the mi tree
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: ERROR:core:add_mi_attr: no more pkg mem (24)
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: ERROR:mi_fifo:mi_fifo_server: command (ul_dump) processing failed
This shows up after 20 days of openser-1.3.2 operation with a couple of thousand contacts.
--
Antanas Masevicius
email: antanas.masevicius@ntt.lt
Logged In: YES
user_id=1246013
Originator: NO
This might be memory fragmentation. How many times did you run openserctl ul show in this period?
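To illustrate what fragmentation means here, the following is a toy model only, not OpenSER's allocator: a fixed pool of 16 units with a first-fit allocator. All names (`pool_alloc`, `pool_free`, `pool_free_units`, `fragmentation_demo`, `POOL_UNITS`) are hypothetical. It shows how a request can fail even though half of the pool is free, because no free run is long enough.

```c
#include <assert.h>
#include <string.h>

#define POOL_UNITS 16                  /* hypothetical pool size, in units */

static unsigned char used[POOL_UNITS]; /* 1 = unit is allocated */

/* First-fit: claim the first run of n contiguous free units.
 * Returns the start index, or -1 if no such run exists. */
int pool_alloc(int n)
{
    int run = 0;
    assert(n > 0 && n <= POOL_UNITS);
    for (int i = 0; i < POOL_UNITS; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == n) {
            int start = i - n + 1;
            memset(&used[start], 1, (size_t)n);
            return start;
        }
    }
    return -1;
}

/* Release n units starting at index start. */
void pool_free(int start, int n)
{
    assert(start >= 0 && n > 0 && start + n <= POOL_UNITS);
    memset(&used[start], 0, (size_t)n);
}

/* Total number of free units, contiguous or not. */
int pool_free_units(void)
{
    int count = 0;
    for (int i = 0; i < POOL_UNITS; i++)
        count += !used[i];
    return count;
}

/* Fill the pool with 1-unit blocks, free every other block,
 * then ask for 2 contiguous units. */
int fragmentation_demo(void)
{
    memset(used, 0, sizeof used);
    for (int i = 0; i < POOL_UNITS; i++)
        pool_alloc(1);
    for (int i = 0; i < POOL_UNITS; i += 2)
        pool_free(i, 1);
    return pool_alloc(2);
}
```

Calling `fragmentation_demo()` from a `main` returns -1 while `pool_free_units()` reports 8: half the pool is free, yet a 2-unit request cannot be satisfied. A real allocator is far more elaborate, but a "no more pkg mem" error for a small request after long uptime would be consistent with this kind of pattern.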
Logged In: YES
user_id=337916
Originator: NO
Hi,
if I remember correctly, Dan also reported this problem some time ago on the devel list. Perhaps we really should investigate getting rid of this pkg_mem pool and replacing it with plain malloc.
In the meantime it's probably possible to work around or delay this problem by increasing the pkg memory pool size, or am I wrong?
Henning
Logged In: NO
hello miconda,
I've been running "openserctl online | wc" pretty frequently, let's say 50 to 100 times in that period.
--
Antanas Masevicius
email: antanas.masevicius@ntt.lt
Logged In: YES
user_id=1246013
Originator: NO
Try compiling with -DQM_JOIN_FREE and see if you run into the same situation. You can add that define in Makefile.defs; there is a section where the defines are listed.
Logged In: YES
user_id=337916
Originator: NO
Hi Antanas,
any updates on this bug? Did the recompile with QM_JOIN_FREE help you?
BTW, if you're really adventurous (and have a good QA process), you could also try recompiling without PKG_MALLOC defined. The server will then use the system malloc instead of the fixed-size pool (you need to use a recent branch for this; I did a fix for this case). Performance will probably be somewhat worse, but if you then still have problems with fragmentation, you need to blame the Linux kernel/glibc. :-)
Henning
Hi Antanas,
any updates on this? Thanks,
Henning
Hi henningw,
still monitoring the situation. I have not yet been able to recompile with that flag because of production system uptime requirements, but I'm still planning to do that. I'll post results as soon as I am able to determine the outcome.
Antanas
NTT
Hi,
after compiling with -DQM_JOIN_FREE and running for a month or so, I started getting the same errors as noted in my original submission. I guess there is not much improvement here. Still, I'll try running the latest 1.4.2 after I test it in my testbed and post results then. Are there any changes in 1.4.2 related to this problem, or should I try disabling PKG_MALLOC before trying?
We are facing the same problem on an old openser-1.2.0 with these compile options:
Makefile.defs :
DEFS+= $(extra_defs) \
-DNAME='"$(MAIN_NAME)"' -DVERSION='"$(RELEASE)"' -DARCH='"$(ARCH)"' \
-DOS='"$(OS)"' -DCOMPILER='"$(CC_VER)"' -D__CPU_$(ARCH) -D__OS_$(OS) \
-D__SMP_$(ISSMP) -DCFG_DIR='"$(cfg-target)"'\
-DPKG_MALLOC \
-DSHM_MEM -DSHM_MMAP \
-DUSE_IPV6 \
-DUSE_MCAST \
-DUSE_TCP \
-DDISABLE_NAGLE \
-DHAVE_RESOLV_RES \
-DSTATISTICS \
-DF_MALLOC \
#-DDBG_QM_MALLOC \
#-DDBG_F_MALLOC \
#-DNO_DEBUG \
#-DNO_LOG \
#-DVQ_MALLOC \
#-DDBG_LOCK \
#-DNOSMP \
#-DEXTRA_DEBUG \
#-DUSE_SHM_MEM
And config.h :
/*used only if PKG_MALLOC is defined*/
#define PKG_MEM_POOL_SIZE 1024*1024
/*used if SH_MEM is defined*/
#define SHM_MEM_SIZE 32
In case this gives a clue...
Hi Antanas,
ok, unfortunately -DQM_JOIN_FREE does not help in your case. There are no fixes in the 1.4 branch related to this problem yet. So if you can't work around this ul_dump problem (e.g. by using ul_show_contact instead, or by querying the DB directly in a non-cache DB mode), then I'd suggest that you try with PKG_MALLOC disabled.
If this problem only happens once a month, then another option is to delay it further by increasing the PKG_MEM pool size.
Henning
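As a sketch of the pool-size workaround, assuming the pool is still set through PKG_MEM_POOL_SIZE in config.h as in the 1.2.0 excerpt quoted earlier (other versions may differ, so check your source tree), the change could look like this. The 4 MB value is only an illustrative assumption, not a recommendation from this thread.

```c
/* config.h (sketch) */

/* used only if PKG_MALLOC is defined */
/* Assumed value for illustration: 4 MB instead of the 1 MB quoted above. */
#define PKG_MEM_POOL_SIZE 4*1024*1024
```

This needs a recompile, and since pkg memory is private to each process, the extra memory is taken once per OpenSER process. As noted above, a larger pool would only delay the error, not remove its cause.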