#501 pkg memory management

ver 1.3.x
open
nobody
core (125)
5
2008-08-13
2008-08-13
No

Dumping contacts from memory with commands such as:

# openserctl ul show
or
# openserctl online

produces the following error:

> Aug 11 17:25:06 test /usr/sbin/openser[15267]: DBG:mi_fifo:mi_parse_node: end of input tree
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: DBG:mi_fifo:mi_fifo_server: done parsing the mi tree
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: ERROR:core:add_mi_attr: no more pkg mem (24)
> Aug 11 17:25:06 test /usr/sbin/openser[15267]: ERROR:mi_fifo:mi_fifo_server: command (ul_dump) processing failed

This shows up after 20 days of openser-1.3.2 operation with a couple of
thousand contacts.

--
Antanas Masevicius
email: antanas.masevicius@ntt.lt

Discussion

  • Daniel-Constantin Mierla

    Logged In: YES
    user_id=1246013
    Originator: NO

    This might be memory fragmentation. How many times did you run openserctl ul show in this period?

     
  • Henning Westerholt

    Logged In: YES
    user_id=337916
    Originator: NO

    Hi,

    If I remember correctly, Dan also reported this problem some time ago on the devel list. Perhaps we really should investigate getting rid of this pkg_mem pool and replacing it with plain malloc.

    In the meantime it is probably possible to work around or delay this problem by increasing the pkg memory pool size, or am I wrong?

    Henning

     
  • Nobody/Anonymous

    Logged In: NO

    hello miconda,

    I've been running "openserctl online | wc" pretty frequently, let's say 50 to 100 times in that period.

    --
    Antanas Masevicius
    email: antanas.masevicius@ntt.lt

     
  • Daniel-Constantin Mierla

    Logged In: YES
    user_id=1246013
    Originator: NO

    Try compiling with -DQM_JOIN_FREE and see if you run into the same situation. You can add that define in Makefile.defs, in the section where the defines are listed.
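
The fragmentation theory and the join-on-free idea can be illustrated with a toy first-fit pool. This is only a sketch and not OpenSER's allocator: every name (toy_pool, toy_alloc, toy_free, toy_demo) and the 64-byte pool size are hypothetical. It shows that a request can fail although the total free space is sufficient, unless adjacent free fragments are joined when they are freed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_POOL_SIZE 64   /* hypothetical pool size, in bytes */
#define TOY_MAX_FRAGS 64   /* hypothetical limit on tracked fragments */

struct toy_frag { size_t off; size_t size; int used; };

struct toy_pool {
    struct toy_frag frags[TOY_MAX_FRAGS]; /* kept in address order */
    size_t nfrags;
    int join_free;                        /* 1 = merge neighbours on free */
};

static void toy_init(struct toy_pool *p, int join_free)
{
    p->frags[0].off = 0;
    p->frags[0].size = TOY_POOL_SIZE;
    p->frags[0].used = 0;
    p->nfrags = 1;
    p->join_free = join_free;
}

/* First-fit allocation; returns the offset in the pool, or -1 on failure. */
static long toy_alloc(struct toy_pool *p, size_t size)
{
    size_t i;
    if (size == 0) return -1;
    for (i = 0; i < p->nfrags; i++) {
        struct toy_frag *f = &p->frags[i];
        if (f->used || f->size < size) continue;
        if (f->size > size && p->nfrags < TOY_MAX_FRAGS) {
            /* Split: make room for the remainder right after this fragment. */
            memmove(&p->frags[i + 2], &p->frags[i + 1],
                    (p->nfrags - i - 1) * sizeof(*f));
            p->frags[i + 1].off = f->off + size;
            p->frags[i + 1].size = f->size - size;
            p->frags[i + 1].used = 0;
            f->size = size;
            p->nfrags++;
        }
        f->used = 1;
        return (long)f->off;
    }
    return -1; /* no single fragment is large enough */
}

/* Merge fragment i+1 into fragment i. */
static void toy_merge_next(struct toy_pool *p, size_t i)
{
    p->frags[i].size += p->frags[i + 1].size;
    memmove(&p->frags[i + 1], &p->frags[i + 2],
            (p->nfrags - i - 2) * sizeof(p->frags[0]));
    p->nfrags--;
}

/* Free the fragment at the given offset; returns 0, or -1 if not found. */
static int toy_free(struct toy_pool *p, long off)
{
    size_t i;
    for (i = 0; i < p->nfrags; i++)
        if (p->frags[i].used && (long)p->frags[i].off == off) break;
    if (i == p->nfrags) return -1;
    p->frags[i].used = 0;
    if (p->join_free) {
        if (i + 1 < p->nfrags && !p->frags[i + 1].used) toy_merge_next(p, i);
        if (i > 0 && !p->frags[i - 1].used) toy_merge_next(p, i - 1);
    }
    return 0;
}

/* Total number of free bytes, regardless of how they are split up. */
static size_t toy_free_bytes(const struct toy_pool *p)
{
    size_t i, total = 0;
    for (i = 0; i < p->nfrags; i++)
        if (!p->frags[i].used) total += p->frags[i].size;
    return total;
}

/* Allocate eight 8-byte blocks, free them all, then ask for 16 bytes.
 * Returns 1 if the 16-byte request succeeds, 0 otherwise. */
static int toy_demo(int join_free)
{
    struct toy_pool p;
    long offs[8];
    int i;
    toy_init(&p, join_free);
    for (i = 0; i < 8; i++) offs[i] = toy_alloc(&p, 8);
    for (i = 0; i < 8; i++) toy_free(&p, offs[i]);
    assert(toy_free_bytes(&p) == TOY_POOL_SIZE); /* everything is free again */
    return toy_alloc(&p, 16) >= 0;
}
```

Without joining, toy_demo(0) leaves eight separate 8-byte fragments, so the 16-byte request fails; with joining, toy_demo(1) ends up with one 64-byte fragment again. A real workload with mixed sizes and lifetimes can still fragment even when neighbours are joined.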

     
  • Henning Westerholt

    Logged In: YES
    user_id=337916
    Originator: NO

    Hi Antanas,

    Any updates on this bug? Did the recompile with QM_JOIN_FREE help?

    BTW, if you're really adventurous (and have a good QA process), you could also try recompiling without PKG_MALLOC defined. The server will then use the system malloc instead of the fixed-size pool (you need to use a recent branch for this; I did a fix for this case). Performance will probably be somewhat worse, but if you then still have problems with fragmentation, you need to blame the Linux kernel/glibc. :-)

    Henning
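
As a rough illustration of what such a compile-time switch looks like, here is a minimal sketch. It is not OpenSER's code: the SKETCH_PKG_MALLOC define, the sketch_* names and the trivial bump allocator standing in for the fixed-size pool are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Build with -DSKETCH_PKG_MALLOC (hypothetical define) to use the fixed
 * pool; without it, requests go straight to the system malloc. */
#ifdef SKETCH_PKG_MALLOC

#define SKETCH_POOL_SIZE (1024 * 1024)

/* The union only forces a suitable alignment for the byte array. */
static union {
    unsigned char bytes[SKETCH_POOL_SIZE];
    long double align_ld;
    void *align_ptr;
} sketch_pool;
static size_t sketch_used;

/* Stand-in pool allocator: hands out 16-byte aligned chunks and never
 * reuses them. A real pool allocator would keep free lists. */
static void *sketch_pool_malloc(size_t size)
{
    size_t aligned = (size + 15u) & ~(size_t)15u;
    void *p;
    if (aligned < size || aligned > SKETCH_POOL_SIZE - sketch_used)
        return NULL; /* the "no more pkg mem" case */
    p = sketch_pool.bytes + sketch_used;
    sketch_used += aligned;
    return p;
}

static void sketch_pool_free(void *p) { (void)p; }

#define sketch_pkg_malloc(s) sketch_pool_malloc(s)
#define sketch_pkg_free(p)   sketch_pool_free(p)

#else /* system allocator */

#define sketch_pkg_malloc(s) malloc(s)
#define sketch_pkg_free(p)   free(p)

#endif

/* Allocate n bytes (n > 0), write to them, and release them again.
 * Returns 0 on success, -1 if the allocation failed. */
static int sketch_roundtrip(size_t n)
{
    unsigned char *buf = sketch_pkg_malloc(n);
    int ok;
    if (buf == NULL) return -1;
    memset(buf, 0xAB, n);
    ok = (buf[0] == 0xAB && buf[n - 1] == 0xAB);
    sketch_pkg_free(buf);
    return ok ? 0 : -1;
}
```

Callers only ever use the sketch_pkg_malloc/sketch_pkg_free macros, so the choice of allocator is made once at build time and the calling code stays unchanged.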

     
  • Henning Westerholt

    Hi Antanas,

    any updates on this? Thanks,

    Henning

     
  • Nobody/Anonymous

    Hi henningw,

    Still monitoring the situation. I have not yet been able to recompile with that flag because of production system uptime requirements, but I'm still planning to do that. I'll post results as soon as I am able to determine the outcome.

    Antanas
    NTT

     
  • Nobody/Anonymous

    Hi,

    After compiling with -DQM_JOIN_FREE and running for a month or so, I started getting the same errors as noted in my original submission, so I guess there is not much improvement here. Still, I'll try running the latest 1.4.2 after I test it in my testbed and post results then. Are there any changes in 1.4.2 related to this problem, or should I try disabling PKG_MALLOC before trying it?

     
  • Laurent Glayal

    Laurent Glayal - 2008-12-03

    We are facing the same problem on an old openser-1.2.0 with compile options :
    Makefile.defs :
    DEFS+= $(extra_defs) \
    -DNAME='"$(MAIN_NAME)"' -DVERSION='"$(RELEASE)"' -DARCH='"$(ARCH)"' \
    -DOS='"$(OS)"' -DCOMPILER='"$(CC_VER)"' -D__CPU_$(ARCH) -D__OS_$(OS) \
    -D__SMP_$(ISSMP) -DCFG_DIR='"$(cfg-target)"'\
    -DPKG_MALLOC \
    -DSHM_MEM -DSHM_MMAP \
    -DUSE_IPV6 \
    -DUSE_MCAST \
    -DUSE_TCP \
    -DDISABLE_NAGLE \
    -DHAVE_RESOLV_RES \
    -DSTATISTICS \
    -DF_MALLOC \
    #-DDBG_QM_MALLOC \
    #-DDBG_F_MALLOC \
    #-DNO_DEBUG \
    #-DNO_LOG \
    #-DVQ_MALLOC \
    #-DDBG_LOCK \
    #-DNOSMP \
    #-DEXTRA_DEBUG \
    #-DUSE_SHM_MEM

    And config.h :
    /*used only if PKG_MALLOC is defined*/
    #define PKG_MEM_POOL_SIZE 1024*1024

    /*used if SH_MEM is defined*/
    #define SHM_MEM_SIZE 32

    If it can give a clue ....

     
  • Henning Westerholt

    Hi Antanas,

    OK, unfortunately -DQM_JOIN_FREE does not help in your case. There are no fixes in the 1.4 branch related to this problem yet. So if you can't work around this ul_dump problem (e.g. by using ul_show_contact instead, or by querying the DB directly in a non-cache DB mode), then I'd suggest that you try with PKG_MALLOC disabled.

    If this problem only happens once a month, then another option is to delay it further by increasing the pkg memory pool size.

    Henning
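
A minimal sketch of the pool-size change, assuming the size is still set by PKG_MEM_POOL_SIZE in config.h as in the 1.2.0 excerpt quoted above. The 4 MB value is an arbitrary illustration, not a recommendation from this thread.

```c
/* config.h fragment (sketch) */

/* used only if PKG_MALLOC is defined */
/* 4 MB instead of the 1 MB shown in the excerpt; the value is illustrative.
 * The parentheses keep the macro safe inside larger expressions. */
#define PKG_MEM_POOL_SIZE (4 * 1024 * 1024)
```

The server has to be rebuilt for the new value to take effect. A larger pool only postpones the failure if fragmentation keeps growing.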

     
