#1445 STAFProc memory leaks

Unix::Linux
open
Sharon Lucas
STAFProc (176)
5
2014-08-23
2012-03-09
Nx
No

We are seeing STAF crash with 'out of memory' when the tests are run for several days. Based on our investigation, STAF is leaking memory. The VIRT field in the 'top' command output grows by hundreds of MB. The only way to clear the leaks is to shut down STAF and restart it.

# staf local misc version
Response
--------
3.4.8

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
# uname -a
Linux RH-Linux-53-84 2.6.18-128.el5PAE #1 SMP Wed Dec 17 12:02:33 EST 2008 i686 i686 i386 GNU/Linux

I wrote a small script to reproduce the memory leak and was able to reproduce it some of the time. (I had to run the loop below several times. If it does not reproduce, please try increasing the process count at 'set process 60'.)

[root@RH-Linux-53-84 staf-mem-leak-test]# for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30; do top -n1 -b |grep STAF; tclsh test.tcl; done
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.55 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.62 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.70 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.77 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.85 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.92 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:06.99 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.06 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.13 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.20 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.27 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.36 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.43 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.50 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.58 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.66 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.74 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.81 STAFProc
23090 root 25 0 90408 5180 3632 S 0.0 0.3 0:07.87 STAFProc
23090 root 25 0 94508 5192 3632 S 0.0 0.3 0:07.95 STAFProc
23090 root 25 0 94508 5192 3632 S 0.0 0.3 0:08.03 STAFProc
23090 root 25 0 98608 5204 3632 S 0.0 0.3 0:08.11 STAFProc
23090 root 25 0 100m 5216 3632 S 0.0 0.3 0:08.18 STAFProc
23090 root 25 0 104m 5228 3632 S 0.0 0.3 0:08.27 STAFProc
23090 root 25 0 104m 5228 3632 S 0.0 0.3 0:08.34 STAFProc
...
...
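A note on reading the output above: VIRT is printed in KiB until top switches to an 'm' suffix, so the column mixes units (90408, then 100m). In the samples shown, VIRT grows in steps of 4100 KiB (90408, 94508, 98608). The following is a minimal shell sketch for normalizing the column and reporting growth. It assumes the top layout shown here, with VIRT as the fifth field; the helper names virt_to_kib and virt_growth are hypothetical and are not part of STAF.

```shell
#!/bin/sh
# Hypothetical helpers (not part of STAF) for reading top(1) samples.

# Convert a top VIRT value such as "90408", "100m" or "1.2g" to KiB.
# Assumes unsuffixed values are already KiB, as in the output above.
virt_to_kib() {
    case "$1" in
        *m) awk -v v="${1%m}" 'BEGIN { printf "%d\n", v * 1024 }' ;;
        *g) awk -v v="${1%g}" 'BEGIN { printf "%d\n", v * 1024 * 1024 }' ;;
        *)  printf '%s\n' "$1" ;;
    esac
}

# Read top lines on stdin (VIRT assumed to be field 5) and print the
# first and last VIRT values and the difference, all in KiB.
virt_growth() {
    first=""
    last=""
    while read -r pid user pr ni virt rest; do
        kib=$(virt_to_kib "$virt")
        if [ -z "$first" ]; then
            first=$kib
        fi
        last=$kib
    done
    if [ -z "$first" ]; then
        echo "no samples"
        return 1
    fi
    echo "first=${first} last=${last} growth=$((last - first))"
}
```

To use it, save the lines printed by the loop above to a file (samples.txt here is just an example name) and run: virt_growth < samples.txt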

Content of test.tcl:
package require STAF

# Register with STAF; bail out if registration fails.
if {[STAF::Register "mem-leak-test"] != $STAF::kOk} {
    puts stderr "Error registering with STAF, RC: $STAF::RC"
    exit 1
}

# Number of processes to start per run; increase if the leak does not reproduce.
set process 60

for {set i 0} {$i < $process} {incr i} {
    STAF::Submit local process "start command tclsh parms /tmp/staf-mem-leak-test/testProc.tcl $i"
}

# Give the child processes (10 s each) time to finish.
after 20000

# Free all process handles.
STAF::Submit local process "free all"

STAF::UnRegister

exit 0

#
# Content of /tmp/staf-mem-leak-test/testProc.tcl
#
puts "process: $argv"
after 10000
exit 0

I ran valgrind; the logs are attached.

Discussion

  • Nx
    2012-03-09

    valgrind log

     
    Attachments
  • Nx
    2012-03-13

    Would you get a chance to look into this issue?

     
  • Sharon Lucas
    2012-03-13

    • assigned_to: nobody --> slucas
     
  • Sharon Lucas
    2012-03-13

    I am investigating.

     
  • Nx
    2012-04-09

    Any findings on this issue? We keep seeing these leaks.

     
  • Sharon Lucas
    2012-04-09

    One of the minor memory leaks has already been fixed via Bug #3467922 "Memory leak in unix local connection provider" at https://sourceforge.net/tracker/?func=detail&aid=3467922&group_id=33142&atid=407381 and is contained in STAF V3.4.9 (released March 29, 2012) so you should probably upgrade to STAF V3.4.9 if you haven't already.

    I'm still investigating the other memory leaks.

     
  • Nx
    2012-04-09

    Thanks. I will upgrade to 3.4.9 and let you know if there is any improvement.

     
  • Nx
    2012-04-10

    I upgraded to 3.4.9. I still see virtual memory going up within a few hours.

    # staf local misc version
    Response
    --------
    3.4.9
    # top -n1 -b |egrep "PID|STAF"
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    21596 root 17 0 112m 6040 3724 S 0.0 0.3 0:27.56 STAFProc

     
  • Nx
    2012-06-26

    I filed a new bug, 3538007. I believe it is a side effect of this memory leak issue. Could you kindly look into it and provide a fix?

     
  • Nx
    2012-08-16

    simple script to repro STAF mem leaks

     
    Attachments