
#1465 STAF crash

Unix::Linux
open
nobody
5
2015-01-12
2012-06-26
Nixon
No

The following crash was seen after running STAF-based automation for ~5 days.

STAFProc version 3.4.9 initialized
20120620-20:58:24;153734000;00000100;STAFProcess::processMonitorThread: Error opening /dev/tty, errno: 6
20120624-08:59:59;31050608;00000100;Caught unknown exception in STAFProcessService::sendNotificationCallback()
20120624-09:00:02;23030640;00000100;Caught STAFException in STAFProcessService::sendNotificationCallback(), Exception: STAFInvalidObjectException, Text: STA, Error code: 41
*** glibc detected *** /usr/local/staf/bin/STAFProc: free(): invalid pointer: 0xb74151b0 ***
======= Backtrace: =========
/lib/libc.so.6[0x596c71]
/usr/lib/libstdc++.so.6(_ZdlPv+0x22)[0x3ac4592]
/usr/lib/libstdc++.so.6(_ZdaPv+0x1e)[0x3ac45ee]
/usr/local/staf/lib/libSTAF.so(STAFStringDestruct+0x3b)[0x14faab]
/usr/local/staf/lib/libSTAF.so(_Z30STAFObjectFreeSTAFStringTArrayPP24STAFStringImplementationj+0x36)[0x12c2b6]
/usr/local/staf/lib/libSTAF.so(STAFObjectMarshallToString+0x14a1)[0x12f341]
/usr/local/staf/lib/libSTAF.so(STAFObjectMarshallToString+0x99c)[0x12e83c]
/usr/local/staf/lib/libSTAF.so(_ZN10STAFObject8marshallEj+0x40)[0x137db0]
/usr/local/staf/bin/STAFProc[0x80c6bce]
/usr/local/staf/bin/STAFProc(_ZN18STAFProcessService24sendNotificationCallbackEPv+0x25)[0x80c7d85]
/usr/local/staf/lib/libSTAF.so(_ZN17STAFThreadManager12workerThreadEv+0xaa)[0x14757a]
/usr/local/staf/lib/libSTAF.so(_ZN17STAFThreadManager16callWorkerThreadEPv+0x1d)[0x14784d]
/usr/local/staf/lib/libSTAF.so(+0x16a42)[0x126a42]
/lib/libpthread.so.0[0x6f3a09]
/lib/libc.so.6(clone+0x5e)[0x60543e]
======= Memory map: ========
00110000-00196000 r-xp 00000000 08:03 5642848 /usr/local/staf/lib/libSTAF.so
00196000-00198000 rw-p 00086000 08:03 5642848 /usr/local/staf/lib/libSTAF.so
00198000-001a3000 r-xp 00000000 08:03 5642862 /usr/local/staf/lib/libSTAFLIPC.so
001a3000-001a4000 rw-p 0000a000 08:03 5642862 /usr/local/staf/lib/libSTAFLIPC.so
001a4000-001b9000 r-xp 00000000 08:03 5642850 /usr/local/staf/lib/libSTAFTCP.so
001b9000-001ba000 rw-p 00015000 08:03 5642850 /usr/local/staf/lib/libSTAFTCP.so
001ba000-001c6000 r-xp 00000000 08:03 4457367 /lib/libnss_files-2.12.so
001c6000-001c7000 r--p 0000b000 08:03 4457367 /lib/libnss_files-2.12.so
001c7000-001c8000 rw-p 0000c000 08:03 4457367 /lib/libnss_files-2.12.so
001c8000-001ce000 r-xp 00000000 08:03 5642860 /usr/local/staf/lib/libSTAFDSLS.so
001ce000-001cf000 rw-p 00005000 08:03 5642860 /usr/local/staf/lib/libSTAFDSLS.so
0023d000-00242000 r-xp 00000000 08:03 4457365 /lib/libnss_dns-2.12.so
00242000-00243000 r--p 00004000 08:03 4457365 /lib/libnss_dns-2.12.so
00243000-00244000 rw-p 00005000 08:03 4457365 /lib/libnss_dns-2.12.so
0026f000-0035e000 r-xp 00000000 08:03 5642892 /usr/local/staf/lib/libcrypto.so.0.9.8
0035e000-00371000 rw-p 000ef000 08:03 5642892 /usr/local/staf/lib/libcrypto.so.0.9.8
00371000-00374000 rw-p 00000000 00:00 0
0042c000-0042d000 r-xp 00000000 00:00 0 [vdso]
00502000-00520000 r-xp 00000000 08:03 4460142 /lib/ld-2.12.so
00520000-00521000 r--p 0001d000 08:03 4460142 /lib/ld-2.12.so
00521000-00522000 rw-p 0001e000 08:03 4460142 /lib/ld-2.12.so
00528000-006b1000 r-xp 00000000 08:03 4462117 /lib/libc-2.12.so
006b1000-006b2000 ---p 00189000 08:03 4462117 /lib/libc-2.12.so
006b2000-006b4000 r--p 00189000 08:03 4462117 /lib/libc-2.12.so
006b4000-006b5000 rw-p 0018b000 08:03 4462117 /lib/libc-2.12.so
006b5000-006b8000 rw-p 00000000 00:00 0
006ba000-006e2000 r-xp 00000000 08:03 4463036 /lib/libm-2.12.so
006e2000-006e3000 r--p 00027000 08:03 4463036 /lib/libm-2.12.so
006e3000-006e4000 rw-p 00028000 08:03 4463036 /lib/libm-2.12.so
006e6000-006e9000 r-xp 00000000 08:03 4463030 /lib/libdl-2.12.so
006e9000-006ea000 r--p 00002000 08:03 4463030 /lib/libdl-2.12.so
006ea000-006eb000 rw-p 00003000 08:03 4463030 /lib/libdl-2.12.so
006ed000-00704000 r-xp 00000000 08:03 4463025 /lib/libpthread-2.12.so
00704000-00705000 r--p 00016000 08:03 4463025 /lib/libpthread-2.12.so
00705000-00706000 rw-p 00017000 08:03 4463025 /lib/libpthread-2.12.so
00706000-00708000 rw-p 00000000 00:00 0
00837000-0084c000 r-xp 00000000 08:03 4463032 /lib/libresolv-2.12.so
0084c000-0084d000 ---p 00015000 08:03 4463032 /lib/libresolv-2.12.so
0084d000-0084e000 r--p 00015000 08:03 4463032 /lib/libresolv-2.12.so
0084e000-0084f000 rw-p 00016000 08:03 4463032 /lib/libresolv-2.12.so
0084f000-00851000 rw-p 00000000 00:00 0
008cc000-00901000 r-xp 00000000 08:03 5642893 /usr/local/staf/lib/libssl.so.0.9.8
00901000-00904000 rw-p 00034000 08:03 5642893 /usr/local/staf/lib/libssl.so.0.9.8
00904000-00905000 ---p 00000000 00:00 0
00905000-00d05000 rwxp 00000000 00:00 0
00dd7000-00df4000 r-xp 00000000 08:03 4463045 /lib/libgcc_s-4.4.6-20110824.so.1
00df4000-00df5000 rw-p 0001d000 08:03 4463045 /lib/libgcc_s-4.4.6-20110824.so.1
00df5000-00df6000 ---p 00000000 00:00 0
00df6000-011f6000 rwxp 00000000 00:00 0
011f6000-011f7000 ---p 00000000 00:00 0
011f7000-015f7000 rwxp 00000000 00:00 0
0199c000-0199d000 ---p 00000000 00:00 0
0199d000-01d9d000 rwxp 00000000 00:00 0
01d9d000-01d9e000 ---p 00000000 00:00 0
01d9e000-0219e000 rwxp 00000000 00:00 0
0219e000-0219f000 ---p 00000000 00:00 0
0219f000-0259f000 rwxp 00000000 00:00 0
0259f000-025a0000 ---p 00000000 00:00 0
025a0000-029a0000 rwxp 00000000 00:00 0
029a0000-029a1000 ---p 00000000 00:00 0
029a1000-02da1000 rwxp 00000000 00:00 0
02da1000-02da2000 ---p 00000000 00:00 0
02da2000-031a2000 rwxp 00000000 00:00 0
03304000-03305000 ---p 00000000 00:00 0
03305000-03705000 rwxp 00000000 00:00 0
03a15000-03af6000 r-xp 00000000 08:03 5514958 /usr/lib/libstdc++.so.6.0.13
03af6000-03afa000 r--p 000e0000 08:03 5514958 /usr/lib/libstdc++.so.6.0.13
03afa000-03afc000 rw-p 000e4000 08:03 5514958 /usr/lib/libstdc++.so.6.0.13
03afc000-03b02000 rw-p 00000000 00:00 0
03b32000-03b7b000 r-xp 00000000 08:03 4463040 /lib/libfreebl3.so
03b7b000-03b7c000 r--p 00048000 08:03 4463040 /lib/libfreebl3.so
03b7c000-03b7d000 rw-p 00049000 08:03 4463040 /lib/libfreebl3.so
03b7d000-03b81000 rw-p 00000000 00:00 0
03bbf000-03bc6000 r-xp 00000000 08:03 4463041 /lib/libcrypt-2.12.so
03bc6000-03bc7000 r--p 00007000 08:03 4463041 /lib/libcrypt-2.12.so
03bc7000-03bc8000 rw-p 00008000 08:03 4463041 /lib/libcrypt-2.12.so
03bc8000-03bef000 rw-p 00000000 00:00 0
03ef4000-03ef5000 ---p 00000000 00:00 0
03ef5000-042f5000 rwxp 00000000 00:00 0
042f5000-042f6000 ---p 00000000 00:00 0
042f6000-046f6000 rwxp 00000000 00:00 0
046f6000-046f7000 ---p 00000000 00:00 0
046f7000-04af7000 rwxp 00000000 00:00 0
04af7000-04af8000 ---p 00000000 00:00 0
04af8000-04ef8000 rwxp 00000000 00:00 0
04ef8000-04ef9000 ---p 00000000 00:00 0
04ef9000-052f9000 rwxp 00000000 00:00 0
05471000-05472000 ---p 00000000 00:00 0
05472000-05872000 rwxp 00000000 00:00 0
05ac8000-05ac9000 ---p 00000000 00:00 0
05ac9000-05ec9000 rwxp 00000000 00:00 0
05ec9000-05eca000 ---p 00000000 00:00 0
05eca000-062ca000 rwxp 00000000 00:00 0
062ca000-062cb000 ---p 00000000 00:00 0
062cb000-066cb000 rwxp 00000000 00:00 0
066cb000-066cc000 ---p 00000000 00:00 0
066cc000-06acc000 rwxp 00000000 00:00 0
06d43000-06d44000 ---p 00000000 00:00 0
06d44000-07144000 rwxp 00000000 00:00 0
07144000-07145000 ---p 00000000 00:00 0
07145000-07545000 rwxp 00000000 00:00 0
07681000-07682000 ---p 00000000 00:00 0
07682000-07a82000 rwxp 00000000 00:00 0
07b12000-07b13000 ---p 00000000 00:00 0
07b13000-07f13000 rwxp 00000000 00:00 0
08048000-082d2000 r-xp 00000000 08:03 5642621 /usr/local/staf/bin/STAFProc
082d2000-082d4000 rw-p 0028a000 08:03 5642621 /usr/local/staf/bin/STAFProc
082d4000-082d5000 ---p 00000000 00:00 0
082d5000-086d5000 rwxp 00000000 00:00 0
08780000-08781000 ---p 00000000 00:00 0
08781000-08b81000 rwxp 00000000 00:00 0
08e9c000-08e9d000 ---p 00000000 00:00 0
08e9d000-0929d000 rwxp 00000000 00:00 0
092eb000-092ec000 ---p 00000000 00:00 0
092ec000-096ec000 rwxp 00000000 00:00 0
096ec000-096ed000 ---p 00000000 00:00 0
096ed000-09aed000 rwxp 00000000 00:00 0
09db0000-09e9a000 rw-p 00000000 00:00 0 [heap]
10bf4000-22bde000 rw-p 00000000 00:00 0
22bde000-22bdf000 ---p 00000000 00:00 0
22bdf000-22fdf000 rwxp 00000000 00:00 0
22fdf000-22fe0000 ---p 00000000 00:00 0
22fe0000-233e0000 rwxp 00000000 00:00 0
233e0000-233e1000 ---p 00000000 00:00 0
233e1000-237e1000 rwxp 00000000 00:00 0
237e1000-237e2000 ---p 00000000 00:00 0
237e2000-23be2000 rwxp 00000000 00:00 0
23be2000-23be3000 ---p 00000000 00:00 0
23be3000-23fe3000 rwxp 00000000 00:00 0
23fe3000-23fe4000 ---p 00000000 00:00 0
23fe4000-243e4000 rwxp 00000000 00:00 0
243e4000-243e5000 ---p 00000000 00:00 0
243e5000-247e5000 rwxp 00000000 00:00 0
247e5000-247e6000 ---p 00000000 00:00 0
247e6000-24be6000 rwxp 00000000 00:00 0
3567d000-4567d000 rw-p 00000000 00:00 0
4d67d000-5f66e000 rw-p 00000000 00:00 0
5ff6b000-71f9c000 rw-p 00000000 00:00 0
71f9c000-71f9d000 ---p 00000000 00:00 0
71f9d000-7239d000 rwxp 00000000 00:00 0
7239d000-7239e000 ---p 00000000 00:00 0
7239e000-7279e000 rwxp 00000000 00:00 0
7279e000-7279f000 ---p 00000000 00:00 0
7279f000-72b9f000 rwxp 00000000 00:00 0
72b9f000-72ba0000 ---p 00000000 00:00 0
72ba0000-72fa0000 rwxp 00000000 00:00 0
72fa0000-72fa1000 ---p 00000000 00:00 0
72fa1000-733a1000 rwxp 00000000 00:00 0
739f1000-7ca5d000 rw-p 00000000 00:00 0
7ca5d000-7ca5e000 ---p 00000000 00:00 0
7ca5e000-7ce5e000 rwxp 00000000 00:00 0
7cf1a000-85f60000 rw-p 00000000 00:00 0
85f60000-85f61000 ---p 00000000 00:00 0
85f61000-86361000 rwxp 00000000 00:00 0
863c7000-8f34c000 rw-p 00000000 00:00 0
8f34c000-8f34d000 ---p 00000000 00:00 0
8f34d000-8f74d000 rwxp 00000000 00:00 0
8f74d000-8f74e000 ---p 00000000 00:00 0
8f74e000-8fb4e000 rwxp 00000000 00:00 0
8fb4e000-8fb4f000 ---p 00000000 00:00 0
8fb4f000-8ff4f000 rwxp 00000000 00:00 0
97f4f000-97f50000 ---p 00000000 00:00 0
97f50000-98350000 rwxp 00000000 00:00 0
98be5000-a1c51000 rw-p 00000000 00:00 0
aac16000-aac17000 ---p 00000000 00:00 0
aac17000-ab017000 rwxp 00000000 00:00 0
ab34c000-b42d1000 rw-p 00000000 00:00 0
b42d1000-b42d2000 ---p 00000000 00:00 0
b42d2000-b46d2000 rwxp 00000000 00:00 0
b46d2000-b46d3000 ---p 00000000 00:00 0
b46d3000-b4ad3000 rwxp 00000000 00:00 0
b4ad3000-b4ad4000 ---p 00000000 00:00 0
b4ad4000-b4ed4000 rwxp 00000000 00:00 0
b4ed4000-b4ed5000 ---p 00000000 00:00 0
b4ed5000-b52d5000 rwxp 00000000 00:00 0
b55fa000-b55fb000 ---p 00000000 00:00 0
b55fb000-b59fb000 rwxp 00000000 00:00 0
b59fb000-b59fc000 ---p 00000000 00:00 0
b59fc000-b5dfc000 rwxp 00000000 00:00 0
b5dfc000-b5dfd000 ---p 00000000 00:00 0
b5dfd000-b61fd000 rwxp 00000000 00:00 0
b61fd000-b61fe000 ---p 00000000 00:00 0
b61fe000-b65fe000 rwxp 00000000 00:00 0
b65fe000-b65ff000 ---p 00000000 00:00 0
b65ff000-b69ff000 rwxp 00000000 00:00 0
b69ff000-b6a00000 ---p 00000000 00:00 0
b6a00000-b6e00000 rwxp 00000000 00:00 0
b6e00000-b6f00000 rw-p 00000000 00:00 0
b6f00000-b7000000 rw-p 00000000 00:00 0
b7000000-b7100000 rw-p 00000000 00:00 0
b7100000-b71fe000 rw-p 00000000 00:00 0
b71fe000-b7200000 ---p 00000000 00:00 0
b7200000-b72f7000 rw-p 00000000 00:00 0
b72f7000-b7300000 ---p 00000000 00:00 0
b7300000-b73ff000 rw-p 00000000 00:00 0
b73ff000-b7400000 ---p 00000000 00:00 0
b7400000-b7500000 rw-p 00000000 00:00 0
b75c9000-b77c9000 r--p 00000000 08:03 5525287 /usr/lib/locale/locale-archive
b77c9000-b77cd000 rw-p 00000000 00:00 0
b77db000-b77dc000 rw-p 00000000 00:00 0
b77dc000-b77e3000 r--s 00000000 08:03 5514093 /usr/lib/gconv/gconv-modules.cache
b77e3000-b77e5000 rw-p 00000000 00:00 0
bfd35000-bfd4a000 rwxp 00000000 00:00 0 [stack]
20120624-13:56:29;23030640;00000100;Received signal 6 (SIGABRT)
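
The backtrace frames above follow the usual glibc layout of `module(symbol+offset)[address]`, with the symbol part sometimes empty or missing. A minimal sketch for splitting such lines into fields, so they can be sorted or passed to a demangler such as `c++filt`, is shown below. The names `FRAME_RE` and `parse_frame` are hypothetical helpers for illustration and are not part of STAF.

```python
import re

# Matches glibc backtrace lines such as
#   /usr/local/staf/lib/libSTAF.so(STAFStringDestruct+0x3b)[0x14faab]
#   /usr/local/staf/lib/libSTAF.so(+0x16a42)[0x126a42]
#   /lib/libc.so.6[0x596c71]
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if the line is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "module": m.group("module"),
        # An empty or absent symbol (stripped binary) is reported as None.
        "symbol": m.group("symbol") or None,
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "address": int(m.group("address"), 16),
    }


frame = parse_frame(
    "/usr/local/staf/lib/libSTAF.so(STAFStringDestruct+0x3b)[0x14faab]"
)
print(frame["symbol"], hex(frame["offset"]))
```

Fed through `c++filt`, the mangled names `_ZdlPv` and `_ZdaPv` in the trace demangle to `operator delete(void*)` and `operator delete[](void*)`, which suggests the invalid pointer was passed to `delete[]` from within `STAFStringDestruct`.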

Discussion

  • Nixon

    Nixon - 2012-07-04

    I installed 3.4.10. After running a few scripts, the STAFProc memory usage went up from 800K to 100M. Could you please look into this issue?

    # top -n1 -b |egrep "STAF|USER"
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    19269 root 17 0 100m 5424 3752 S 2.0 0.3 0:03.65 STAFProc

     
  • Nixon

    Nixon - 2012-07-12

    Hello Sharon, could you please look at this issue?

     
  • Sharon Lucas

    Sharon Lucas - 2012-07-16

    Are you using the RETURNSTDOUT/RETURNSTDERR options when running a process via STAF where the process's stdout/stderr data is large (or submitting GET FILE requests via the FS service where the file size is large)? If so, this could be causing the problem. The notification that a process completed includes the process's stdout/stderr data as a single string, so if this data is large, STAFProc can run out of memory while trying to send the notification(s).

    Section "8.13 Process Service" in the STAF User's Guide at http://staf.sourceforge.net/current/STAFUG.htm#HDRPROCSRV contains a note that says:

    "3. Since the entire contents of returned files are stored in the result string, if you attempt to return the contents of a very large file, you may run out of memory so it is not recommended that you use the RETURNSTDOUT, RETURNSTDERR, or RETURNFILE options to return large files. To help prevent this problem, you can specify a maximum size for a file returned by this request by setting the MAXRETURNFILESIZE operational parameter in the STAF configuration file on the machine where the process is run, or by setting the STAF/MaxReturnFileSize variable in the request variable pool of the handle that submitted the request. The lowest of these two values is used as the maximum return file size (not including 0 which indicates no limit). "

    You should look at your code to see if you are returning files from a process that can be very large or submitting FS GET FILE requests for files that can be very large. You can set the MAXRETURNFILESIZE operational setting (either in the STAF.cfg file or dynamically by submitting a PROCESS SET MAXRETURNFILESIZE request) on the machines where you are running processes to limit the maximum size returned to help prevent STAFProc from running out of memory.
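
The idea of bounding how much output is pulled into memory can also be applied on the caller's side, independently of STAF. The sketch below assumes the process output has been redirected to a file and reads back only the last `max_bytes` of it. `read_capped` is a hypothetical helper, not a STAF API, and it truncates the data, which is not necessarily how STAF itself behaves when MAXRETURNFILESIZE is exceeded. A value of 0 is treated as "no limit", mirroring the note quoted above.

```python
import os
import tempfile


def read_capped(path, max_bytes):
    """Return (data, truncated), holding at most max_bytes from the end of the file.

    max_bytes == 0 means no limit.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        if max_bytes and size > max_bytes:
            # Skip the beginning so only the tail is ever loaded.
            fh.seek(size - max_bytes)
            return fh.read(max_bytes), True
        return fh.read(), False


# Usage: write 1000 bytes, then read back only the last 16.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 984 + b"END-OF-TEST-LOG\n")
data, truncated = read_capped(tmp.name, 16)
os.remove(tmp.name)
print(data, truncated)
```

Keeping the tail rather than the head is a design choice: for test logs, the final lines usually carry the pass/fail summary.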

     
  • Nixon

    Nixon - 2012-07-16

    Thanks, Sharon, for the reply. I do use the RETURNSTDOUT and RETURNSTDERR options. However, I suspect that is not causing the issue. Based on your explanation, I would expect the memory to be freed once the tests have completed.

    What I am seeing is this: I reboot the server and note the virtual memory usage, which is around 800K. Then I start the test, and the virtual memory goes up to more than 100M. When the test ends, the virtual memory is still not freed. If I keep running the test without rebooting the server, the virtual memory keeps increasing and STAFProc crashes within a couple of days.

    I would kindly ask you to look at bug 3500584, where I have a very simple program that reproduces this issue. We are really bogged down by this issue; otherwise STAF works really well. Your help in resolving it is highly appreciated.
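
To compare memory usage before and after a test run, as described in this thread, the `top` line posted on 2012-07-04 can be parsed into numbers. The sketch below assumes the column layout of that output (`PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND`) and that unsuffixed values are in KiB, which is the procps default. `to_kib` and `parse_top_line` are hypothetical helper names.

```python
def to_kib(field):
    """Convert a top memory field such as '5424', '100m' or '1.2g' to KiB."""
    units = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}
    suffix = field[-1].lower()
    if suffix in units:
        return int(float(field[:-1]) * units[suffix])
    return int(field)


def parse_top_line(line):
    """Extract PID, VIRT, RES and COMMAND from one line of `top -b -n1` output."""
    cols = line.split()
    return {
        "pid": int(cols[0]),
        "virt_kib": to_kib(cols[4]),
        "res_kib": to_kib(cols[5]),
        "command": cols[11],
    }


sample = "19269 root 17 0 100m 5424 3752 S 2.0 0.3 0:03.65 STAFProc"
info = parse_top_line(sample)
print(info["virt_kib"], info["res_kib"])
```

In the posted sample, VIRT is 100m while RES is only 5424 KiB. VIRT counts reserved address space, possibly including the 4 MiB thread-stack mappings visible in the memory map of the crash report, so tracking RES alongside VIRT across runs may give a clearer picture of whether memory is actually being consumed.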

     
