#53 GT.M process XXXX has been killed by a signal 11 at address

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-12-29
Created: 2004-04-28
Creator: ivan
Private: No

hi folks,

i need some help, because M is very new to me.
i used the gt.m sckserv2.m sample server and wrote a little perl script to
test it with a lot of connections (i'm interested in whether gt.m can
handle hundreds of connections).

after starting the server and the script i always got the following error
message at nearly equal intervals:
%GTM-F-KILLBYSIGSINFO1, GT.M process XXXX has been killed by a signal 11 at address 0x4026DDB5 (vaddr 0x6ADD29A4)

i attached strace outputs and core dump.

i tested it on two machines:
first:
debian unstable
linux 2.6.6rc2 smp
libc6 2.3.2-ds1-11
libncurses5 5.4-3
gt.m V4.04-004
strace output: error.log
second:
debian unstable
linux 2.6.2rc2 smp
libc6 2.3.2-ds1-10
libncurses5 5.3.20030510-2
gt.m V4.04-003
end of strace output (full is about 160mb):
error2_end.log error3_end.log

before running the script i initialized the ^PROBA(1) global

the perl script:
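#!/usr/bin/perl

use strict ;
use warnings ;
use IO::Socket ;

my @hsck ;      # one socket per connection (indexed as an array, so not a hash)
my ($rdata, $max_con, $max_cycle) ;

$max_con = 100 ;
$max_cycle = 1000 ;

# open all connections to the server first
for (my $i = 0 ; $i < $max_con ; $i++)
{
    $hsck[$i] = IO::Socket::INET -> new('127.0.0.1:10000') ;
}

# then send $max_cycle increments over each connection in turn
for (my $i = 0 ; $i < $max_con ; $i++)
{
    for (my $j = 0 ; $j < $max_cycle ; $j++) {
        $hsck[$i] -> send("set ^PROBA(1)=^PROBA(1)+1\r\n") ;
        $hsck[$i] -> recv($rdata, 1024) ;
        # compare strings with ne; != would compare both sides as numbers
        if (substr($rdata, 0, 2) ne "ok") {
            print "S: \$sck: $i, \$cycle: $j, DATA: |$rdata|\n" ;
        }
    }
}

for (my $i = 0 ; $i < $max_con ; $i++)
{
    shutdown($hsck[$i], 2) ;
}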

here is the gdb output (i don't know how usable it is):

GNU gdb 6.1-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

Core was generated by `mumps -run anyad'.
Program terminated with signal 3, Quit.

warning: current_sos: Can't read pathname for load map: Input/output error

Reading symbols from /lib/libncurses.so.5...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libdl.so.2...done.
Loaded symbols for /lib/tls/libdl.so.2
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/local/gtm/libgtmshr.so...done.
Loaded symbols for /usr/local/gtm/libgtmshr.so
Reading symbols from /lib/tls/libnss_compat.so.2...done.
Loaded symbols for /lib/tls/libnss_compat.so.2
Reading symbols from /lib/tls/libnsl.so.1...done.
Loaded symbols for /lib/tls/libnsl.so.1
Reading symbols from /lib/tls/libnss_nis.so.2...done.
Loaded symbols for /lib/tls/libnss_nis.so.2
Reading symbols from /lib/tls/libnss_files.so.2...done.
Loaded symbols for /lib/tls/libnss_files.so.2
#0 0x400ae061 in kill () from /lib/tls/libc.so.6
(gdb) where
#0 0x400ae061 in kill () from /lib/tls/libc.so.6
#1 0x40223910 in gtm_dump_core () from /usr/local/gtm/libgtmshr.so
#2 0x40223ce5 in gtm_fork_n_core () from /usr/local/gtm/libgtmshr.so
#3 0x402205a5 in generic_signal_handler () from /usr/local/gtm/libgtmshr.so
#4 0xffffe440 in __kernel_sigreturn ()
#5 0x4026ddb5 in t_begin () from /usr/local/gtm/libgtmshr.so
#6 0x4022b17b in gvcst_root_search () from /usr/local/gtm/libgtmshr.so
#7 0x402276f1 in gv_bind_name () from /usr/local/gtm/libgtmshr.so
#8 0x401e98fb in op_gvname () from /usr/local/gtm/libgtmshr.so
#9 0x08114897 in ?? ()
#10 0x00000002 in ?? ()
#11 0x0811485c in ?? ()
#12 0x08114834 in ?? ()
#13 0x402fd65c in __DTOR_END__ () from /usr/local/gtm/libgtmshr.so
#14 0x401bb210 in timezone () from /lib/tls/libc.so.6
#15 0xbffffb30 in ?? ()
#16 0xbfffeac8 in ?? ()
#17 0x401d5833 in gtm_main () from /usr/local/gtm/libgtmshr.so
Previous frame identical to this frame (corrupt stack?)

thanks in advance,
ivan

p.s.: thanks to gt for giving us this powerful tool

Discussion

  • ivan
    2004-04-28

    logs and core dump
  • Logged In: YES
    user_id=97924

    Ivan, thank you for your interest in GT.M. Could you please
    test your program with fewer connections (fewer than 64)
    and see if it passes? We would like to find out whether
    the issue is with the number of connections.

    Thanks,
    Vinaya
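
To narrow down the threshold asked about here, a probe along these lines could be run against the server with different connection counts. It is only a minimal Python sketch, not the reporter's Perl script: the function name `count_bad_replies` is hypothetical, and the command string and the expectation that a good reply starts with "ok" are taken from the Perl script in the report.

```python
import socket


def count_bad_replies(host, port, connections,
                      command=b"set ^PROBA(1)=^PROBA(1)+1\r\n"):
    """Open `connections` sockets, send one command on each, and return
    how many replies do not start with b"ok"."""
    socks = []
    bad = 0
    try:
        # Open every connection before sending anything, as the report does.
        for _ in range(connections):
            socks.append(socket.create_connection((host, port), timeout=5))
        for s in socks:
            s.sendall(command)
            reply = s.recv(1024)
            if not reply.startswith(b"ok"):
                bad += 1
    finally:
        for s in socks:
            s.close()
    return bad


# Example (assumes the sample server listens on 127.0.0.1:10000, as in the report):
# for n in (32, 63, 64, 65):
#     print(n, count_bad_replies("127.0.0.1", 10000, n))
```

A connection error or timeout raises an exception instead of being counted, so a crashed server shows up as a failure of the whole run.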

     
  • ivan
    2004-04-28

    64 connection script output

     
  • Logged In: YES
    user_id=97924

    Ivan, GT.M currently does not support 64 or more socket
    connections. Unfortunately, this limit is not enforced
    correctly within GT.M, which is why you see the bad data
    reply at 64 connections and abnormal termination with more
    than 64 connections. We have an action item to improve GT.M
    behavior by extending the limit (possibly to the maximum
    allowed by the system) and by improving error handling when
    the limit is crossed. We will keep you posted when this issue
    is addressed in a future release. Meanwhile, the workaround is
    to use fewer than 64 connections, or perhaps to run multiple
    servers that accept fewer than 64 connections each. We
    apologize for any inconvenience caused.

    Thanks,
    Vinaya
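
The multiple-server workaround comes down to simple arithmetic, sketched below in Python. The names `servers_needed` and `assign_ports` and the example ports are hypothetical. The cap of 63 per server is an assumption read from "fewer than 64" above.

```python
import math

MAX_PER_SERVER = 63  # assumption: "fewer than 64" connections per server


def servers_needed(total, per_server=MAX_PER_SERVER):
    """Smallest number of servers that keeps each one at or below per_server."""
    return math.ceil(total / per_server)


def assign_ports(total, ports, per_server=MAX_PER_SERVER):
    """Spread `total` connections round-robin over `ports`.

    Returns a list with one port per connection. Raises ValueError when
    the given servers cannot hold that many connections under the cap.
    """
    if total > len(ports) * per_server:
        raise ValueError("not enough servers for %d connections" % total)
    return [ports[i % len(ports)] for i in range(total)]


# Example: the 100 connections from the report need two servers.
# plan = assign_ports(100, [10000, 10001])   # 50 connections per port
```

Round-robin keeps the per-server counts within one of each other, so no server exceeds the cap as long as the total fits.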

     
  • ivan
    2004-04-29

    hi folks,

    i found a solution for the problem.

    i set MAX_N_SOCKET to a higher number (from 64 to 2048) in
    sr_port/iosocketdef.h in the source and compiled a new binary.

    i tested the new binary with 1000 connections and it worked
    perfectly.

    ivan

     
  • Logged In: YES
    user_id=97924

    Ivan, we are glad that you found a solution that works for
    you. Please bear in mind that you do not want to reach or
    cross the 2048 connection limit (nor the system wide file
    descriptor limit).

    Thanks,
    Vinaya
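
The file descriptor side of that warning can be checked before a test run. The Python sketch below is Unix only and reads the per-process soft limit on open files (`RLIMIT_NOFILE`), which is an assumption about which limit is hit first. A system-wide limit may also apply and is not checked here. The function names and the reserve of 16 descriptors for other files are hypothetical.

```python
import resource  # Unix only


def soft_fd_limit():
    """Return the per-process soft limit on open files, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return None if soft == resource.RLIM_INFINITY else soft


def fits(planned_sockets, limit, reserve=16):
    """True if the planned sockets plus a reserve stay strictly below `limit`.

    `reserve` covers descriptors used for other purposes (stdin, stdout,
    database files); 16 is an illustrative guess, not a GT.M figure.
    """
    if limit is None:
        return True
    return planned_sockets + reserve < limit


# Example: check 1000 connections against both the rebuilt MAX_N_SOCKET
# value (2048) and the current descriptor limit.
# ok = fits(1000, 2048, reserve=0) and fits(1000, soft_fd_limit())
```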