From: Guilhem B. <gu...@my...> - 2004-07-26 22:32:45
Hello,
I am using Linux 2.4.22 with LinuxThreads.
The original problem is that when run in Valgrind 2.0.0 (and older),
the MySQL daemon (mysqld) reacts to the TERM signal (that is, it does
what it is supposed to do: exit gracefully), but when run in Valgrind
2.1.2 (and, I believe, 2.1.0), mysqld does not react to TERM at all.
Signal catching in mysqld is left to one thread, which does a
sigwait() until it gets a TERM signal.
Note that mysqld, which is a multi-threaded application, shows up as
a single process under 2.0.0 and as several processes under 2.1.2:
In 2.0.0:
[guilhem@gbichot2 guilhem]$ ps -elf | grep mysqld
0 S guilhem 7444 2687 19 71 0 - 15621 nanosl 00:14 pts/3 00:00:03 /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
vs in 2.1.2:
[guilhem@gbichot2 guilhem]$ ps -elf | grep mysqld
0 S guilhem 800 2687 3 69 0 - 394306 poll 23:25 pts/3 00:00:06 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
1 S guilhem 801 800 0 69 0 - 394306 pipe_w 23:25 pts/3 00:00:00 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
1 S guilhem 802 800 0 69 0 - 394306 rt_sig 23:25 pts/3 00:00:00 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
1 S guilhem 803 800 0 69 0 - 394306 nanosl 23:25 pts/3 00:00:00 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
1 S guilhem 804 800 0 69 0 - 394306 pipe_w 23:25 pts/3 00:00:00 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
1 S guilhem 1074 800 0 69 0 - 394306 pipe_w 23:26 pts/3 00:00:00 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
1 S guilhem 1075 800 0 69 0 - 394306 pipe_w 23:26 pts/3 00:00:00 valgrind --tool=memcheck /home/mysql_src/mysql-4.0/sql/mysqld --defaults-file=/home/mysql_src/my_master.cnf --user=guilhem --datadir=/m/data/4/1 --server-id=1 --log-bin --language=/home/mysql_src/mysql-4.0/sql/share/english/ --skip-grant-tables --skip-innodb --skip-bdb --debug
Single-threaded applications run fine in Valgrind 2.1.2 as far as signal
catching is concerned.
I have written a test program which demonstrates something is strange
(either in Valgrind or in my test program):
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Signal-catching thread: loops on sigwait() for signals 1..31. */
void *sigcatch(void *arg)
{
  sigset_t set;
  int sig, i;
  printf("SIGCATCH started\n");
  if (pthread_detach(pthread_self()))
    printf("SIGCATCH could not detach\n");
  sigemptyset(&set);
  for (i= 1; i<32; i++) /* signal numbers start at 1 */
    sigaddset(&set, i);
  while(1)
  {
    printf("SIGCATCH sigwait\n");
    sigwait(&set, &sig); // rt_sig in 'ps'
    printf("SIGCATCH saw signal %d\n", sig);
  }
}

/* Main thread: starts the catcher thread, then just sleeps. */
int main(void)
{
  pthread_t sighandler;
  if (pthread_create(&sighandler, NULL, sigcatch, NULL))
    printf("MAIN could not create thread sigcatch\n");
  printf("MAIN thread sigcatch created\n");
  sleep(1000); // nanosl in 'ps'
  return 0;
}
When run in 2.0.0,
gcc -lpthread -g a.c ;valgrind ./a.out
I see:
[guilhem@gbichot2 guilhem]$ ps -elf | grep a.out
0 S guilhem 7524 732 1 75 0 - 5193 nanosl 00:16 pts/4 00:00:00 ./a.out
and if I send it 3 TERM signals:
[guilhem@gbichot2 tmp]$ gcc -lpthread -g a.c ;valgrind ./a.out
==7524== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==7524== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==7524== Using valgrind-2.0.0, a program supervision framework for x86-linux.
==7524== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==7524== Estimated CPU clock rate is 1662 MHz
==7524== For more details, rerun with: -v
==7524==
MAIN thread sigcatch created
SIGCATCH started
SIGCATCH sigwait
SIGCATCH saw signal 15
SIGCATCH sigwait
SIGCATCH saw signal 15
SIGCATCH sigwait
SIGCATCH saw signal 15
SIGCATCH sigwait
When run in 2.1.0 (--tool=memcheck), I see
[guilhem@gbichot2 guilhem]$ ps -elf | grep a.out
0 R guilhem 7985 732 5 78 0 - 6205 - 00:25 pts/4 00:00:00 ./a.out
1 S guilhem 7991 7985 0 69 0 - 6205 nanosl 00:25 pts/4 00:00:00 ./a.out
1 S guilhem 7992 7985 0 74 0 - 6205 rt_sig 00:25 pts/4 00:00:00 ./a.out
and when I do a kill -TERM on the rt_sig process,
the program just terminates:
[guilhem@gbichot2 tmp]$ gcc -lpthread -g a.c ;valgrind ./a.out
==7985== Memcheck, a memory error detector for x86-linux.
==7985== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==7985== Using valgrind-2.1.0, a program supervision framework for x86-linux.
==7985== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==7985== Estimated CPU clock rate is 1664 MHz
==7985== For more details, rerun with: -v
==7985==
MAIN thread sigcatch created
SIGCATCH started
SIGCATCH sigwait
==7985==
==7985== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==7985== malloc/free: in use at exit: 200 bytes in 1 blocks.
==7985== malloc/free: 2 allocs, 1 frees, 212 bytes allocated.
==7985== For a detailed leak analysis, rerun with: --leak-check=yes
==7985== For counts of detected errors, rerun with: -v
Terminated
It looks like the signal is not delivered to the right thread.
What are the rules that determine which thread gets the signal in
Valgrind?
Thanks for any help you could provide. Maybe I am doing something
wrong. But the MySQL code hasn't been changed for years and it used to
work in Valgrind <= 2.0.0.
Thank you again for providing Valgrind to us!!
--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Mr. Guilhem Bichot <gu...@my...>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Full-Time Software Developer
/_/ /_/\_, /___/\___\_\___/ Bordeaux, France
<___/ www.mysql.com