Re: [Refdb-users] Recovering after errors
From: Daniel O'D. <dan...@ul...> - 2006-07-11 23:30:08
Here's the l=7 output of my last session. I tried to add a single entry from a text file, ~/one.ris, using the following in refdbc; things just hung on me and I had to abort using ^C:

selectdb refdbib
addref -f one.ris

(I started refdbc in ~/)

refdbd.log:

> 7:pid=6427:Tue Jul 11 23:21:05 2006:dbi_driver_dir went to:
> 7:pid=6427:Tue Jul 11 23:21:05 2006:
> 7:pid=6427:Tue Jul 11 23:21:05 2006:dbi is up using default driver dir
> 6:pid=6427:Tue Jul 11 23:21:05 2006:Available libdbi database drivers:
> 6:pid=6427:Tue Jul 11 23:21:05 2006:mysql
> 6:pid=6427:Tue Jul 11 23:21:05 2006:Requested libdbi driver found
> 6:pid=6427:Tue Jul 11 23:21:05 2006:Database directory:
> 6:pid=6427:Tue Jul 11 23:21:05 2006:/usr/var/lib/refdb/db
> 6:pid=6427:Tue Jul 11 23:21:05 2006:application server started
> 6:pid=6427:Tue Jul 11 23:21:05 2006:share extended notes by default
> 7:pid=6427:Tue Jul 11 23:21:05 2006:use /tmp/refdbd_fifo6427 as fifo
> 6:pid=6427:Tue Jul 11 23:21:05 2006:server waiting n_max_fd=5
> 6:pid=6427:Tue Jul 11 23:21:35 2006:adding client 127.0.0.1 on fd 6
> 6:pid=6427:Tue Jul 11 23:21:35 2006:server waiting n_max_fd=6
> 7:pid=6429:Tue Jul 11 23:21:35 2006:try to read from client
> 6:pid=6429:Tue Jul 11 23:21:35 2006:serving client on fd 6 with protocol version 4
> 7:pid=6429:Tue Jul 11 23:21:35 2006:210-21-04-49
> 7:pid=6429:Tue Jul 11 23:21:35 2006:send pseudo-random string to client
> 7:pid=6429:Tue Jul 11 23:21:35 2006:selectdb refdbib -u dan -w 072035094057068069114113082
> 6:pid=6429:Tue Jul 11 23:21:35 2006:dbi is up
> 7:pid=6429:Tue Jul 11 23:21:35 2006:localhost
> 7:pid=6429:Tue Jul 11 23:21:35 2006:dan
> 7:pid=6429:Tue Jul 11 23:21:35 2006:SecretPassWord
> 7:pid=6429:Tue Jul 11 23:21:35 2006:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:3306
> 7:pid=6429:Tue Jul 11 23:21:35 2006:mysql
> 7:pid=6429:Tue Jul 11 23:21:35 2006:/usr/var/lib/refdb/db
> 7:pid=6429:Tue Jul 11 23:21:35 2006:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:refdb
> 7:pid=6429:Tue Jul 11 23:21:35 2006:connected to database server using database:7:pid=6429:Tue Jul 11 23:21:35 2006:refdb
> 3:pid=6429:Tue Jul 11 23:21:35 2006:could not open version file:
> 3:pid=6429:Tue Jul 11 23:21:35 2006:/usr/local/var/lib/refdb/db/DB_VERSION
> 7:pid=6429:Tue Jul 11 23:21:35 2006:Main database looks ok:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:refdb
> 7:pid=6429:Tue Jul 11 23:21:35 2006:localhost
> 7:pid=6429:Tue Jul 11 23:21:35 2006:dan
> 7:pid=6429:Tue Jul 11 23:21:35 2006:SecretPassWord
> 7:pid=6429:Tue Jul 11 23:21:35 2006:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:3306
> 7:pid=6429:Tue Jul 11 23:21:35 2006:mysql
> 7:pid=6429:Tue Jul 11 23:21:35 2006:/usr/var/lib/refdb/db
> 7:pid=6429:Tue Jul 11 23:21:35 2006:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:refdb
> 7:pid=6429:Tue Jul 11 23:21:35 2006:connected to database server using database:7:pid=6429:Tue Jul 11 23:21:35 2006:refdb
> 7:pid=6429:Tue Jul 11 23:21:35 2006:localhost
> 7:pid=6429:Tue Jul 11 23:21:35 2006:dan
> 7:pid=6429:Tue Jul 11 23:21:35 2006:SecretPassWord
> 7:pid=6429:Tue Jul 11 23:21:35 2006:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:3306
> 7:pid=6429:Tue Jul 11 23:21:35 2006:mysql
> 7:pid=6429:Tue Jul 11 23:21:35 2006:/usr/var/lib/refdb/db
> 7:pid=6429:Tue Jul 11 23:21:35 2006:
> 7:pid=6429:Tue Jul 11 23:21:35 2006:refdbib
> 7:pid=6429:Tue Jul 11 23:21:35 2006:connected to database server using database:7:pid=6429:Tue Jul 11 23:21:35 2006:refdbib
> 7:pid=6429:Tue Jul 11 23:21:35 2006:SELECT meta_app,meta_type,meta_dbversion from t_meta
> 7:pid=6429:Tue Jul 11 23:21:35 2006:command processing done, finish dialog now
> 6:pid=6429:Tue Jul 11 23:21:35 2006:child finished client on fd 6
> 6:pid=6427:Tue Jul 11 23:21:35 2006:parent removing client on fd 6
> 6:pid=6427:Tue Jul 11 23:21:35 2006:server waiting n_max_fd=5
> 6:pid=6427:Tue Jul 11 23:21:35 2006:child exited with code 0
> 6:pid=6427:Tue Jul 11 23:21:35 2006:server waiting n_max_fd=5
> 6:pid=6427:Tue Jul 11 23:21:46 2006:adding client 127.0.0.1 on fd 6
> 6:pid=6427:Tue Jul 11 23:21:46 2006:server waiting n_max_fd=6

On Tue, 2006-11-07 at 23:56 +0200, Markus Hoenicka wrote:
> Hi Dan,
>
> Dan O'Donnell writes:
> > I've been having trouble lately with refdb crashing after errors and
> > then being extremely difficult to restart. A typical cause for something
> > like this might be an incorrect command-line option (I once used -t for
> > the file name instead of -f and was forced to use control c to abort).
> > After an error like this, refdb seems to lose contact with its databases.
> > MySQL, however, is still working.
> >
>
> What command were you trying to run? Do you remember the exact command
> line? I'd like to replay what was going wrong here. It is likely that
> refdbd lacks a few sanity checks for variable values.
>
> > I've tried several things to get it running again from using refdbctl to
> > stop and start, to restarting mysql and apache2, to removing my
> > configuration files and reinstalling refdb with the installation script
> > (often when I come to do this there is one or more refdbd sessions not
> > properly killed off and immune to refdbctl [I kill them manually])
> >
>
> refdbctl kills only the process that registered its PID in the
> appropriate file. If you bypass refdbctl and start refdbd manually,
> you may end up running two processes, one of which can't be killed
> with refdbctl. Also, if something goes grossly wrong, you may have the
> parent and the child around at the same time. If the child hangs
> (which it should never do, of course) you can only kill it manually
> from the process list.
>
> > I usually get set and viewstat to work in refdba, and can usually
> > selectdb and use whichdb in refdbc. But things hang up the moment I try
> > to add any references.
> >
> > Any ideas what might be making it so unstable? Here's the log of my last
> > session. I imagine the problem is the lost version file
> > at /usr/local/var/lib/refdb/db/DB_VERSION. In refdbdrc this path is
> > given as /usr/var/lib/refdb/db/ so I'm not sure what is telling it to
> > look in this (non-existent) directory unless something in the setup has
> > missed my original prefix parameter:
> >
>
> Is that the log of a failed addref command? You should re-run this
> test with the log level set to 7. The "error" message does not seem to
> have much to do with the DB_VERSION stuff. The latter is only a means
> to give a packaging tool a hint about the database version without
> having to look at the database itself (which might require username
> and password info). If refdbd can't update this file it will continue
> without a hitch. The file or the write attempt has no meaning for the
> running process.
>
> >
> > > 6:pid=5412:Tue Jul 11 20:13:15 2006:adding client 127.0.0.1 on fd 6
> > > 6:pid=5412:Tue Jul 11 20:13:15 2006:server waiting n_max_fd=6
> > > 6:pid=5456:Tue Jul 11 20:13:15 2006:serving client on fd 6 with protocol version 4
> > > 6:pid=5456:Tue Jul 11 20:13:15 2006:dbi is up
> > > 3:pid=5456:Tue Jul 11 20:13:15 2006:could not open version file:
> > > 3:pid=5456:Tue Jul 11 20:13:15 2006:/usr/local/var/lib/refdb/db/DB_VERSION
> > > 4:pid=5456:Tue Jul 11 20:19:29 2006:error
> > > 6:pid=5456:Tue Jul 11 20:19:29 2006:child finished client on fd 6
> >
>
> My only advice (until I get more thorough debug info) is to make sure
> that you kill all hanging child processes if things go wrong. On many
> OSes the process IDs count up, so the child is usually the process
> with the higher ID. Other OSes pick random numbers, so you'll have to
> kill them all.
>
> regards,
> Markus
>
-- 
Daniel Paul O'Donnell, PhD
Associate Professor and Chair
Director, Digital Medievalist Project <http://www.digitalmedievalist.org/>
Department of English
University of Lethbridge
Lethbridge AB T1K 3M4
Tel. +1 (403) 329-2378
Fax. +1 (403) 382-7191
:@wiglaf (dapper ubuntu)
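
For anyone reading refdbd logs like the one in this message: the parent's "adding client" and "parent removing client" entries can be paired up to spot a connection that never completed, as with the addref attempt at 23:21:46. The Python below is only a rough sketch. The entry layout (level:pid=N:timestamp:message) is inferred from the excerpts in this thread rather than from refdb documentation, and parse_log and outstanding_clients are hypothetical helper names, not part of refdb.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the excerpts in this thread:
#   <level>:pid=<pid>:<weekday month day hh:mm:ss year>:<message>
ENTRY = re.compile(r"^(?P<level>\d):pid=(?P<pid>\d+):(?P<stamp>.+? \d{4}):(?P<msg>.*)$")


def parse_log(lines):
    """Group (level, message) pairs by PID, ignoring leading '> ' quote markers."""
    by_pid = defaultdict(list)
    for raw in lines:
        line = re.sub(r"^(?:>\s*)+", "", raw.strip())
        m = ENTRY.match(line)
        if m:
            by_pid[int(m["pid"])].append((int(m["level"]), m["msg"]))
    return by_pid


def outstanding_clients(by_pid):
    """For each PID, count clients that were added but never removed."""
    result = {}
    for pid, entries in by_pid.items():
        added = sum(msg.startswith("adding client") for _, msg in entries)
        removed = sum(msg.startswith("parent removing client") for _, msg in entries)
        if added > removed:
            result[pid] = added - removed
    return result


if __name__ == "__main__":
    # Four entries copied from the log in this message.
    sample = [
        "> 6:pid=6427:Tue Jul 11 23:21:35 2006:adding client 127.0.0.1 on fd 6",
        "> 6:pid=6429:Tue Jul 11 23:21:35 2006:child finished client on fd 6",
        "> 6:pid=6427:Tue Jul 11 23:21:35 2006:parent removing client on fd 6",
        "> 6:pid=6427:Tue Jul 11 23:21:46 2006:adding client 127.0.0.1 on fd 6",
    ]
    print(outstanding_clients(parse_log(sample)))  # {6427: 1}
```

A nonzero count only says the parent accepted a client it has not yet released; it does not identify the child's PID, so the process list still has to be checked by hand as Markus describes.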