From: Ricardo J. B. <ric...@da...> - 2012-02-15 17:43:56
Hi,

Today one of our mfsmasters crashed and had to be rebooted.

While the master server was hung, the metalogger tried to reconnect, but it segfaulted:

Feb 15 13:46:15 bkpmds02 mfsmetalogger[1992]: connecting ...
Feb 15 13:46:39 bkpmds02 mfsmetalogger[1992]: connection failed, error: EHOSTUNREACH (No route to host)
Feb 15 13:46:40 bkpmds02 mfsmetalogger[1992]: connecting ...
Feb 15 13:46:41 bkpmds02 kernel: mfsmetalogger[1992]: segfault at 0000000000000060 rip 0000003536c612ed rsp 00007ffff2247a10 error 4

The master logs from that time show that it was hung (nothing was logged between 13:45:00 and the restart at 13:49:36):

Feb 15 13:45:00 bkpmds01 mfsmaster[2437]: total: usedspace: 30279830740992 (28200.29 GiB), totalspace: 32242667642880 (30028.32 GiB), usage: 93.91%
Feb 15 13:49:36 bkpmds01 syslogd 1.4.1: restart.
Feb 15 13:49:36 bkpmds01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Feb 15 13:49:36 bkpmds01 kernel: Linux version 2.6.18-274.17.1.el5 (moc...@bu...) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-51)) #1 SMP Tue Jan 10 17:25:58 EST 2012

Both servers have their time synchronized via NTP. Both run 64-bit CentOS 5.7 with mfs 1.6.20 installed from RepoForge.org.

I'm still investigating the cause of the master hang, as nothing gets logged (I suspect RAM or CPU problems), but I'm reporting the segfault in case it can be debugged.

Regards,
-- 
Ricardo J. Barberis
Senior SysAdmin / ITI
Dattatec.com :: Soluciones de Web Hosting
Tu Hosting hecho Simple!
------------------------------------------