From: <tho...@ao...> - 2007-06-26 17:13:07
Hi all,

I had a perfectly working RAID 5 set with evms 2.5.5 on Linux 2.6.12 (32-bit), and then the RAID set went haywire. Below is output from the system log and from the evms-engine log. It looks like 3 members (hard drives) inexplicably left the RAID set, but the system still sees those drives when I run fdisk -l.

Please help... we have some very valuable data on here that we would like to recover.

Thank you so much in advance!

System Log
-----------
At 08:15, there are 8 child objects, numbered 0 thru 7.
At 09:06, suddenly, some children (1 and 2) disappear and there are now multiple child 6's.
At 09:28, children start disappearing altogether.
-----------
[Tue Jun 26 08:04:13 2007] query:children,/dev/evms/Exchange
[Tue Jun 26 08:04:15 2007] query:children,/dev/evms/FileStore
[Tue Jun 26 08:04:16 2007] query:children,/dev/evms/MasterControl
[Tue Jun 26 08:15:34 2007] query:Containers
[Tue Jun 26 08:15:35 2007] query:children,lvm2/RAID5
[Tue Jun 26 08:15:36 2007] q:ei, md/md0
[Tue Jun 26 08:15:37 2007] q:ei, md/md0
[Tue Jun 26 08:15:39 2007] q:ei, md/md0 ,? child_object0
[Tue Jun 26 08:15:39 2007] q:ei, md/md0 ,? child_object1
[Tue Jun 26 08:15:41 2007] q:ei, md/md0 ,? child_object2
[Tue Jun 26 08:15:41 2007] q:ei, md/md0 ,? child_object3
[Tue Jun 26 08:15:43 2007] q:ei, md/md0 ,? child_object4
[Tue Jun 26 08:15:44 2007] q:ei, md/md0 ,? child_object5
[Tue Jun 26 08:15:44 2007] q:ei, md/md0 ,? child_object6
[Tue Jun 26 08:15:46 2007] q:ei, md/md0 ,? child_object7
[Tue Jun 26 08:15:46 2007] query:Containers,container=lvm2/RAID5
[Tue Jun 26 08:15:47 2007] query:extended info,lvm2/RAID5
[Tue Jun 26 08:15:49 2007] query:children,lvm2/RAID5
[Tue Jun 26 08:15:49 2007] query:children,md/md0
[Tue Jun 26 09:05:27 2007] query:parent,lvm2/RAID5
[Tue Jun 26 09:05:27 2007] query:parent, lvm2/RAID5/Freespace
[Tue Jun 26 09:05:28 2007] query:parent, lvm2/RAID5/RegExchange
[Tue Jun 26 09:05:29 2007] query:parent, lvm2/RAID5/RegMasterControl
[Tue Jun 26 09:05:30 2007] query:parent, lvm2/RAID5/RegFileStore
[Tue Jun 26 09:05:32 2007] query:children,lvm2/RAID5
[Tue Jun 26 09:05:32 2007] q:ei, md/md0
[Tue Jun 26 09:05:34 2007] query:extended info, md/md0
[Tue Jun 26 09:05:35 2007] query:Containers,container=lvm2/RAID5
[Tue Jun 26 09:05:36 2007] query:Containers,container=lvm2/RAID5,Freespace
[Tue Jun 26 09:05:39 2007] query:Containers,container=lvm2/RAID5
[Tue Jun 26 09:05:40 2007] query:Containers,container=lvm2/RAID5,Freespace
[Tue Jun 26 09:05:59 2007] query:children,lvm2/RAID5
[Tue Jun 26 09:05:59 2007] q:ei, md/md0
[Tue Jun 26 09:06:00 2007] q:ei, md/md0
[Tue Jun 26 09:06:01 2007] q:ei, md/md0 ,? child_object0
[Tue Jun 26 09:06:02 2007] q:ei, md/md0 ,? child_object3
[Tue Jun 26 09:06:03 2007] q:ei, md/md0 ,? child_object4
[Tue Jun 26 09:06:04 2007] q:ei, md/md0 ,? child_object5
[Tue Jun 26 09:06:05 2007] q:ei, md/md0 ,? child_object6
[Tue Jun 26 09:06:06 2007] q:ei, md/md0 ,? child_object6
[Tue Jun 26 09:06:06 2007] q:ei, md/md0 ,? child_object6
[Tue Jun 26 09:06:07 2007] q:ei, md/md0 ,? child_object7
[Tue Jun 26 09:06:08 2007] query:Containers,container=lvm2/RAID5
[Tue Jun 26 09:06:09 2007] query:extended info,lvm2/RAID5
[Tue Jun 26 09:06:10 2007] query:parent,lvm2/RAID5
[Tue Jun 26 09:06:11 2007] query:parent, lvm2/RAID5/Freespace
[Tue Jun 26 09:06:12 2007] query:parent, lvm2/RAID5/RegExchange
[Tue Jun 26 09:06:14 2007] query:parent, lvm2/RAID5/RegMasterControl
[Tue Jun 26 09:06:15 2007] query:parent, lvm2/RAID5/RegFileStore
[Tue Jun 26 09:06:16 2007] query:children,lvm2/RAID5
[Tue Jun 26 09:06:17 2007] query:children,md/md0
[Tue Jun 26 09:06:19 2007] query:Objects
[Tue Jun 26 09:27:57 2007] query:Containers
[Tue Jun 26 09:27:58 2007] query:children,lvm2/RAID5
[Tue Jun 26 09:27:59 2007] q:ei, md/md0
[Tue Jun 26 09:28:01 2007] q:ei, md/md0
[Tue Jun 26 09:28:02 2007] q:ei, md/md0 ,? child_object0
[Tue Jun 26 09:28:03 2007] q:ei, md/md0 ,? child_object3
[Tue Jun 26 09:28:04 2007] q:ei, md/md0 ,? child_object4
[Tue Jun 26 09:28:05 2007] q:ei, md/md0 ,? child_object5
[Tue Jun 26 09:28:06 2007] q:ei, md/md0 ,? child_object7
[Tue Jun 26 09:28:08 2007] query:Containers,container=lvm2/RAID5
[Tue Jun 26 09:28:08 2007] query:extended info,lvm2/RAID5
[Tue Jun 26 09:28:10 2007] query:children,lvm2/RAID5
[Tue Jun 26 09:28:10 2007] query:children,md/md0
[Tue Jun 26 09:28:11 2007] query:parent,lvm2/RAID5
[Tue Jun 26 09:28:13 2007] query:parent, lvm2/RAID5/Freespace
[Tue Jun 26 09:28:14 2007] query:parent, lvm2/RAID5/RegExchange
[Tue Jun 26 09:28:15 2007] query:parent, lvm2/RAID5/RegMasterControl
[Tue Jun 26 09:28:16 2007] query:parent, lvm2/RAID5/RegFileStore

EVMS-Engine.log
----------------
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:04 prodserver_3_ LocalDskMgr: check_multipath: Cannot get list of DM devices.
Jun 26 10:38:05 prodserver_3_ Engine: engine_ioctl_object: ioctl to object md/md0 failed with error code 19: No such device
Jun 26 10:38:05 prodserver_3_ MDRaid5RegMgr: md_analyze_volume: Object sdg1 is out of date.
Jun 26 10:38:05 prodserver_3_ MDRaid5RegMgr: md_analyze_volume: Object sdf1 is out of date.
Jun 26 10:38:05 prodserver_3_ MDRaid5RegMgr: md_analyze_volume: Object sdb1 is out of date.
Jun 26 10:38:05 prodserver_3_ MDRaid5RegMgr: md_analyze_volume: Found 3 stale objects in region md/md0.
Jun 26 10:38:05 prodserver_0_ MDRaid5RegMgr: sb0_analyze_sb: MD region md/md0 is corrupt
Jun 26 10:38:05 prodserver_3_ MDRaid5RegMgr: md_fix_dev_major_minor: MD region md/md0 is corrupt.
Jun 26 10:38:05 prodserver_0_ Engine: plugin_user_message: Message is: MDRaid5RegMgr: Region md/md0 : MD superblocks found in object(s) [sdg1 sdf1 sdb1 ] are not valid. [sdg1 sdf1 sdb1 ] will not be activated and should be removed from the region.
Jun 26 10:38:05 prodserver_0_ Engine: plugin_user_message: Message is: MDRaid5RegMgr: RAID5 region md/md0 is corrupt. The number of raid disks for a full functional array is 7. The number of active disks is 4.
Jun 26 10:38:05 prodserver_2_ MDRaid5RegMgr: raid5_read: MD Object md/md0 is corrupt, data is suspect
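
The child-object comparison described under "System Log" (8 children at 08:15, fewer later) can also be checked mechanically rather than by eye. The following is only a rough sketch, assuming the query-log line format shown in this post; the function names `snapshots` and `report`, the one-minute gap used to separate query bursts, and the expected index range 0-7 are illustrative assumptions, not part of EVMS.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Matches entries such as
#   [Tue Jun 26 09:06:05 2007] q:ei, md/md0 ,? child_object6
# \s+ tolerates a line break between ",?" and "child_objectN".
LINE_RE = re.compile(r"\[([^\]]+)\] q:ei, md/md0 ,\?\s+child_object(\d+)")


def snapshots(log_text, gap=timedelta(minutes=1)):
    """Group child_object queries into bursts separated by more than `gap`.

    The one-minute gap is an assumption chosen to fit the log in this post.
    Timestamps are parsed with English day/month names (C locale).
    """
    bursts = []
    last = None
    for stamp, index in LINE_RE.findall(log_text):
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        if last is None or when - last > gap:
            bursts.append((when, Counter()))
        bursts[-1][1][int(index)] += 1
        last = when
    return bursts


def report(bursts, expected=range(8)):
    """Return (HH:MM, missing indices, repeated indices) for each burst."""
    rows = []
    for when, counts in bursts:
        missing = [i for i in expected if i not in counts]
        repeated = [i for i, n in sorted(counts.items()) if n > 1]
        rows.append((when.strftime("%H:%M"), missing, repeated))
    return rows
```

Fed the System Log excerpt from this post, `report(snapshots(text))` should show nothing missing at 08:15, children 1 and 2 missing with child 6 repeated at 09:06, and children 1, 2 and 6 missing at 09:28, which lines up with the 3 stale objects reported in EVMS-Engine.log.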