I'm having a strange problem with one of my guest configs. I made a copy of a functioning FC2 guest and upgraded it to FC4 using yum. The upgrade appears to have gone fine once I resolved some package conflicts and so on.
The only problem I see is that it won't boot every time. Whenever I shut it down, the next boot fails, and then when I try again it boots fine. If I were only using it interactively I wouldn't worry about this, but it is part of my automated build system for wxPython, so I need it to boot reliably from a script running on another machine.
Has anybody else seen something like this? Is there some config change I can make in my FC4 guest to work around this?
Here is my coLinux config:
<?xml version="1.0" encoding="UTF-8"?>
<block_device index="0" path="\DosDevices\c:\coLinux\VMs\fc4.root_fs.img" alias="hda5" enabled="true" />
<block_device index="1" path="\DosDevices\c:\coLinux\VMs\fc4.swap.img" alias="hda6" enabled="true" />
<initrd path="..\initrd.gz" />
<image path="..\vmlinux" />
<memory size="256" />
<network index="0" type="tap" name="TAP 05" mac="00:11:11:11:11:34" />
And here is a log of a failed boot:
c:\coLinux\VMs>..\colinux-daemon.exe -c fc4.colinux.xml -d
searching TAP device named "TAP 05"
found TAP device named "TAP 01"
found TAP device named "TAP 00"
found TAP device named "TAP 04"
found TAP device named "TAP 02"
found TAP device named "TAP 05"
opening TAP: "TAP 05"
driver version 8.1
Cooperative Linux Daemon, 0.6.3
Compiled on Sun Feb 5 20:25:03 2006
Linux version 2.6.11-co-0.6.3 (george@CoDebianDevel) (gcc version 3.4.4 20050314 (prerelease) (Debian 3.4.3-13)) #1 Sun
256MB LOWMEM available.
initrd enabled: start: 0xcfe10000 size: 0x001ef78a)
On node 0 totalpages: 65536
DMA zone: 0 pages, LIFO batch:1
Normal zone: 65536 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
Built 1 zonelists
Kernel command line: root=/dev/hda5 ro
Setting proxy interrupt vectors
PID hash table entries: 2048 (order: 11, 32768 bytes)
Using cooperative for high-res timesource
Console: colour CoCON 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 253952k/262144k available (1537k kernel code, 0k reserved, 521k data, 108k init, 0k highmem)
Calibrating delay loop... 418.61 BogoMIPS (lpj=2093056)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: After all inits, caps: bfebfbff 00000000 00000000 00000080 00004400 00000000 00000000
CPU: Intel(R) Xeon(TM) CPU 2.40GHz stepping 09
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
checking if image is initramfs...it isn't (no cpio magic); looks like an initrd
Freeing initrd memory: 1981k freed
NET: Registered protocol family 16
devfs: 2004-01-31 Richard Gooch (firstname.lastname@example.org)
devfs: boot_options: 0x0
cofuse init 0.1 (API version 2.2)
Initializing Cryptographic API
serio: cokbd at irq 1
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
cobd: loaded (max 32 devices)
alias for cobd0 is hda5
cobd alias cobd0 -> hda5 created
alias for cobd1 is hda6
cobd alias cobd1 -> hda6 created
loop: loaded (max 8 devices)
conet: loaded (max 16 devices)
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on cokbd
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 4, 65536 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
NET: Registered protocol family 1
NET: Registered protocol family 17
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem).
ReiserFS: hda5: warning: sh-2021: reiserfs_fill_super: can not find reiserfs on hda5
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Cannot open root device "hda5" or unknown-block(3,5)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(3,5)
I see a similar effect. Curiously, if I don't shut Linux down the correct way (i.e. waiting for all the drives to unmount, etc.), then the next boot is OK (I'm using ext3). So if I just kill coLinux, all works well. (Or seems to.)
Yep, same here. I should have mentioned that in my original message.
I'm guessing that some status is being set somewhere during the normal shutdown that causes it to get confused during the next startup. But I'm not sure where to start looking for it other than to debug all the initscripts. (yuck!)
Well, just on a lark I tried what Henry suggested in an unrelated bug report and disabled the use of the initrd since it is only needed for installing modules on new images or updated kernels. Now my FC4 guest has booted 6 times in a row. WooHoo!
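For anyone wanting to try the same workaround, here is a rough sketch of what the config might look like with the initrd disabled. It assumes the change amounts to dropping the <initrd> line from the config posted above; the exact edit isn't shown in this thread, and the listing only mirrors the lines as they were posted, so treat it as illustrative rather than a complete file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<block_device index="0" path="\DosDevices\c:\coLinux\VMs\fc4.root_fs.img" alias="hda5" enabled="true" />
<block_device index="1" path="\DosDevices\c:\coLinux\VMs\fc4.swap.img" alias="hda6" enabled="true" />
<!-- Assumption: the initrd is disabled by removing this line:
     <initrd path="..\initrd.gz" /> -->
<image path="..\vmlinux" />
<memory size="256" />
<network index="0" type="tap" name="TAP 05" mac="00:11:11:11:11:34" />
```

Since the initrd is what installs modules on new images or after a kernel update, the line would presumably need to be put back for one boot after changing the kernel.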
I have a modified version of the initrd.
This should run on every boot.
Thanks Henry, that works well.