From: Adrian D. <da...@pm...> - 2002-10-29 22:17:16
[cross-posted to fink-devel]

On Monday, October 28, 2002, at 11:10, Martin Costabel wrote on
[fink-beginners]:

> You should try to get some more information about what is happening.
> Keep cmd-v pressed at boot time until the boot messages start appearing
> on the screen.
>
> You could also try to boot into single user mode (cmd-s on boot) and
> check a couple of permissions with ls -la, for example
>
> % ls -lad / /etc /etc/ /var /var/
> drwxrwxr-t   39 root  admin  1326 Oct 23 19:19 /
> lrwxrwxr-t    1 root  admin    11 Oct 23 19:19 /etc -> private/etc
> drwxr-xr-x  103 root  wheel  3502 Oct 17 11:29 /etc/
> lrwxrwxr-t    1 root  admin    11 Oct 23 19:19 /var -> private/var
> drwxr-xr-x   22 root  wheel   748 Oct 23 19:19 /var/

Thanks, Martin, first of all for telling me how to coax a Mac into
booting more verbosely and in single-user mode, and second for "hitting
the nail on the head" with your suggestion.

More details are given below, but here is the diagnosis in a nutshell:
the filesystem was only slightly damaged, and no system file had been
modified. The cause was indeed wrong ownership and permissions on the
directory / and the links /etc and /var:

$> ls -lad / /etc /etc/ /var /var/
drwx------   44 2011  staff  1452 Oct 29 12:27 /
lrwx------    1 2011  staff    11 Oct 29 12:27 /etc -> private/etc
drwxr-xr-x   72 root  wheel  2404 Oct 29 11:04 /etc/
lrwx------    1 2011  staff    11 Oct 29 12:27 /var -> private/var
drwxr-xr-x   19 root  wheel   602 Oct 29 12:28 /var/

Changing the owner and permissions back to reasonable values solved the
problem:

$> ls -lad / /etc /etc/ /var /var/
drwxr-xr-t   44 root  staff  1452 Oct 29 12:27 /
lrwxr-xr-t    1 root  staff    11 Oct 29 12:27 /etc -> private/etc
drwxr-xr-x   72 root  wheel  2404 Oct 29 11:04 /etc/
lrwxr-xr-t    1 root  staff    11 Oct 29 12:27 /var -> private/var
drwxr-xr-x   19 root  wheel   602 Oct 29 12:28 /var/

I'd be curious to know how you came to suspect problems with precisely
these files...
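
The check Martin suggested can also be scripted. What follows is only a
minimal Python sketch and was not part of the original exchange: the
function name audit_path and the rule it applies (owner should be root,
other users need read and search permission) are assumptions chosen to
match the listings above, not a statement of what Mac OS X requires.

```python
import os
import stat


def audit_path(path, expected_uid=0):
    """Return a list of human-readable problems found for `path`.

    Hypothetical helper: the expected owner (root, uid 0) and the required
    permission bits are assumptions taken from the listings in this thread.
    """
    # lstat inspects a symlink itself, like `ls -ld /etc`; with a trailing
    # slash ("/etc/") the link is resolved and the target directory is examined.
    info = os.lstat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append(f"owner is uid {info.st_uid}, expected {expected_uid}")
    # Other users need read and search (execute) permission on the entry.
    needed = stat.S_IROTH | stat.S_IXOTH
    if info.st_mode & needed != needed:
        problems.append(
            f"mode {stat.filemode(info.st_mode)} denies access to other users"
        )
    return problems


if __name__ == "__main__":
    for p in ("/", "/etc", "/etc/", "/var", "/var/"):
        try:
            for problem in audit_path(p):
                print(f"{p}: {problem}")
        except OSError as exc:
            print(f"{p}: cannot stat ({exc})")
```

Run in single-user mode or from a working login, this prints one line
per suspicious entry and nothing when the five paths look sane. It only
reports; restoring owner and mode is left to chown and chmod.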

I have absolutely no clue where that user 2011 could come from (how do
you ask NetInfo whether it knows about it?). The problem is, however,
reproducible: installing the binary installer package again changes the
owner of /, /etc and /var back to 2011 and the access rights to 700,
even if I remove the /sw tree beforehand.

A second attempt at installing fink now fully succeeded. Of course I
didn't re-install the whole system, so it is hard to tell what exactly
was different. So the following two pieces of advice for people
installing fink are really only superstition until someone finds out
where the problem came from.

- Don't install the bin-installer-package twice.
- At the first use of dselect, don't select many packages yet.
  Possibly add one or two base packages like anacron or daemonic, then
  do the installation: this is where install-scripts prompt you to
  authorize adding new users to the NetInfo base. Then you can go back
  and select new packages. (When I selected many packages at once the
  first time, I got a number of error messages. I didn't get any the
  second time. As I said: only circumstantial evidence, not even a
  plausible explanation.)

If any developer would like more information, I have a dump of modified
files and can answer questions (and even reproduce the problem now that
I know how to get the system back on track...).

Thanks again, Martin! And thanks a lot to all you Fink developers for
this fine system!

Adrian

Some more details about what I could find out:

Booting was completed, but then the login window crashed and restarted
periodically. Except for system.log, nothing interesting could be found
among the recently modified files. According to system.log, the boot
sequence was normal (at least at a rough glance) until the login server
started:

[...]
Oct 28 21:01:43 manray configd[125]: executing /usr/sbin/DirectoryService
Oct 28 21:01:43 manray automount[265]: automount version 23
Oct 28 21:01:43 manray lookupd[213]: _lookup_all(getfsent) failed
Oct 28 21:01:43 manray lookupd[213]: _lookup_all(getfsent) failed
Oct 28 21:01:51 manray slpd: STATE: *** slpd started ***
Oct 28 21:01:53 manray mach_init[2]: added notification for sub-bootstrap
Oct 28 21:01:56 manray /usr/libexec/CrashReporter: Failed writing crash report: /private/var/tmp/loginwindow.crash.log
Oct 28 21:01:56 manray mach_init[2]: notified that requestor of subset 7427 died
Oct 28 21:01:56 manray WindowServer[74]: loginwindow connection closed; closing server.
Oct 28 21:01:56 manray mach_init[2]: Service WindowServer deleted - bootstrap deleted
Oct 28 21:01:56 manray mach_init[2]: Service NSApplication-MainThread-148496975# deleted - bootstrap deleted
Oct 28 21:01:56 manray mach_init[2]: Service DockClient-20001-0 deleted - bootstrap deleted
Oct 28 21:01:56 manray WindowServer[294]: Display 0x4248068: Unit 0; Vendor 0x610 Model 0x9215 S/N 33555754; online (0,0)[1024 x 768], base addr 0xa000b000
Oct 28 21:01:56 manray mach_init[2]: added notification for sub-bootstrap
Oct 28 21:01:57 manray mach_init[2]: notified that requestor of subset 3987 died
Oct 28 21:01:57 manray mach_init[2]: Service WindowServer deleted - bootstrap deleted
Oct 28 21:01:57 manray mach_init[2]: Service NSApplication-MainThread-1148242017# deleted - bootstrap deleted
Oct 28 21:01:57 manray mach_init[2]: Service DockClient-20001-0 deleted - bootstrap deleted

That's where the system would remain indefinitely, with the last part
(from the message "...loginwindow connection closed...") repeating
every 30 seconds. Unfortunately, the loginwindow.crash.log mentioned
above was empty.
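
To confirm that the crash really repeats at a fixed interval, the
timestamps of the "loginwindow connection closed" lines can be pulled
out of system.log. The following is a minimal Python sketch under
stated assumptions: the function names are hypothetical, the marker
string is copied from the log excerpt above, and the year has to be
supplied by hand because syslog timestamps omit it.

```python
from datetime import datetime

# Text that marks one loginwindow restart in the excerpt above.
MARKER = "loginwindow connection closed"


def restart_times(lines, year=2002):
    """Return a datetime for every log line that contains MARKER.

    Assumes classic syslog lines that start with a 15-character
    timestamp such as "Oct 28 21:01:56" (English month abbreviations).
    """
    times = []
    for line in lines:
        if MARKER in line:
            stamp = line[:15]
            times.append(
                datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            )
    return times


def restart_intervals(times):
    """Return the gaps, in seconds, between consecutive restarts."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Passing the lines of system.log to restart_times and the result to
restart_intervals gives a list of gaps; values that stay close to 30
would match the behaviour described above.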

A strange error message was output to a terminal window and to the file
/private/var/log/mail.log:

Oct 28 19:21:56 manray sendmail[3201]: NOQUEUE: SYSERR(root): /etc/mail/submit.cf: line 416: readcf: option RunAsUser: unknown user smmsp

No idea if this user "smmsp" has anything to do with fink, or if the
error was induced by an unreadable filesystem.

Apart from that the file system was slightly damaged, but not beyond
fsck's capabilities, and repairing it did not change the repeated
loginwindow crashes at boot time.
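
Both open questions in this message (does the system know a user with
uid 2011, and does a user named smmsp exist) can be asked through the
standard user database interface. This is a minimal Python sketch, not
a NetInfo-specific tool: the helper names are hypothetical, and whether
the answers reflect NetInfo's contents depends on how the system's
lookup services are configured, which is an assumption here.

```python
import pwd


def name_for_uid(uid):
    """Return the login name for a numeric uid, or None if it is unknown."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def user_exists(name):
    """Return True if the system's user database has an entry for `name`."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    # Values taken from the message above.
    print("uid 2011 ->", name_for_uid(2011))
    print("smmsp known:", user_exists("smmsp"))
```

A result of None for uid 2011 would suggest that the number was never a
real account on the machine and came from the installer package itself,
though that remains a guess.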