From: Jon A. H. <jo...@gm...> - 2011-08-17 16:50:45
2011/8/16 Eray Ozkural <exa...@gm...>:
> hello,
>
> all right, this nfs4 thing is b0rked as hell. i have no idea why they are
> even distributing this stuff. lame. (greetings to its authors: you are
> natural born lamers.)

NFS4 is a much better protocol than NFS3, but our stack is quite complex, and I have also had trouble with it. That is why, by default, the first boot entry in the bootloader tries to use NFS3 instead of NFS4. It is complex because live nodes require a writable overlay over the read-only NFS4 shared directory, and the union filesystem (aufs) is probably failing at some point.

> i'm using ubuntu 11.04 and the kestrel apt source, the latest thing.
>
> the strange part is that i can't even mount nfs4 shares properly on the
> frontend node.

That is quite strange... :-S. Does it show any message, or does it freeze while trying to mount the nfs4 share? Does nfs3 work?

> this might be due to the architecture, i'm using x86_64, the uid and gid's
> turn out to be -2 (shown as a big unsigned integer in ls)! so they are not
> even nobody/nogroup, they don't correspond to any user, and of course
> nothing works like that. i think the code simply doesn't work, maybe it's
> trying to set it to nobody but it's mixing things up in 64-bit, i don't
> know. no amount of fiddling with idmapd.conf fixes this, puzzling.
>
> so i tried to use nfs3, i changed the exports, and on the frontend node,

You should not change the autogenerated exports; they should work for both nfs3 and nfs4. What changes is the name of the export. With NFS3 you need to use the whole path (as it appears on the frontend):

  $ mount -t nfs3 <ip>:/export/kestrel/<image_name> /mnt

While with NFS4 it would be:

  $ mount -t nfs4 <ip>:/kestrel/<image_name> /mnt

If you reconfigure kestrel, it will recreate the exports file.

> i can mount root with anonuid and anongid options and by explicitly passing
> vers=3.
> however, in dracut on the compute node, it is impossible to mount
> nfs3 volumes, the mount command passes vers=4 by default, if you pass vers=3
> (you must pass it as -o 'vers=3') it claims incorrect mount option and
> halts. while of course there is no such thing. it also fails to apply the
> nfs options in the dracut boot cmdline (using
> nfs:server:/exports/kestrel/image1 for root). i'll try a few other things,
> and if that doesn't work out, begin rolling my own diskless using another
> initramfs setup, this is way too much hassle. and i couldn't find a package
> for nfs-user-server on 11.04, i want to get rid of this terrible nfsv4
> immediately. we used to have NIS/NFS3, and NFS3 alone on previous clusters
> and it worked perfectly. i have no idea why they can't even fix a simple uid
> bug on 64-bit systems, but i have no intention to find out!

The current nfs server supports both protocols, so you can always use NFS3; in fact, Kestrel uses NFS3 by default.

> i'll let you know if i find a solution (using kestrel or otherwise) on
> ubuntu 11.04 amd64 version. right now, i think you can assume that
> kestrelhpc definitely doesn't work on 11.04 amd64 due to inexplicable nfsv4
> errors. either the volume can't be booted (nfsv3) or the uid/gid's are all
> wrong (they are -2, -2, and do not correspond to either root or nobody on
> dracut!) if you have any suggestions at all, well, i'm open to them!

First, check whether the NFS3 exports work. They don't require idmapd, etc., so they should work; if they don't, then something is really wrong.

Are you using a firewall? Currently kestrel fails if you enable a firewall, because it uses dynamic ports by default (firewall support is a feature implemented in the next version).

Regards,
JonAn.
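
For reference, the difference between the two export names described above (full path for NFS3, shortened name for NFS4) can be sketched as a small shell helper that only prints the mount command instead of running it. This is an illustrative sketch, not part of Kestrel: the function name `kestrel_mount_cmd`, the server address and the image name are hypothetical placeholders, and the NFS3 case is written as `-t nfs -o vers=3`, the form usually accepted by the Linux mount command, which is an assumption that differs from the `-t nfs3` shorthand used in the message.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Kestrel): print the mount command
# for a Kestrel image without executing it.
# Usage: kestrel_mount_cmd <nfs_version> <server_ip> <image_name> <mount_point>
kestrel_mount_cmd() {
    version=$1
    server=$2
    image=$3
    target=$4
    case "$version" in
        3)
            # NFS3: the export is addressed by its full path on the frontend.
            # Assumption: '-t nfs -o vers=3' is used to request protocol version 3.
            echo "mount -t nfs -o vers=3 ${server}:/export/kestrel/${image} ${target}"
            ;;
        4)
            # NFS4: the export name drops the leading /export component.
            echo "mount -t nfs4 ${server}:/kestrel/${image} ${target}"
            ;;
        *)
            echo "unsupported NFS version: ${version}" >&2
            return 1
            ;;
    esac
}

# Placeholder server address and image name, for illustration only.
kestrel_mount_cmd 3 192.168.1.1 image1 /mnt
kestrel_mount_cmd 4 192.168.1.1 image1 /mnt
```

Printing the command first makes it easy to compare what the frontend and the compute node would each try to mount before running anything as root.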