Re: [SSI-users] Debian Lenny OpenSSI with LVM and RAID 1
From: John H. <jo...@Ca...> - 2010-08-11 11:11:43
Some explanation of how OpenSSI interacts with filesystems and so on.

When talking about using various filesystems with OpenSSI it's worth keeping two ideas in mind:

1. Cross node access
2. Failover

By "cross node access" I mean the ability for a process on one node to access files on another node. This is good for making the cluster look like one machine, and it is necessary for migrating processes from one node to another.

Cross node access can be made possible in two ways:

1. CFS - the I/O requests to a filesystem are handled on behalf of all nodes by a server node.
2. Parallel mounts - the filesystem is mounted on all nodes.

CFS is easy: the node that mounts the actual Linux filesystem stacks a CFS layer on top of it, and all other nodes send their I/O requests to the CFS server.

Parallel mounts need a "physical" filesystem that can be accessed by multiple nodes. A simple example is NFS - each OpenSSI node directly accesses the remote NFS server. More complicated examples of parallel mounts are Lustre and so on.

CFS failover is necessary if we want to use CFS in a fault tolerant cluster. If the CFS server node goes down, some other node has to take over its job. In order for this to work the other node needs to have physical access to the disks the filesystem was stored on - either by having used DRBD to make the data available to both the CFS primary and secondary nodes, or by actually having a physical path from both nodes to the disks (SAS, SCSI, iSCSI, FC or whatever). Also the filesystem under the CFS mount needs to be journaled, or the CFS failover will be forced to wait for an fsck before the filesystem is available again.

It's failover that makes handling things like RAID and LVM exciting. Both the primary and secondary node need to access the RAID/LVM setup, but they need to co-ordinate this access very carefully. For LVM there is CLVM (cluster LVM), which could probably be ported to OpenSSI.
For RAID you'd need to modify OpenSSI to activate the RAID volumes on the secondary node during the failover. It should be possible, but it's not going to be an easy job.

I'd really spend some time with the basic system, trying various failure scenarios, seeing how things work before taking on a big job like this.
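
To make the CFS failover preconditions above a bit more concrete, here is a rough Python sketch of the decision involved. It is only a toy model, not OpenSSI code: the Node and Filesystem classes, the pick_failover_node function, the storage identifiers and the list of journaled filesystem types are all made up for illustration. It simply encodes the two rules described above: the takeover node must be able to reach the disks (shared bus or DRBD replica), and a non-journaled filesystem means waiting for an fsck.

```python
from dataclasses import dataclass, field

# Hypothetical model, not OpenSSI code: all names here are illustrative only.
# Assumed list of journaled filesystem types.
JOURNALED = {"ext3", "ext4", "xfs", "jfs", "reiserfs"}


@dataclass
class Node:
    name: str
    up: bool = True
    # Storage this node can reach, either over a shared path
    # (SAS, SCSI, iSCSI, FC) or as a DRBD replica.
    reachable_storage: set = field(default_factory=set)


@dataclass
class Filesystem:
    fstype: str
    storage: str  # identifier of the underlying disks
    server: str   # name of the node currently acting as CFS server


def pick_failover_node(fs, nodes):
    """Return (node, needs_fsck) for the first live node, other than the
    current server, that can reach the disks; (None, None) if there is none."""
    for node in nodes:
        if node.name == fs.server or not node.up:
            continue
        if fs.storage in node.reachable_storage:
            # Without a journal the new server must fsck before serving.
            return node, fs.fstype not in JOURNALED
    return None, None


if __name__ == "__main__":
    nodes = [
        Node("node1", up=False, reachable_storage={"shared0"}),
        Node("node2", reachable_storage={"shared0"}),
        Node("node3"),  # no path to the disks, so never a candidate
    ]
    fs = Filesystem("ext3", "shared0", server="node1")
    new_server, needs_fsck = pick_failover_node(fs, nodes)
    print(new_server.name, needs_fsck)
```

Note what the sketch leaves out: it says nothing about activating RAID or LVM volumes on the takeover node, or about co-ordinating access between primary and secondary, and that is exactly the hard part.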