From: Travis H. <tra...@tr...> - 2011-12-14 00:51:06
I discovered this today and found my solution, so I am sharing it. I am using moosefs 1.6.20-2 with only one chunk server in the system, where I had set goal = 1. The single chunk server VM had 6 250GB (virtual) disks allocated to it:

/dev/sdb1 on /var/lib/mfs/sdb1 type ext4 (rw,noatime)
/dev/sdc1 on /var/lib/mfs/sdc1 type ext4 (rw,noatime)
/dev/sdd1 on /var/lib/mfs/sdd1 type ext4 (rw,noatime)
/dev/sde1 on /var/lib/mfs/sde1 type ext4 (rw,noatime)
/dev/sdf1 on /var/lib/mfs/sdf1 type ext4 (rw,noatime)
/dev/sdg1 on /var/lib/mfs/sdg1 type ext4 (rw,noatime)

I was interested in reclaiming one of the disks because the file system was only 5% in use, so I edited the mfshdd.cfg on the chunk server to mark one disk (sdg1) for removal:

/var/lib/mfs/sdb1
/var/lib/mfs/sdc1
/var/lib/mfs/sdd1
/var/lib/mfs/sde1
/var/lib/mfs/sdf1
*/var/lib/mfs/sdg1

After several hours of idle time (no clients even mounting it), I noticed there were no chunk replications occurring to move the chunks on the disk marked for removal onto the other disks of the chunk server. Really this makes sense, because I have goal = 1 and only one chunk server to work with. Since chunks seem to replicate over time to balance utilization among all disks on all chunk servers, I had assumed I could remove a disk on a single chunk server, but the process apparently assumes there are more chunk servers available than the goal value.

What I did to get the chunks migrated off was to temporarily create a second chunk server process on the same VM, serving only the disk I had marked for removal:

* copy mfschunkserver.cfg to mfschunkserver-2.cfg
** change the DATA_PATH, e.g. to /var/lib/mfs-2
** change the CSSERV_LISTEN_PORT
** edit iptables if required to allow the new listen port
** change the HDD_CONF_FILENAME as well, e.g. to mfshdd-2
* copy mfshdd.cfg to mfshdd-2.cfg
** remove all but the one disk marked for removal from mfshdd-2.cfg
** remove only the disk marked for removal from mfshdd.cfg
* create an empty /var/lib/mfs-2 folder and chown it to the nobody user (or whatever user mfschunkserver runs as)
* start the second chunk server with the command line "mfschunkserver -c /etc/mfschunkserver-2.cfg" or, as I did, copy the init script and modify it to point at the second config file (see the sketch of these files further down)
* monitor the CGI service; when the number of chunks under goal due to disks marked for removal reaches zero, stop the second chunk server process, unmount the disk marked for removal, and clean up the temporary "-2" files created to launch the second chunk server process alongside the regular one. (I suppose I could also have moved the virtual disk, like a physical disk, to a different machine and configured a new chunk server there; this was just more convenient for me at the time.)

In the CGI monitor I now see two chunk servers and all of the disks as before, but more importantly for my current purpose, the count of chunks temporarily under goal because they live on a disk marked for removal is going down: the chunk replication that redistributes those chunks onto disks not marked for removal is now working.

So, was it a reasonable design decision not to perform this replication for a disk marked for removal within a single chunk server setup (as in this example), because a file's "goal" is tracked at the level of chunk servers and not at the level of disks within a chunk server?
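In case it helps anyone trying the same trick, here is roughly what my temporary "-2" files looked like, showing only the lines that differ from the originals. The port number is just one I picked (any free port other than the first chunk server's should do) and the paths follow my layout above:

  # /etc/mfschunkserver-2.cfg (changed lines only)
  DATA_PATH = /var/lib/mfs-2
  CSSERV_LISTEN_PORT = 9522
  HDD_CONF_FILENAME = /etc/mfshdd-2.cfg

  # /etc/mfshdd-2.cfg -- only the disk being drained
  # (removal marker kept so the master keeps rebalancing it)
  */var/lib/mfs/sdg1

  # /etc/mfshdd.cfg -- the drained disk removed
  /var/lib/mfs/sdb1
  /var/lib/mfs/sdc1
  /var/lib/mfs/sdd1
  /var/lib/mfs/sde1
  /var/lib/mfs/sdf1

  # prepare the second data dir and start the second process
  mkdir /var/lib/mfs-2
  chown nobody /var/lib/mfs-2
  mfschunkserver -c /etc/mfschunkserver-2.cfg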
Also, what I found interesting, and possibly helpful to anyone else playing with chunk servers, is that (for this version of moosefs, anyway) a "disk" belonging to a chunk server can be relocated to a new chunk server process (assuming the original chunk server is no longer using it, of course) and configured to point at the same mfsmaster as before. In other words, a chunk server with many disks can be split apart into individual chunk servers. Now, I expect it could be very bad to go the other way and combine separate chunk servers, each with a single disk, into a single chunk server with many disks?
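For the cross-machine variant I only speculated about above (I have not tested it; I only did the same-machine split), I would expect something along these lines, with the new machine's mfschunkserver.cfg pointing its MASTER_HOST at the same mfsmaster:

  # on the old chunk server
  mfschunkserver stop
  (remove the /var/lib/mfs/sdg1 line from /etc/mfshdd.cfg)
  mfschunkserver start

  # on the new machine, with the disk attached
  mount /dev/sdg1 /var/lib/mfs/sdg1
  (add /var/lib/mfs/sdg1 to that machine's /etc/mfshdd.cfg)
  mfschunkserver start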