From: Travis H. <tra...@tr...> - 2011-12-21 14:47:42
|
I don't know why that happens either. My theory was that one or more chunkserver processes are not active, but when I simulated this by shutting down my chunkservers I was still able to "ls" the files and folders; only the file contents were unavailable without their chunkserver. Perhaps mfsmount is not really mounting it, even though "mount" shows it as mounted; quite possibly an issue with your FUSE installation. Also, make sure you are really mounting the correct mfsmaster (I have accidentally mounted other test instances at times), and make sure you have not somehow accidentally erased the mfsmaster data files. With the mfsmaster process running, and while watching /var/log/messages on the mfsmaster machine, stop and then start all the chunkservers. Another handy utility is the CGI monitor; I like to inspect it to see which chunks currently have no valid copies (because the chunkservers containing them are offline). I wonder…

On 11-12-20 5:49 PM, Zachary Wagner wrote:
> Hello,
> On my MooseFS test system, I unmounted the client. Now, when I try to mount it again, it does not show the sub-directory that I created before with files in it. Does anybody know how I could access these files again? Are they still there or do they get deleted during unmounting? I tried reading the documentation but I can't find anything on this. Perhaps I missed it. Any help would be greatly appreciated.
> Thank you,
> Zach |
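A minimal sketch of the checks described in the post above, assuming a Linux client, a mount point of /mnt/mfs, and a master reachable as mfsmaster.example.com (host name, mount point and log path are placeholders, not values from the thread):

    # is the MooseFS mount actually there, and pointing at the master you expect?
    mount | grep -i mfs
    mfsmount /mnt/mfs -H mfsmaster.example.com   # remount explicitly if in doubt

    # on the mfsmaster machine: watch the log while the chunkservers reconnect
    tail -f /var/log/messages &

    # on each chunkserver: restart the daemon so it re-registers with the master
    mfschunkserver stop && mfschunkserver start

    # back on the client: check per-file chunk availability
    mfsfileinfo /mnt/mfs/path/to/a/missing/file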
From: Travis H. <tra...@tr...> - 2011-12-21 13:13:42
|
I am pretty sure the auto-balancing feature cannot be turned off. It exists by design to automatically handle situations where a chunkserver dies, since in a production environment there would be dozens or hundreds of nodes and thus a higher probability of a commodity hardware failure. Perhaps this pattern of use would be better handled by turning the third computer into an individual disk that is not attached to the file system, and invoking rsync (or another copy tool) from the MooseFS instance into it; if power savings is really the concern, a USB external drive is just as good for this. Also, bringing the chunk copies that were offline back to the same version and state as the two that stayed online is not the same kind of experience as plugging in a USB drive and invoking rsync: there is no finite and exact amount of time it will take, and it is not practical to determine when the updates will start or finish.

On 11-12-21 4:03 AM, Mike wrote:
> I have a 3 storage server MFS cluster in my basement. One is also the meta master, the other two are metaloggers. I have all chunks set to 2 copies, except for a few personal items which are set to 3 copies.
> Now, I don't *need* two copies to be online all the time, just to be able to recover from problems. Therefore, I should be able to shut down one storage server, wake it up once a day, sync changes, and then shut it down again. My power bill will then drop and the world will be a greener place. :)
> Anyone see any problems with this configuration? How can I stop MFS from trying to re-balance/copy whatever's left on the other two servers? |
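A minimal sketch of the rsync-based alternative suggested above, assuming the MooseFS mount lives at /mnt/mfs and the standalone backup disk is mounted at /mnt/backup (both paths are placeholders):

    # once a day, after powering up the backup disk, mirror the MooseFS contents onto it
    rsync -aH --delete /mnt/mfs/ /mnt/backup/

Run it from cron (or by hand) on whichever machine owns the backup disk, then power the disk down again; MooseFS itself never sees that disk, so no re-balancing is triggered.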
From: Mike <isp...@gm...> - 2011-12-21 09:21:14
|
I have a 3-storage-server MFS cluster in my basement. One is also the meta master; the other two are metaloggers. I have all chunks set to 2 copies, except for a few personal items which are set to 3 copies.

Now, I don't *need* two copies to be online all the time, just to be able to recover from problems. Therefore, I should be able to shut down one storage server, wake it up once a day, sync changes, and then shut it down again. My power bill will then drop and the world will be a greener place. :)

Anyone see any problems with this configuration? How can I stop MFS from trying to re-balance/copy whatever's left on the other two servers? |
From: Zachary W. <zw...@ha...> - 2011-12-20 22:49:09
|
Hello, On my MooseFS test system, I unmounted the client. Now, when I try to mount it again, it does not show the sub-directory that I created before with files in it. Does anybody know how I could access these files again? Are they still there or do they get deleted during unmounting? I tried reading the documentation but I can't find anything on this. Perhaps I missed it. Any help would be greatly appreciated. Thank you, Zach |
From: Michał B. <mic...@ge...> - 2011-12-20 10:13:59
|
Hi! It is normal that read speeds are higher than write speeds. You can also have a look at this FAQ entry: http://www.moosefs.org/moosefs-faq.html#average

Kind regards
Michał Borychowski
MooseFS Support Manager
Gemius S.A., ul. Wołoska 7, 02-672 Warszawa, Budynek MARS, klatka D
Tel.: +4822 874-41-00, Fax: +4822 874-41-01

From: Zachary Wagner [mailto:zw...@ha...]
Sent: Monday, December 19, 2011 4:39 PM
Subject: [Moosefs-users] mfs read and write times
> Right now my school research group is testing the functionality of MFS. We are new to Linux and distributed file systems such as MFS. We have a basic test system of one master, one metalogger, 2 chunkservers, and 1 client machine. Regarding read/write times, so far the read times have been significantly faster than the write times for the 1 GB files we are testing. Is this normal? Is it because of caching?
> Zachary Wagner |
From: Sébastien M. <seb...@gm...> - 2011-12-20 09:55:06
|
Hi,

To complete the discussion I installed OpenBSD in VirtualBox and ran the test:

$ mkdir test
$ ls -ld test
drwxr-xr-x 7 sebastien sebastien 4096 Dec 19 10:26 test
$ chgrp test test
$ ls -ld test
drwxr-xr-x 7 sebastien test 4096 Dec 19 10:36 test
$ cd test
$ mkdir test2
$ ls -la .
drwxr-xr-x 7 sebastien test 4096 Dec 19 10:58 .
drwxr-xr-x 7 sebastien test 4096 Dec 19 10:58 test2

So it is indeed true that the group is directly inherited on the BSDs (I assume that if it is true on OpenBSD, it is also true on FreeBSD and NetBSD; I had no time to install those too). And if this is the default behaviour on OpenBSD, I don't think it's a security issue ;-)

Regards,
Sébastien

2011/12/14 Sébastien Morand <seb...@gm...>
> Hi,
> You are right, it's not POSIX, it's GNU:
> << On most systems, if a directory's set-group-ID bit is set, newly created subfiles inherit the same group as the directory, and newly created subdirectories inherit the set-group-ID bit of the parent directory >>
> Extracted from: http://www.gnu.org/s/coreutils/manual/html_node/Directory-Setuid-and-Setgid.html
> According to http://www.gnu.org/s/mailman/mailman-install/bsd-issues.html:
> << Also, the setgid bit is not necessary on BSD systems because group ownership is automatically inherited on files created in directories. >>
> Can you make the test and tell me if it's true? So maybe a configure option could be a nice thing. I'm not convinced the setgid-bit inheritance is a security issue; do you have arguments?
> Regards,
> Sebastien
>
> 2011/12/14 Michał Borychowski <mic...@ge...> |
From: Chris P. <ch...@ec...> - 2011-12-20 09:08:13
|
Hi List,

I have a MooseFS installation in my test environment consisting of 4 PCs, each with 2x 80 GB and 2x 1 TB drives. They run a corosync/pacemaker cluster with any one of the 4 machines acting as the master, and all 4 running metaloggers (I am happy to share the OCF script if anyone would like to look at it). They also run chunkservers (and I have 4 extra chunkservers as well). We are hosting KVM images on the cluster (goal=3).

However, we had a problem where the temperature in the lab went too high and some of the HDDs shut down. No new files were being created at the time, but there was read/write access to most of the existing files. The current master kernel panicked, and a backup metalogger was promoted to master (using mfsmetarestore). Two of the other chunkservers had a 1 TB drive fail in each of them. So all in all a fairly bad problem, where we lost 3 copies of some of the data (and goal was 3). This was overnight, and I only saw the problem this morning. However, the missing data should have still been on the drives (kernel-panicked master, and failed HDDs on other machines).

So I have now done:
- Power off all machines
- Reseat all drives
- fsck all drives (no filesystem errors found)
- Restart the master, metalogger and chunkservers

The CGI is showing 44 chunks which have zero copies. (It also shows some chunks with 4 copies, and some chunks with 5 copies, which implies that the undergoal chunks were being replicated after the problem happened.)

My question is: why would there be any chunks with zero copies? No new files were being added or deleted; the metaloggers/masters would all have had the same data. The failed drives have started again, with no filesystem errors. Where are my missing chunks?

Any help would be appreciated
Chris |
From: Zachary W. <zw...@ha...> - 2011-12-19 16:10:46
|
Right now my school research group is testing the functionality of MFS. We are new to Linux and distributed file systems such as MFS. We have a basic test system of one master, one metalogger, 2 chunkservers, and 1 client machine. Regarding read/write times, so far the read times have been significantly faster than the write times for the 1 GB files we are testing. Is this normal? Is it because of caching? Zachary Wagner |
From: René P. <ly...@lu...> - 2011-12-16 17:26:26
|
Hello, Michał! I have an update on the metadata case.

On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said:
> …
> This is still too little information about how we could help you... What about metaloggers? Don't you have metadata_ml.mfs.back and changelog_ml.*.mfs files? You could put them on ftp as tar.gz and give us a link so that we try to recover them.

I will need some time to compile this information, because the servers in question are managed by different teams (long story) and the coordination did not run smoothly (i.e. the metalogger's logs might have been overwritten, because the MooseFS continued to be used after the master server crash). I searched through 129 GB of salvaged data from the master and found not a single file with a metadata signature.

> In "emergency" situations MooseFS tries to write metadata file on the master machine in these locations: … You can try to find them there, but as RAID went broken it is possible you won't see them.

Correct, we've not found any *emergency* files on the dump either. Since the state of the MooseFS moved back in time to 28 June 2011, we suspect that this is when the data corruption started, and since no warnings or errors were logged by the servers or the hardware controllers, no one noticed.

Best,
René. |
From: René P. <ly...@lu...> - 2011-12-16 17:19:03
|
On Dec 15, 2011 at 1800 -0500, Robert Sandilands appeared and said:
> Jumbo frames are more of a concern at a much lower level. There where you configure your switches, NIC's and/or OS. …

Make sure you check the size of jumbo frames your components support. The size unfortunately differs between vendors and equipment. Since MooseFS uses TCP for communication, I think you can mix equipment with different jumbo frame sizes (TCP negotiates the size supported by both endpoints).

Best,
René. |
From: Travis H. <tra...@tr...> - 2011-12-16 14:57:19
|
Do you mean migrate as in mark a disk for removal? I noticed that when I do that, the chunks get copied to another chunkserver, but they are not removed from the original disk that was marked for removal. What I do is look in the CGI monitor, on the info tab, select the "switch to regular" link to display chunk counts that do not include the disks marked for removal, and then wait until there are no more chunks missing there (in your case with goal zero). That is how I know the replication is done and it is safe to remove the disk marked for removal. But the chunks on that disk never get removed.

On 11-12-15 8:47 PM, cuidongping wrote:
> Hi,
> Here are the details: I have 2 chunkservers with the goal set to 1, and I started to migrate the data from chunkserver A to chunkserver B. I found that the chunks on A get copied to B, but are not deleted after the migration is done.
> Here is the question: the chunks on chunkserver A are supposed to be deleted once the migration is done, right? Or am I missing something? Please help me with this, thanks!
> Best Regards
> 崔东平 Jophy Cui |
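For reference, a minimal sketch of the mark-for-removal workflow discussed above, assuming a chunkserver whose data directory is /mnt/mfschunks2 and a config file at /etc/mfshdd.cfg (both paths are placeholders; the asterisk prefix is what marks a disk for removal):

    # on the chunkserver: prefix the disk entry in mfshdd.cfg with '*'
    #   before:  /mnt/mfschunks2
    #   after:   */mnt/mfschunks2
    sed -i 's|^/mnt/mfschunks2$|*/mnt/mfschunks2|' /etc/mfshdd.cfg

    # tell the chunkserver to re-read its configuration
    mfschunkserver reload

    # then watch the CGI monitor (or mfsfileinfo on affected files) until every
    # chunk has valid copies elsewhere, before physically pulling the disk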
From: cuidongping <cu...@ui...> - 2011-12-16 01:48:11
|
Hi,

Here are the details: I have 2 chunkservers with the goal set to 1, and I started to migrate the data from chunkserver A to chunkserver B. I found that the chunks on A get copied to B, but are not deleted after the migration is done.

Here is the question: the chunks on chunkserver A are supposed to be deleted once the migration is done, right? Or am I missing something? Please help me with this, thanks!

Best Regards
崔东平 Jophy Cui
System Integration & Testing
Mobile: +86 150 1358 2086 |
From: Robert S. <rsa...@ne...> - 2011-12-15 23:15:51
|
Jumbo frames are more of a concern at a much lower level: where you configure your switches, NICs and/or OS. From what I have seen, the master does a lot of small transfers which are significantly smaller than the normal frame size, so there is not much to be gained from using jumbo frames on the master. Larger transfers do happen between mfsmount and mfschunkserver, and you may see some benefit there. Daemons (like mfs) and other user-land applications generally don't know and don't care about the frame size.

Robert

On 12/15/11 5:07 PM, Zachary Wagner wrote:
> Hello,
> I've read that Jumbo Frames are supported with MFS, but are they automatically supported? Are they inherently fragmented, with some setting I need to activate? Or does MFS automatically accept them?
> Thank you,
> Zachary Wagner |
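A minimal sketch of where jumbo frames actually get enabled, assuming a Linux host with an interface named eth0 on a switch that already allows a 9000-byte MTU (interface name, MTU and host name are placeholders):

    # raise the MTU on the NIC (all hosts and switch ports on the path must agree)
    ip link set dev eth0 mtu 9000

    # verify end to end: 8972 = 9000 bytes minus 20 (IP header) and 8 (ICMP header)
    ping -M do -s 8972 -c 3 chunkserver1.example.com

MooseFS itself needs no setting for this; once the OS and network pass larger frames, the TCP connections between mfsmount, chunkservers and the master simply use them.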
From: Zachary W. <zw...@ha...> - 2011-12-15 22:38:26
|
Hello, I've read that Jumbo Frames are supported with MFS, but are they automatically supported? Are they inherently fragmented, with some setting I need to activate? Or does MFS automatically accept them? Thank you, Zachary Wagner |
From: wkmail <wk...@bn...> - 2011-12-15 02:24:45
|
Our experience has been uneven. We do use it, because it's convenient, and we regard these as 'emergency' copies only and use more traditional backups for archival work.

However, on large VMs (>10 GB) we saw a lot of stress put on the system, based on the scrolling timeouts that start occurring right after you do the snapshot and continue until the entire copy has been made. It's even worse if you have lots of deletions and replications going on at the same time and/or if the VM was busy. On our test system we could actually force VMs into read-only mode if we tried to stack up the snapshots, which is easy to fall into in a shell script because the snapshot process exits and the snapshot files are created right away (the actual copy occurs when things start to change). We ended up adding a line to sleep 1 minute for each GB of VM size before moving on to the next snapshot. Once we made that adjustment the system works, though a certain sluggishness is noted.

It would be nice if the actual snapshot data copy were optionally niced/delayed and made less intrusive, like we can do with replication and deletions. Of course with a VM you also have the consistency issue, but I assume you synced/froze/shut down and/or are OK dealing with the fsck/myisamchk of your snapshot copy if you have to take it live <grin>.

-bill

On 12/13/2011 1:02 PM, Elliot Finley wrote:
> Hello all,
> Has anyone used snapshots extensively? I'm just curious how robust they are. I have 50+ VMs running with their virtual disk on MFS. For backing up the VM images, I'd like to take a snapshot, copy it to a remote location and then delete the snapshot.
> Has anyone tried this or similar? If so, would you mind sharing your experience?
> Thanks,
> Elliot |
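A minimal sketch of the throttled snapshot loop described above, assuming VM images live under /mnt/mfs/vms and snapshots go to /mnt/mfs/snapshots (the paths and the *.img pattern are placeholders; the one-minute-per-GB pacing comes from the post):

    #!/bin/sh
    SRC=/mnt/mfs/vms
    DST=/mnt/mfs/snapshots/$(date +%Y%m%d)
    mkdir -p "$DST"

    for img in "$SRC"/*.img; do
        # lazy, copy-on-write snapshot inside the same MooseFS instance
        mfsmakesnapshot "$img" "$DST/"

        # pace the snapshots: roughly one minute of sleep per GB of image size,
        # so the background chunk duplication can keep up
        size_gb=$(( ($(stat -c %s "$img") / 1073741824) + 1 ))
        sleep $(( size_gb * 60 ))
    done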
From: Ricardo J. B. <ric...@da...> - 2011-12-14 21:35:19
|
On Wednesday, 14 December 2011, René Pfeiffer wrote:
> On Dec 14, 2011 at 1400 +0100, Michał Borychowski appeared and said:
> > No, but you can try with sth like:
> > "find . -type f | xargs mfsfilerepair"
>
> I did that, and we're currently assessing the file system. I asked for the recursive option, because it is easier for operators to deal with these situations. Also a "--dry-run" option would be nice (just as rsync has), so mfsfilerepair can generate a preview before an admin decides to run the actual repair task.

Exactly my concern. Whenever I get unavailable chunks (not many times, fortunately) and I don't know which files they belong to, I do this:

find /path/to/mfs/mount -type f | xargs mfsfileinfo > /tmp/mfsfileinfo.log
grep -B2 "no valid copies" /tmp/mfsfileinfo.log

And then I mfsfilerepair only those files that have chunks with invalid copies. But I don't know beforehand whether the missing chunk will be zeroed or restored from another (maybe older) version, so a --dry-run would be more than welcome :)

Regards,
Ricardo J. Barberis
Senior SysAdmin / ITI, Dattatec.com |
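A minimal sketch that wraps Ricardo's two commands into a report-then-repair helper, assuming the MooseFS mount point is passed as the first argument (the REPAIR switch is an illustration of the "dry run first" idea, not an existing mfsfilerepair option):

    #!/bin/sh
    # usage: ./check_chunks.sh /mnt/mfs [yes]
    MNT=${1:?mount point required}
    REPAIR=${2:-no}
    LOG=/tmp/mfsfileinfo.log

    # collect chunk info for every file, then list files with unrecoverable chunks
    find "$MNT" -type f -print0 | xargs -0 mfsfileinfo > "$LOG"
    grep -B2 "no valid copies" "$LOG" | grep '^/' | sed 's/:$//' | sort -u > /tmp/damaged-files.txt

    cat /tmp/damaged-files.txt   # review this list first: this is the "dry run"

    # only after reviewing, run the actual repair on the damaged files
    if [ "$REPAIR" = "yes" ]; then
        xargs -d '\n' mfsfilerepair < /tmp/damaged-files.txt
    fi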
From: Elliot F. <efi...@gm...> - 2011-12-14 20:55:20
|
Hello, I have 3555 blocks that are "goal of 0 but have 1 copy". They've been there for two days now. Usually when I see blocks that have a goal of zero, they go away within several minutes. Does anyone know what could cause this? Thanks, Elliot |
From: Michał B. <mic...@ge...> - 2011-12-14 14:07:20
|
At the beginning of the metadata file there is: "MFSM 1.5"

Kind regards
Michał Borychowski
MooseFS Support Manager
Gemius S.A., ul. Wołoska 7, 02-672 Warszawa, Budynek MARS, klatka D

-----Original Message-----
From: René Pfeiffer [mailto:ly...@lu...]
Sent: Wednesday, December 14, 2011 2:25 PM
Subject: Re: [Moosefs-users] Question about MooseFS metadata distribution/recovery
> Do the *.emergency files have a signature that can be used to recognise them? Since the JFS on the RAID1 volumes has massive metadata damage, we have to scan the content of every file to look for the *.emergency files. |
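A minimal sketch of how the salvaged files could be scanned for that signature, assuming the recovered data sits under /mnt/salvage (the path and output file are placeholders; the "MFSM 1.5" magic is taken from the reply above):

    # list every salvaged file whose first 8 bytes are the metadata magic "MFSM 1.5"
    find /mnt/salvage -type f | while read -r f; do
        if [ "$(head -c 8 "$f" 2>/dev/null)" = "MFSM 1.5" ]; then
            echo "$f"
        fi
    done > /tmp/metadata-candidates.txt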
From: René P. <ly...@lu...> - 2011-12-14 13:24:52
|
On Dec 14, 2011 at 1400 +0100, Michał Borychowski appeared and said:
> No, but you can try with sth like:
> "find . -type f | xargs mfsfilerepair"

I did that, and we're currently assessing the file system. I asked for the recursive option because it is easier for operators to deal with these situations. Also a "--dry-run" option would be nice (just as rsync has), so mfsfilerepair can generate a preview before an admin decides to run the actual repair task.

Do the *.emergency files have a signature that can be used to recognise them? Since the JFS on the RAID1 volumes has massive metadata damage, we have to scan the content of every file to look for the *.emergency files.

Best regards,
René. |
From: Steve <st...@bo...> - 2011-12-14 13:08:36
|
Hi,

Does anyone have a backup methodology and script to take the metalogger data off the MFS system, perhaps with several generations?

-------Original Message-------
From: Michał Borychowski
Date: 14/12/2011 08:39:11
To: 'René Pfeiffer'
Subject: Re: [Moosefs-users] Question about MooseFS metadata distribution/recovery
> This is still too little information about how we could help you... What about metaloggers? Don't you have metadata_ml.mfs.back and changelog_ml.*.mfs files? You could put them on FTP as tar.gz and give us a link so that we try to recover them.
> In "emergency" situations MooseFS tries to write the metadata file on the master machine in these locations:
> /metadata.mfs.emergency
> /tmp/metadata.mfs.emergency
> /var/metadata.mfs.emergency
> /usr/metadata.mfs.emergency
> /usr/share/metadata.mfs.emergency
> /usr/local/metadata.mfs.emergency
> /usr/local/var/metadata.mfs.emergency
> /usr/local/share/metadata.mfs.emergency
> You can try to find them there, but as the RAID went broken it is possible you won't see them. And it is impossible to recover metadata from the chunks themselves. But file data are in the chunks, so if you need to find something specific it would be possible. Each chunk has a 5 kB header (plain text), so a simple 'grep' would be enough to find what you are looking for.
> BTW, how much data was kept on this MooseFS installation? |
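Nothing official seems to exist for this; a minimal sketch of a generational metalogger backup, assuming the metalogger keeps its files in /var/lib/mfs and backups go to /backup/mfsmeta (both paths and the 14-generation retention are placeholders):

    #!/bin/sh
    # archive the metalogger's metadata_ml.mfs.back and changelog_ml.*.mfs files
    SRC=/var/lib/mfs
    DST=/backup/mfsmeta
    KEEP=14

    mkdir -p "$DST"
    tar czf "$DST/mfsmeta-$(date +%Y%m%d-%H%M%S).tar.gz" -C "$SRC" \
        $(cd "$SRC" && ls metadata_ml.mfs.back changelog_ml.*.mfs 2>/dev/null)

    # keep only the newest $KEEP generations
    ls -1t "$DST"/mfsmeta-*.tar.gz | tail -n +$((KEEP + 1)) | xargs -r rm -f

Run from cron on the metalogger host; restoring is then a matter of unpacking the newest archive and running mfsmetarestore against it.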
From: Michał B. <mic...@ge...> - 2011-12-14 13:00:59
|
No, but you can try with something like:

find . -type f | xargs mfsfilerepair

Kind regards
Michał Borychowski
MooseFS Support Manager
Gemius S.A., ul. Wołoska 7, 02-672 Warszawa, Budynek MARS, klatka D

-----Original Message-----
From: René Pfeiffer [mailto:ly...@lu...]
Sent: Wednesday, December 14, 2011 1:22 PM
Subject: Re: [Moosefs-users] Question about MooseFS metadata distribution/recovery
> Is there a way to run mfsfilerepair recursively? The man page did not show such an option. |
From: René P. <ly...@lu...> - 2011-12-14 12:22:25
|
On Dec 14, 2011 at 1159 +0100, Michał Borychowski appeared and said:
> If you have "invalid copies" you should use mfsfilerepair which will put in metadata proper file version. It's important that you have all chunkservers running while doing mfsfilerepair.

Is there a way to run mfsfilerepair recursively? The man page did not show such an option.

Best regards,
René. |
From: Sébastien M. <seb...@gm...> - 2011-12-14 11:25:22
|
Hi,

You are right, it's not POSIX, it's GNU:
<< On most systems, if a directory's set-group-ID bit is set, newly created subfiles inherit the same group as the directory, and newly created subdirectories inherit the set-group-ID bit of the parent directory >>
Extracted from: http://www.gnu.org/s/coreutils/manual/html_node/Directory-Setuid-and-Setgid.html

According to http://www.gnu.org/s/mailman/mailman-install/bsd-issues.html:
<< Also, the setgid bit is not necessary on BSD systems because group ownership is automatically inherited on files created in directories. >>

Can you make the test and tell me if it's true? So maybe a configure option could be a nice thing. I'm not convinced the setgid-bit inheritance is a security issue; do you have arguments?

Regards,
Sebastien

2011/12/14 Michał Borychowski <mic...@ge...>
> Hi!
> Unfortunately POSIX doesn't give any clear specification on this subject. MooseFS behaves in a way which is found in most other systems and, to be honest, is the safest one.
>
> For example on Mac OS X (HFS+) we have:
> (acid: </tmp/aqq>) $ mkdir dir1
> (acid: </tmp/aqq>) $ ls -ld dir1
> drwxr-xr-x 2 acid wheel 68 Dec 13 21:15 dir1
> (acid: </tmp/aqq>) $ chmod g+s dir1
> (acid: </tmp/aqq>) $ chgrp staff dir1
> (acid: </tmp/aqq>) $ ls -ld dir1
> drwxr-xr-x 2 acid staff 68 Dec 13 21:15 dir1
> (acid: </tmp/aqq>) $ cd dir1
> (acid: </tmp/aqq/dir1>) $ mkdir dir2
> (acid: </tmp/aqq/dir1>) $ ls -ld dir2
> drwxr-xr-x 2 acid staff 68 Dec 13 21:15 dir2
>
> And on FreeBSD 7.x (UFS) we have:
> [acid@fbsd7 /tmp/aqq]$ mkdir dir1
> [acid@fbsd7 /tmp/aqq]$ ls -ld dir1
> drwxr-xr-x 2 acid wheel 512 Dec 13 21:18 dir1
> [acid@fbsd7 /tmp/aqq]$ chmod g+s dir1
> [acid@fbsd7 /tmp/aqq]$ chgrp users dir1
> [acid@fbsd7 /tmp/aqq]$ ls -ld dir1
> drwxr-xr-x 2 acid users 512 Dec 13 21:18 dir1
> [acid@fbsd7 /tmp/aqq]$ cd dir1
> [acid@fbsd7 /tmp/aqq/dir1]$ mkdir dir2
> [acid@fbsd7 /tmp/aqq/dir1]$ ls -ld dir2
> drwxr-xr-x 2 acid users 512 Dec 13 21:18 dir2
>
> The behaviour of the sgid bit described in your email is probably only on Linux. In the future we could think of a "LINUX SUGID COMPATIBILITY" config option.
>
> Kind regards
> Michał Borychowski
> MooseFS Support Manager
> Gemius S.A., ul. Wołoska 7, 02-672 Warszawa, Budynek MARS, klatka D
>
> From: Sébastien Morand [mailto:seb...@gm...]
> Sent: Tuesday, December 13, 2011 7:42 PM
> To: moo...@li...
>
> Hi,
> I'm currently using mfs-1.6.20-2 and figured out that the setgid bit is not correctly handled.
> $ groups
> toto test
> $ cd $HOME
> $ mkdir dir1
> $ ls -ld dir1
> drwxr-xr-x 2 toto toto 4096 Dec 13 18:36 dir1
> $ chmod g+s dir1
> $ chgrp test dir1
> $ ls -ld dir1
> drwxr-xr-x 2 toto test 4096 Dec 13 18:36 dir1
> $ cd dir1
> $ mkdir dir2
> $ ls -ld dir2
> drwxr-xr-x 2 toto test 4096 Dec 13 18:36 dir2
>
> dir2 should have the setgid bit set; here is the expected result:
> $ ls -ld dir2
> drwxr-sr-x 2 toto test 4096 Dec 13 18:36 dir2
>
> I'm attaching the patch for interested people. Only the mfsmaster is concerned. Sorry if this has been corrected in a later version.
> Regards,
> Sebastien |
From: Michał B. <mic...@ge...> - 2011-12-14 10:59:43
|
If you have "invalid copies" you should use mfsfilerepair which will put in metadata proper file version. It's important that you have all chunkservers running while doing mfsfilerepair. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: René Pfeiffer [mailto:ly...@lu...] Sent: Wednesday, December 14, 2011 11:01 AM To: Michał Borychowski Cc: moo...@li... Subject: Re: [Moosefs-users] Question about MooseFS metadata distribution/recovery On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said: > Hi! > > This is still too little information about how we could help you... Sorry, I know, it's been a bit stressful since the server failure. > What about metaloggers? Don't you have metadata_ml.mfs.back and > changelog_ml.*.mfs files? Yes, I have, and we used the metalogger's data in order to set up a new master, but the access errors (invalid chunks and such) stayed. > You could put them on ftp as tar.gz and give us a link so that we try > to recover them. I will prepare an archive and make it available. > In "emergency" situations MooseFS tries to write metadata file on the master machine in these locations: > /metadata.mfs.emergency > /tmp/metadata.mfs.emergency > /var/metadata.mfs.emergency > /usr/metadata.mfs.emergency > /usr/share/metadata.mfs.emergency > /usr/local/metadata.mfs.emergency > /usr/local/var/metadata.mfs.emergency > /usr/local/share/metadata.mfs.emergency > > You can try to find them there, but as RAID went broken it is possible you won't see them. The data rescue company gave us the salvaged data and we found not a single *.emergency file on the dump. The file might have been renamed though. I suspect that the RAID controller suffered from a data corruption condition for weeks prior to completely failing. Not even the Linux kernel got sensible warnings from the block device. Since the RAID controller totally locked up the system I'm not sure if the master process was able to write the data to disk (especially because all storage containers were managed by the faulty RAID controller on this server). > And it is impossible to recover metadata from the chunks themselves. Yes, I know, but I just asked, because I thought that there's a way to scan the data on the chunkservers themselves. > But file data are in the chunks so if you need to find something > specific it would be possible. Each chunk has 5kb header (plain text) > so simple 'grep' would be enough to find what you are looking at. We mainly deal with image files (JPG and PNG). I guess we'll walk through the chunks this way. > BTW. How much data was kept on this MooseFS installation? About 7.7 GB, mostly image files and a few videos. The images are the most important data, but the application dealing with these files unfortunately used hashes and a drectory structure similar to the Squid proxy (multiple directories, path based on the hashes) to store the data. I'm not sure if the developers can reconstruct this information without the metadata. Best regards, René. -- )\._.,--....,'``. fL Let GNU/Linux work for you while you take a nap. /, _.. \ _\ (`._ ,. R. Pfeiffer <lynx at luchs.at> + http://web.luchs.at/ `._.-(,_..'--(,_..'`-.;.' - System administration + Consulting + Teaching - Got mail delivery problems? http://web.luchs.at/information/blockedmail.php |
From: René P. <ly...@lu...> - 2011-12-14 10:00:56
|
On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said:
> Hi!
> This is still too little information about how we could help you...

Sorry, I know, it's been a bit stressful since the server failure.

> What about metaloggers? Don't you have metadata_ml.mfs.back and changelog_ml.*.mfs files?

Yes, I have, and we used the metalogger's data in order to set up a new master, but the access errors (invalid chunks and such) stayed.

> You could put them on ftp as tar.gz and give us a link so that we try to recover them.

I will prepare an archive and make it available.

> In "emergency" situations MooseFS tries to write metadata file on the master machine in these locations:
> /metadata.mfs.emergency
> /tmp/metadata.mfs.emergency
> /var/metadata.mfs.emergency
> /usr/metadata.mfs.emergency
> /usr/share/metadata.mfs.emergency
> /usr/local/metadata.mfs.emergency
> /usr/local/var/metadata.mfs.emergency
> /usr/local/share/metadata.mfs.emergency
> You can try to find them there, but as RAID went broken it is possible you won't see them.

The data rescue company gave us the salvaged data and we found not a single *.emergency file on the dump. The file might have been renamed, though. I suspect that the RAID controller suffered from a data-corruption condition for weeks prior to completely failing; not even the Linux kernel got sensible warnings from the block device. Since the RAID controller totally locked up the system, I'm not sure the master process was able to write the data to disk (especially because all storage containers were managed by the faulty RAID controller on this server).

> And it is impossible to recover metadata from the chunks themselves.

Yes, I know, but I just asked because I thought there might be a way to scan the data on the chunkservers themselves.

> But file data are in the chunks so if you need to find something specific it would be possible. Each chunk has 5kb header (plain text) so simple 'grep' would be enough to find what you are looking at.

We mainly deal with image files (JPG and PNG). I guess we'll walk through the chunks this way.

> BTW. How much data was kept on this MooseFS installation?

About 7.7 GB, mostly image files and a few videos. The images are the most important data, but the application dealing with these files unfortunately used hashes and a directory structure similar to the Squid proxy (multiple directories, path based on the hashes) to store the data. I'm not sure if the developers can reconstruct this information without the metadata.

Best regards,
René. |
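A minimal sketch of the chunk-scanning idea discussed above, assuming chunk files live under /mnt/mfschunks on each chunkserver and follow the usual chunk_*.mfs naming (the path, the name pattern and the carving approach are assumptions; this only locates chunks that contain JPEG or PNG signatures, it does not reassemble multi-chunk files):

    # list chunk files that contain JPEG (JFIF/Exif) or PNG signatures
    find /mnt/mfschunks -type f -name 'chunk_*.mfs' | while read -r c; do
        if grep -q -m1 -a -e 'JFIF' -e 'Exif' -e 'PNG' "$c"; then
            echo "$c"
        fi
    done > /tmp/chunks-with-images.txt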