From: <leo...@ar...> - 2010-12-11 14:32:54

On Fri, 10 Dec 2010 09:16:56 -0700, Thomas S Hatch wrote:

> On Fri, Dec 10, 2010 at 1:37 AM, wrote:
>> Greetings!
>>
>> The questions are about the redundant master solution for MFS
>> (http://www.moosefs.org/mini-howtos.html [3]).
>>
>> 1) It is NOT CLEAR whether the passage "... would get two or three newest
>> "changelog" files from any chunkserver (just by using "scp"), would also
>> start mfsmetarestore and then mfsmaster..." relates to the case when we use
>> version 1.6.5 and above with mfsmetalogger... If yes, why do we need to
>> '... get two or three newest "changelog" files from any chunkserver (just
>> by using "scp")...' at all? Couldn't mfsmetalogger do it automagically by
>> its own means?
>
> You are right, the mfsmetalogger takes care of that now - BUT - the
> mfsmetalogger needed a number of fixes to reliably copy metadata; these will
> be in the next release, 1.6.18.

So, with version 1.6.17, for example, I must not do anything described above,
just run mfsmetalogger? (BTW, what does that imply: isn't metadata copying
reliable at the moment?)

>> 2) What about the case of the 'REAL MASTER' COMING UP?... Is this situation
>> handled automagically with mfsmetalogger, or does one need to do some
>> actions manually? If the latter is true, WHAT ARE THESE ACTIONS (and why
>> are they not documented)?
>
> And no, these actions are not documented and must be scripted into the
> ucarp sequence.

Yes, it is clear that we should use ucarp, vrrpd etc. But WHAT IS THE
SEQUENCE? Is it TO COPY METADATA FROM (THE ACTIVE / ONE OF?) THE FAILOVER
NODES AND RUN MFSMETARESTORE as described in the master recovery section of
the docs? It would be great if mfs used a true multimaster metadata storage.
Did anybody think about that?...

> I have been actively working on the ucarp interface for MooseFS failover and
> have it just about where I want it. I have been failing over a 140T MooseFS
> install with very high traffic. As for the scripts and interface to enable
> this failover, I will have them bundled together shortly; I have just been
> far too busy over the last few weeks.
>
> -Tom

Links:
------
[1] http://leosat.it
[2] http://ariel.ru
[3] http://www.moosefs.org/mini-howtos.html
[4] mailto:moo...@li...
[5] https://lists.sourceforge.net/lists/listinfo/moosefs-users

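To make the sequence asked about above concrete, here is a rough sketch of
what a ucarp "vip-up" action on the standby (metalogger) node could do. This
is not a documented procedure; the /var/lib/mfs data path and the metalogger
file names are assumptions based on a default 1.6.x install.

    # Sketch of a failover action run on the standby node once ucarp has
    # moved the service IP here. Paths and file names are assumptions; adjust.
    mfsmetalogger stop            # stop appending to the local changelogs

    # Rebuild a current metadata.mfs from the metalogger's local copies.
    cd /var/lib/mfs
    mfsmetarestore -m metadata_ml.mfs.back -o metadata.mfs changelog_ml.*.mfs

    # Start the master on the rebuilt metadata.
    mfsmaster start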
From: Thomas S H. <tha...@gm...> - 2010-12-11 06:13:29

I am a Sr. engineer at Beyond Oblivion; we are using Moose in a 140T and
growing setup. We are using KVM for virtual machines and the performance is
adequate. We also tried a number of other distributed filesystems, and Moose
was by far the best option. For what it is worth, glusterfs was a disaster
(we saw rampant file corruption, and the distribution of files was
inconsistent), and unfortunately ceph is not ready yet; I figure Moose will be
giving it a run for its money when it is.

-Thomas S Hatch

On Fri, Dec 10, 2010 at 2:57 PM, Jun Cheol Park <jun...@gm...> wrote:
> One more question:
> I am planning how to use KVM on MFS. Is there any use example of this
> combination?
>
> Thanks,
>
> -Jun
>
> On Fri, Dec 10, 2010 at 2:33 PM, Jun Cheol Park <jun...@gm...> wrote:
>> Hi,
>>
>> I would like to know how many use examples of MFS are in real
>> production so far. And also wondering how big they are.
>>
>> Is there anyone who can give me comments on how substantially reliable
>> MFS is for production?
>>
>> Thanks in advance,
>>
>> -Jun

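For readers looking for a concrete "KVM on MFS" example like the one asked
for above: the simplest arrangement is to mount MooseFS on the hypervisor and
keep the guest disk images on it. A minimal sketch follows; the mount point,
master hostname, image directory and goal value are illustrative assumptions,
not details taken from this thread.

    # Mount MooseFS on the KVM host.
    mkdir -p /mnt/mfs
    mfsmount /mnt/mfs -H mfsmaster

    # Keep two copies of every chunk under the image directory.
    mkdir -p /mnt/mfs/vm-images
    mfssetgoal -r 2 /mnt/mfs/vm-images

    # Create a guest disk image there and point KVM/libvirt at it.
    qemu-img create -f qcow2 /mnt/mfs/vm-images/guest01.qcow2 20G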
From: Anh K. H. <ky...@vi...> - 2010-12-11 00:03:03

On Fri, 10 Dec 2010 14:33:32 -0700 Jun Cheol Park <jun...@gm...> wrote:
> Hi,
>
> I would like to know how many use examples of MFS are in real
> production so far. And also wondering how big they are.
>
> Is there anyone who can give me comments on how substantially
> reliable MFS is for production?

http://sourceforge.net/mailarchive/message.php?msg_id=26166132

There's also an example in the FAQ.

> ...

--
Anh Ky Huynh at UTC+7

From: Jun C. P. <jun...@gm...> - 2010-12-10 21:57:13

One more question:
I am planning how to use KVM on MFS. Is there any use example of this
combination?

Thanks,

-Jun

On Fri, Dec 10, 2010 at 2:33 PM, Jun Cheol Park <jun...@gm...> wrote:
> Hi,
>
> I would like to know how many use examples of MFS are in real
> production so far. And also wondering how big they are.
>
> Is there anyone who can give me comments on how substantially reliable
> MFS is for production?
>
> Thanks in advance,
>
> -Jun

From: Jun C. P. <jun...@gm...> - 2010-12-10 21:33:39

Hi,

I would like to know how many use examples of MFS are in real production so
far. And also wondering how big they are.

Is there anyone who can give me comments on how substantially reliable MFS is
for production?

Thanks in advance,

-Jun

From: Thomas S H. <tha...@gm...> - 2010-12-10 16:17:03

On Fri, Dec 10, 2010 at 1:37 AM, <leo...@ar...> wrote:
> Greetings!
>
> The questions are about the redundant master solution for MFS
> (http://www.moosefs.org/mini-howtos.html).
>
> 1) It is *not clear* whether the passage "... would get two or three newest
> "changelog" files from any chunkserver (just by using "scp"), would also
> start mfsmetarestore and then mfsmaster..." relates to the case when we use
> version 1.6.5 and above with mfsmetalogger... If yes, why do we need to
> '... get two or three newest "changelog" files from any chunkserver (just by
> using "scp")...' at all? Couldn't mfsmetalogger do it automagically by its
> own means?

You are right, the mfsmetalogger takes care of that now - BUT - the
mfsmetalogger needed a number of fixes to reliably copy metadata; these will
be in the next release, 1.6.18.

> 2) What about the case of the *'real master' coming up*?... Is this
> situation handled automagically with mfsmetalogger, or does one need to do
> some actions manually? If the latter is true, *what are these actions* (and
> why are they not documented)?

And no, these actions are not documented and must be scripted into the ucarp
sequence.

I have been actively working on the ucarp interface for MooseFS failover and
have it just about where I want it. I have been failing over a 140T MooseFS
install with very high traffic. As for the scripts and interface to enable
this failover, I will have them bundled together shortly; I have just been far
too busy over the last few weeks.

-Tom

From: Thomas S H. <tha...@gm...> - 2010-12-10 16:04:46

Thanks guys! That's what I needed to know. That also helps, since some of our
chunkservers have more space than others. I will continue to wait, then, and
continue to trust the moose!

-Tom

2010/12/10 Michał Borychowski <mic...@ge...>
> But please remember that this process takes quite a long time (it may be
> 2-3 weeks). Performance is the most important thing for the system, not
> rebalancing.
>
> Regards
> Michał
>
> -----Original Message-----
> From: Laurent Wandrebeck [mailto:lw...@hy...]
> Sent: Friday, December 10, 2010 10:43 AM
> To: moo...@li...
> Subject: Re: [Moosefs-users] Balance chunks
>
> On Thu, 9 Dec 2010 12:56:09 -0700 Thomas S Hatch <tha...@gm...> wrote:
>
>> I am wondering, I have a large moosefs setup right now with over 140T,
>> but we built the chunkservers out one at a time and the file load is
>> very different on all the servers. I have some servers using 45% of
>> their disk space and some using above 90%.
>>
>> This of course happened because we were running on a few servers,
>> added a lot of files and then added more servers. I am just curious,
>> as I have been unable to find this in the docs anywhere, but is it
>> possible to tell the mfsmaster to start moving chunks around for a
>> more even distribution of files in this case? Or should it just even out
>> over time?
>>
>> -Tom Hatch
>
> That's pretty strange; mfs balances data load by default. If your fs is 80%
> full, your (say) 3 chunkservers should each be about 80% full. Try
> restarting the mfschunkserver service on a lightly loaded chunkserver?
> There's no way AFAIK to tell mfsmaster to balance; it does this by default.
> HTH,
> --
> Laurent Wandrebeck
> HYGEOS, Earth Observation Department / Observation de la Terre
> Euratechnologies
> 165 Avenue de Bretagne
> 59000 Lille, France
> tel: +33 3 20 08 24 98
> http://www.hygeos.com
> GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C

From: Michał B. <mic...@ge...> - 2010-12-10 10:14:26

But please remember that this process takes quite a long time (it may be 2-3
weeks). Performance is the most important thing for the system, not
rebalancing.

Regards
Michał

-----Original Message-----
From: Laurent Wandrebeck [mailto:lw...@hy...]
Sent: Friday, December 10, 2010 10:43 AM
To: moo...@li...
Subject: Re: [Moosefs-users] Balance chunks

On Thu, 9 Dec 2010 12:56:09 -0700 Thomas S Hatch <tha...@gm...> wrote:

> I am wondering, I have a large moosefs setup right now with over 140T,
> but we built the chunkservers out one at a time and the file load is
> very different on all the servers. I have some servers using 45% of
> their disk space and some using above 90%.
>
> This of course happened because we were running on a few servers,
> added a lot of files and then added more servers. I am just curious,
> as I have been unable to find this in the docs anywhere, but is it
> possible to tell the mfsmaster to start moving chunks around for a
> more even distribution of files in this case? Or should it just even out
> over time?
>
> -Tom Hatch

That's pretty strange; mfs balances data load by default. If your fs is 80%
full, your (say) 3 chunkservers should each be about 80% full. Try restarting
the mfschunkserver service on a lightly loaded chunkserver? There's no way
AFAIK to tell mfsmaster to balance; it does this by default.
HTH,
--
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C

From: Laurent W. <lw...@hy...> - 2010-12-10 09:50:54

On Fri, 10 Dec 2010 15:18:07 +0800 kuer ku <ku...@gm...> wrote:
> Hi, all,
>
> In my mfs environment, I found there are many 'reserved' files. From the
> manual, I learned that they are files that are used (opened) by some
> processes.
>
> So I shut down all my app processes, but those files are still there. I do
> not know how to handle them.
>
> reserved]# ls -l
> total 434940601
> -rw-rw-r-- 1 search search 1073754587 Nov  9 21:42 0000002D|store|data|g4|data.0000786
> -rw-rw-r-- 1 search search 1073744785 Nov  8 00:24 0000003A|store|data|g5|data.0000765
> -rw-rw-r-- 1 search search 1073744393 Nov  2 21:09 00000067|store|data|g4|data.0000707
> -rw-rw-r-- 1 search search 1073759773 Nov  8 12:35 00000073|store|data|g5|data.0000772
> -rw-rw-r-- 1 search search 1073745982 Nov  3 06:35 00000079|store|data|g6|data.0000714
> -rw-rw-r-- 1 search search 1073744685 Nov  5 20:55 000000A5|store|data|g6|data.0000743
> -rw-rw-r-- 1 search search 1073743131 Nov  6 21:56 000000F6|store|data|g5|data.0000754
> -rw-rw-r-- 1 search search 1073748960 Nov  8 19:51 00000121|store|data|g7|data.0000677
> -rw-rw-r-- 1 search search 1073755075 Nov  3 16:01 000001EA|store|data|g4|data.0000718
> -rw-rw-r-- 1 search search 1073758969 Nov 10 05:43 00000208|store|data|g5|data.0000792
> -rw-rw-r-- 1 search search 1073742040 Nov  2 00:06 00000299|store|data|g6|data.0000695
> -rw-rw-r-- 1 search search 1073752202 Nov 11 10:53 000002DE|store|data|g5|data.0000806
>
> Some files are very old (about 1 month old). The corresponding files are
> gone, so how do I handle these "reserved" files?
>
> Can someone kindly help me out?
>
> Thanks
>
> -- kuer

I've had the same problem with a couple of files. The only way to get rid of
these is to unmount and remount the mfs volume on the boxes that created these
reserved files. They will then get classically deleted by mfsmaster itself.
HTH,
--
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C

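A sketch of the unmount/remount Laurent describes, run on each client that
still holds the files open; the mount point and master hostname are
assumptions.

    # Remount the MooseFS volume so the client drops its stale open handles.
    umount /mnt/mfs              # or: fusermount -u /mnt/mfs
    mfsmount /mnt/mfs -H mfsmaster

    # The entries under the "reserved" directory of the meta mount should
    # then be released and removed by the master on its own.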
From: Laurent W. <lw...@hy...> - 2010-12-10 09:48:27

On Fri, 10 Dec 2010 14:58:29 +0800 dft3000 <df...@gm...> wrote:
> My mfs shows the following errors on the client:
>
> Dec 10 12:46:51 www mfsmount[2627]: file: 941671, index: 0, chunk: 3075082, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
> Dec 10 12:50:35 www mfsmount[2627]: file: 1657763, index: 0, chunk: 5236293, version: 1 - writeworker: connection with (DEDE2069:9422) was timed out (unfinished writes: 2; try counter: 1)
> Dec 10 12:50:36 www mfsmount[2627]: file: 1657762, index: 0, chunk: 5236292, version: 1 - writeworker: connection with (DEDE2069:9422) was timed out (unfinished writes: 2; try counter: 1)
> Dec 10 12:50:36 www mfsmount[2627]: file: 1657761, index: 0, chunk: 5236291, version: 1 - writeworker: connection with (DEDE206A:9422) was timed out (unfinished writes: 2; try counter: 1)
> Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
> Dec 10 13:04:27 www mfsmount[2627]: file: 447712, index: 0, chunk: 443587, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
> Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
> Dec 10 13:04:27 www mfsmount[2627]: file: 447735, index: 0, chunk: 443610, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
> Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
> Dec 10 13:04:27 www mfsmount[2627]: file: 447905, index: 0, chunk: 443780, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
> Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
> Dec 10 13:04:27 www mfsmount[2627]: file: 447808, index: 0, chunk: 443683, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
>
> Who can help me?
> Thanks

It looks like you have a quite severe disk problem (readblock error) on
DEDE2069, followed by a DEDE2069 crash ("connection with (DEDE2069:9422) was
timed out" plus "readblock; tcpread error: Connection timed out"), or your
network went down, or something similar. Did you find anything in the
DEDE2069 logs? And check the DEDE206A logs too.
HTH,
--
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C

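A few commands one might run on the chunkserver named in the log (DEDE2069
above) to narrow this down. The log path matches a Debian/Ubuntu layout
(use /var/log/messages on CentOS) and the device name is an assumption.

    # Look for chunkserver and kernel I/O errors around the same timestamps.
    grep -i mfschunkserver /var/log/syslog | tail -n 100
    dmesg | grep -iE 'i/o error|ata[0-9]|sd[a-z]' | tail -n 50

    # Check the health of the disk that holds the chunk directory.
    smartctl -H /dev/sdb
    smartctl -A /dev/sdb | grep -iE 'reallocated|pending|uncorrect'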
From: Laurent W. <lw...@hy...> - 2010-12-10 09:43:33

On Thu, 9 Dec 2010 12:56:09 -0700 Thomas S Hatch <tha...@gm...> wrote:
> I am wondering, I have a large moosefs setup right now with over 140T, but
> we built the chunkservers out one at a time and the file load is very
> different on all the servers. I have some servers using 45% of their disk
> space and some using above 90%.
>
> This of course happened because we were running on a few servers, added a
> lot of files and then added more servers. I am just curious, as I have been
> unable to find this in the docs anywhere, but is it possible to tell the
> mfsmaster to start moving chunks around for a more even distribution of
> files in this case? Or should it just even out over time?
>
> -Tom Hatch

That's pretty strange; mfs balances data load by default. If your fs is 80%
full, your (say) 3 chunkservers should each be about 80% full. Try restarting
the mfschunkserver service on a lightly loaded chunkserver? There's no way
AFAIK to tell mfsmaster to balance; it does this by default.
HTH,
--
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C

From: <leo...@ar...> - 2010-12-10 08:36:51

Greetings!

The questions are about the redundant master solution for MFS
(http://www.moosefs.org/mini-howtos.html).

1) It is NOT CLEAR whether the passage "... would get two or three newest
"changelog" files from any chunkserver (just by using "scp"), would also start
mfsmetarestore and then mfsmaster..." relates to the case when we use version
1.6.5 and above with mfsmetalogger... If yes, why do we need to '... get two
or three newest "changelog" files from any chunkserver (just by using
"scp")...' at all? Couldn't mfsmetalogger do it automagically by its own
means?

2) What about the case of the 'REAL MASTER' COMING UP?... Is this situation
handled automagically with mfsmetalogger, or does one need to do some actions
manually? If the latter is true, WHAT ARE THESE ACTIONS (and why are they not
documented)?

From: kuer ku <ku...@gm...> - 2010-12-10 07:18:13

Hi, all,

In my mfs environment, I found there are many 'reserved' files. From the
manual, I learned that they are files that are used (opened) by some
processes.

So I shut down all my app processes, but those files are still there. I do not
know how to handle them.

reserved]# ls -l
total 434940601
-rw-rw-r-- 1 search search 1073754587 Nov  9 21:42 0000002D|store|data|g4|data.0000786
-rw-rw-r-- 1 search search 1073744785 Nov  8 00:24 0000003A|store|data|g5|data.0000765
-rw-rw-r-- 1 search search 1073744393 Nov  2 21:09 00000067|store|data|g4|data.0000707
-rw-rw-r-- 1 search search 1073759773 Nov  8 12:35 00000073|store|data|g5|data.0000772
-rw-rw-r-- 1 search search 1073745982 Nov  3 06:35 00000079|store|data|g6|data.0000714
-rw-rw-r-- 1 search search 1073744685 Nov  5 20:55 000000A5|store|data|g6|data.0000743
-rw-rw-r-- 1 search search 1073743131 Nov  6 21:56 000000F6|store|data|g5|data.0000754
-rw-rw-r-- 1 search search 1073748960 Nov  8 19:51 00000121|store|data|g7|data.0000677
-rw-rw-r-- 1 search search 1073755075 Nov  3 16:01 000001EA|store|data|g4|data.0000718
-rw-rw-r-- 1 search search 1073758969 Nov 10 05:43 00000208|store|data|g5|data.0000792
-rw-rw-r-- 1 search search 1073742040 Nov  2 00:06 00000299|store|data|g6|data.0000695
-rw-rw-r-- 1 search search 1073752202 Nov 11 10:53 000002DE|store|data|g5|data.0000806

Some files are very old (about 1 month old). The corresponding files are gone,
so how do I handle these "reserved" files?

Can someone kindly help me out?

Thanks

-- kuer

From: dft3000 <df...@gm...> - 2010-12-10 06:58:35

My mfs shows the following errors on the client:

Dec 10 12:46:51 www mfsmount[2627]: file: 941671, index: 0, chunk: 3075082, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
Dec 10 12:50:35 www mfsmount[2627]: file: 1657763, index: 0, chunk: 5236293, version: 1 - writeworker: connection with (DEDE2069:9422) was timed out (unfinished writes: 2; try counter: 1)
Dec 10 12:50:36 www mfsmount[2627]: file: 1657762, index: 0, chunk: 5236292, version: 1 - writeworker: connection with (DEDE2069:9422) was timed out (unfinished writes: 2; try counter: 1)
Dec 10 12:50:36 www mfsmount[2627]: file: 1657761, index: 0, chunk: 5236291, version: 1 - writeworker: connection with (DEDE206A:9422) was timed out (unfinished writes: 2; try counter: 1)
Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
Dec 10 13:04:27 www mfsmount[2627]: file: 447712, index: 0, chunk: 443587, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
Dec 10 13:04:27 www mfsmount[2627]: file: 447735, index: 0, chunk: 443610, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
Dec 10 13:04:27 www mfsmount[2627]: file: 447905, index: 0, chunk: 443780, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)
Dec 10 13:04:27 www mfsmount[2627]: readblock; tcpread error: Connection timed out
Dec 10 13:04:27 www mfsmount[2627]: file: 447808, index: 0, chunk: 443683, version: 1, cs: DEDE2069:9422 - readblock error (try counter: 1)

Who can help me?
Thanks

From: Thomas S H. <tha...@gm...> - 2010-12-09 19:56:10

I am wondering: I have a large moosefs setup right now with over 140T, but we
built the chunkservers out one at a time and the file load is very different
on all the servers. I have some servers using 45% of their disk space and some
using above 90%.

This of course happened because we were running on a few servers, added a lot
of files and then added more servers. I am just curious, as I have been unable
to find this in the docs anywhere, but is it possible to tell the mfsmaster to
start moving chunks around for a more even distribution of files in this case?
Or should it just even out over time?

-Tom Hatch

From: Thomas S H. <tha...@gm...> - 2010-12-07 15:35:36

Thanks Michal, will do!

2010/12/7 Michał Borychowski <mic...@ge...>
> Hi Thomas!
>
> These errors were caused by a "disconnected" hdd. If you looked in the cgi
> monitor you would see a disk with "damaged" status.
>
> The strange thing is that this bad chunk was retested after 10 seconds. It
> should have been removed after the first test. And unfortunately in this
> case these errors caused MooseFS to mark the hdd as damaged. But this was a
> "logical" error, not a physical one. Probably you should run "fsck" on this
> hard drive. On the other hand, we will make a patch so that the system
> doesn't test the same chunk in a loop.
>
> Kind regards
> Michal
>
> *From:* Thomas S Hatch [mailto:tha...@gm...]
> *Sent:* Friday, December 03, 2010 6:05 PM
> *To:* moosefs-users
> *Subject:* [Moosefs-users] Errors and then "crash"
>
> This is the second time a chunkserver has issued this type of failure in
> our environment. After giving this log message the chunkserver does not
> crash, but all files on the chunk become unavailable and it shows 0% usage
> on the mfsmaster:
>
> Dec 3 14:58:08 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/6C/chunk_000000000008F96C_00000001.mfs
> Dec 3 14:58:18 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
> Dec 3 14:58:18 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
> Dec 3 14:58:18 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
> Dec 3 14:58:28 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
> Dec 3 14:58:28 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
> Dec 3 14:58:28 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
> Dec 3 14:58:38 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
> Dec 3 14:58:38 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
> Dec 3 14:58:38 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
> Dec 3 14:58:38 localhost mfschunkserver[6969]: 3 errors occurred in 60 seconds on folder: /mnt/moose1/
> Dec 3 14:58:39 localhost mfschunkserver[6969]: replicator: hdd_create status: 21
>
> I am running the prerelease of 1.6.18 on Ubuntu 10.04.
>
> After restarting the chunkserver everything comes back online without
> problems.
>
> Any ideas as to what could be causing this?
>
> -Tom Hatch

From: Michał B. <mic...@ge...> - 2010-12-07 09:53:17

Hi Ólafur!

Thank you for your tip! We added this information to our FAQ entry:
http://www.moosefs.org/moosefs-faq.html#mtu

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

-----Original Message-----
From: Ólafur Ósvaldsson [mailto:osv...@ne...]
Sent: Wednesday, December 01, 2010 11:41 AM
To: moo...@li...
Subject: [Moosefs-users] MooseFS and Generic Segmentation Offload in CentOS

Hi,

I just wanted to give a heads up in case anyone runs into the same problems as
we did here. Our new setup is like this:

1 x Master (Ubuntu 10.04)
1 x Metalogger (Ubuntu 10.04)
10 x Chunkservers (Ubuntu 10.04)

And then we have, for testing, 8 CentOS 5.5 servers using the MFS. All Ubuntu
servers have Broadcom BCM5708 network cards and the CentOS blades have
BCM5709, and this is all connected through Cisco 3560 and 2960 switches.

Everything was working fine after the initial install. Then I enabled jumbo
frames on the switches and changed the MTU on all the servers to 9000, and
from that point we could only get a listing of files: we were unable to read
the contents of large files on MFS mounts from the CentOS machines, but if the
filesystem was mounted from the Master it worked fine. Direct communication
seemed to work fine; ping with different packet sizes, ssh and other services
did not fail.

After a couple of days looking at this I found that the operating systems have
different driver settings for at least Broadcom cards: while Ubuntu has a
switch called "generic segmentation offload" set to on by default, CentOS has
the switch set to off. By changing this setting on the CentOS machines it now
runs fine, just like before.

To set it you use "ethtool -K gso on".

/Oli
--
Ólafur Osvaldsson
System Administrator
Nethonnun ehf.
e-mail: osv...@ne...
phone: +354 517 3418

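For reference, the ethtool invocation also takes the interface name, which the
quoted command above omits; eth0 here is only a placeholder for whichever
interface carries the MooseFS traffic.

    # Check the current offload settings on the interface.
    ethtool -k eth0 | grep generic-segmentation-offload

    # Enable generic segmentation offload, as described in the FAQ entry.
    ethtool -K eth0 gso on

    # Jumbo frames also have to be enabled end to end on every hop.
    ip link set dev eth0 mtu 9000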
From: Michał B. <mic...@ge...> - 2010-12-07 09:52:46

Hi!

At the moment file locking works only "locally". So if you lock a file on one
client (machine A), its kernel remembers that it is locked, but another client
(machine B) doesn't know about it. Global locking should be introduced in the
1.7 branch, but I cannot tell exactly when that happens.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

-----Original Message-----
From: Leonid Satanovsky [mailto:leo...@ar...]
Sent: Thursday, December 02, 2010 4:43 PM
To: moo...@li...
Subject: [Moosefs-users] flock, fcntl file locking, Cyrus IMAP and MooseFS.

Greetings!

The question is: can we use MooseFS as a storage for the Cyrus IMAP server?
Its docs say that it uses file locking extensively (through the flock and/or
fcntl system calls). As I understand it, this is just a matter of MooseFS
support for these locking mechanisms. Is it already available, and if not, for
which release is it planned?

Best regards,
Leonid.

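A quick way to observe the behaviour Michał describes is to take an advisory
lock on the same MooseFS file from two different client machines; with locking
handled only locally, the second lock is granted even though the first is
still held. The file path is an assumption for illustration.

    # On client A: create the file and hold an exclusive lock for 60 seconds.
    touch /mnt/mfs/locktest
    flock -x /mnt/mfs/locktest -c 'echo "A holds the lock"; sleep 60'

    # On client B, while A still holds the lock: -n makes flock fail instead
    # of waiting. On a local filesystem this prints "busy"; with locks kept
    # only in client A's kernel, it succeeds even though A holds the lock.
    flock -n -x /mnt/mfs/locktest -c 'echo "got the lock"' || echo busy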
From: Michał B. <mic...@ge...> - 2010-12-07 09:04:46

Hi!

Thank you for your interest in MooseFS! Writing of each chunk is handled
independently according to this schema. So if a file is bigger than 64MB,
there are two or more such processes working independently: one finishes
writing the chunk at position 0 (0B - 64MB) and the second starts writing at
position 1 (64MB - 128MB). It is somewhat as if two different 64MB files were
written.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

From: lingjie [mailto:li...@gy...]
Sent: Monday, December 06, 2010 8:14 AM
To: moo...@li...
Subject: [Moosefs-users] A question about Mfs's Write process

Dear friend:

First, I'm Chinese and my English is just so-so; I hope you understand. When I
browsed http://www.moosefs.org I discovered a picture: a sketch of the MooseFS
write process. Now I have a question about it.

When the Master receives a request from a Client, it assigns a Chunk (server)
for the Client to connect to, and over that second connection the Client
writes the data to the Chunk. In MFS, 64MB is used as a hard-coded chunk size.
Now my question: if the data is larger than 64MB, the data will be split, and
there is a difference between two possibilities:

1. Once the connection between the Client and the Chunk is established, it
stays up until the data transmission is finished.

2. The connection between the Client and the Chunk is established, and when
the data reaches 64MB the connection is closed; the Client then contacts the
Master again to recalculate and be assigned a new Chunk, which may be the
previous one or a new one.

That is my question; I hope someone can help me. If my description is not
clear, please contact me.

MSN: sev...@ho...
Email: li...@gy...

Thanks!

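To make the chunk arithmetic in the answer above concrete: the client maps a
file offset to a chunk index by dividing by the fixed 64 MiB chunk size, and
a write that crosses a 64 MiB boundary simply continues in the next chunk as
an independent operation. A small sketch (plain shell arithmetic, nothing
MooseFS-specific; the example offset is arbitrary):

    # Which chunk does a given byte offset of a file fall into?
    CHUNK_SIZE=$((64 * 1024 * 1024))      # 64 MiB, fixed in MooseFS
    OFFSET=200000000                      # example offset, roughly 190 MiB

    CHUNK_INDEX=$(( OFFSET / CHUNK_SIZE ))       # -> 2, i.e. the third chunk
    OFFSET_IN_CHUNK=$(( OFFSET % CHUNK_SIZE ))   # position inside that chunk
    echo "offset $OFFSET -> chunk $CHUNK_INDEX, chunk offset $OFFSET_IN_CHUNK"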
From: Michał B. <mic...@ge...> - 2010-12-07 08:53:25

Hi Thomas!

These errors were caused by a "disconnected" hdd. If you looked in the cgi
monitor you would see a disk with "damaged" status.

The strange thing is that this bad chunk was retested after 10 seconds. It
should have been removed after the first test. And unfortunately in this case
these errors caused MooseFS to mark the hdd as damaged. But this was a
"logical" error, not a physical one. Probably you should run "fsck" on this
hard drive. On the other hand, we will make a patch so that the system doesn't
test the same chunk in a loop.

Kind regards
Michal

From: Thomas S Hatch [mailto:tha...@gm...]
Sent: Friday, December 03, 2010 6:05 PM
To: moosefs-users
Subject: [Moosefs-users] Errors and then "crash"

This is the second time a chunkserver has issued this type of failure in our
environment. After giving this log message the chunkserver does not crash, but
all files on the chunk become unavailable and it shows 0% usage on the
mfsmaster:

Dec 3 14:58:08 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/6C/chunk_000000000008F96C_00000001.mfs
Dec 3 14:58:18 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
Dec 3 14:58:18 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
Dec 3 14:58:18 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
Dec 3 14:58:28 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
Dec 3 14:58:28 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
Dec 3 14:58:28 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
Dec 3 14:58:38 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
Dec 3 14:58:38 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
Dec 3 14:58:38 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
Dec 3 14:58:38 localhost mfschunkserver[6969]: 3 errors occurred in 60 seconds on folder: /mnt/moose1/
Dec 3 14:58:39 localhost mfschunkserver[6969]: replicator: hdd_create status: 21

I am running the prerelease of 1.6.18 on Ubuntu 10.04.

After restarting the chunkserver everything comes back online without
problems.

Any ideas as to what could be causing this?

-Tom Hatch

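A sketch of the fsck Michał suggests, run on the affected chunkserver; the
mount point matches the log above, while the device name is an assumption and
remounting relies on an fstab entry.

    # Take the chunkserver offline first so no chunks are being written.
    mfschunkserver stop

    # Unmount the affected chunk directory and check the filesystem on it.
    umount /mnt/moose1
    fsck -f /dev/sdb1        # device backing /mnt/moose1; adjust to your layout
    mount /mnt/moose1        # assumes /mnt/moose1 is listed in /etc/fstab

    mfschunkserver start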
From: Ioannis A. <ias...@fl...> - 2010-12-06 21:07:30

You can always use my software for that...

pylsyncd: http://www.deathwing00.org/wordpress/?page_id=199

On Fri, Dec 3, 2010 at 8:14 PM, jose maria <let...@us...> wrote:
> On Fri, 03-12-2010 at 18:44 +0100, Ioannis Aslanidis wrote:
>> Hello,
>>
>> Any updates on this feature? Do you think it'll be ready soon?
>>
>> Best regards.
>
> * Until the needed feature arrives ("Goal per Rack", which is not foreseen;
> not "Location awareness")... Pater Noster qui es in Polonia...
> You need N independent clusters, and rsync or similar is the only pseudo
> solution: PYrsyncD ("Python Inotify Rsync Daemon") or another tool, if you
> want it in pseudo real-time after inotify events.

--
Ioannis Aslanidis
System and Network Administrator
Flumotion Services, S.A.
E-Mail: iaslanidis at flumotion dot com
Office Phone: +34 93 508 63 59
Mobile Phone: +34 672 20 45 75

From: jose m. <let...@us...> - 2010-12-03 19:14:43

On Fri, 03-12-2010 at 18:44 +0100, Ioannis Aslanidis wrote:
> Hello,
>
> Any updates on this feature? Do you think it'll be ready soon?
>
> Best regards.

* Until the needed feature arrives ("Goal per Rack", which is not foreseen;
not "Location awareness")... Pater Noster qui es in Polonia...
You need N independent clusters, and rsync or similar is the only pseudo
solution: PYrsyncD ("Python Inotify Rsync Daemon") or another tool, if you
want it in pseudo real-time after inotify events.

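A minimal version of the rsync-after-inotify approach mentioned above (the
idea that pylsyncd automates), assuming inotify-tools is installed and the
second cluster is mounted locally; both mount points are assumptions, and a
full-tree rsync per event burst is deliberately crude.

    # Watch the primary MooseFS mount; after each burst of events, mirror the
    # tree to the secondary cluster. Recursive watches on huge trees are
    # expensive, so this is only a sketch of the idea.
    while inotifywait -r -e close_write,create,delete,move /mnt/mfs >/dev/null; do
        rsync -a --delete /mnt/mfs/ /mnt/mfs-pop2/
    done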
From: Michał B. <mic...@ge...> - 2010-12-03 17:59:22

Unfortunately not yet. There will be some improvements in the metaloggers in
1.6.18.

Kind regards
Michał Borychowski

-----Original Message-----
From: Ioannis Aslanidis [mailto:ias...@fl...]
Sent: Friday, December 03, 2010 6:44 PM
To: moo...@li...
Subject: Re: [Moosefs-users] Grouping chunk servers

Hello,

Any updates on this feature? Do you think it'll be ready soon?

Best regards.

2010/9/15 Alexander Akhobadze <akh...@ri...>:
>
> Hi all!
>
> As far as I understand, "Location awareness" is not exactly what Ioannis
> expects. In the scenario where the whole of POP1 goes down and the Metadata
> server was located in POP1, we have data ambiguity, because POP1 may just be
> disconnected by a WAN network failure while the real Metadata server is
> still alive. So, in this case, in POP2 we can't promote a Metalogger to the
> Master role.
>
> wbr
> Alexander Akhobadze
>
> ======================================================
> You wrote on 8 September 2010 at 21:44:50:
> ======================================================
>
> According to the roadmap (http://www.moosefs.org/roadmap.html), this is
> slated for the future:
>
> "Location awareness" of chunkserver - optional file mapping
> IP_address->location_number. As a location we understand a rack in which
> the chunkserver is located. The system would then be able to optimize some
> operations (eg. prefer chunk copy which is located in the same rack).
>
> ----- Original Message -----
> From: "Ioannis Aslanidis" <ias...@fl...>
> To: moo...@li...
> Sent: Wednesday, September 8, 2010 11:37:08 AM
> Subject: [Moosefs-users] Grouping chunk servers
>
> Hello,
>
> I am testing out MooseFS for around 50 to 100 terabytes of data.
>
> I have been successful in setting up the whole environment. It was pretty
> quick and easy, actually. I was able to replicate with goal=3 and it worked
> really nicely.
>
> At this point, there is only one requirement that I was not able to
> accomplish. I require 3 copies of a certain chunk, but my storage machines
> are distributed across two points of presence.
>
> I require that each of the points of presence contains at least one copy of
> the chunks. This is fine when you have 3 chunk servers, but it won't work if
> you have 6 chunk servers. The scenario is the following:
>
> POP1: 4 chunk servers (need 2 replicas here)
> POP2: 2 chunk servers (need 1 replica here)
>
> I need this because if the whole of POP1 or the whole of POP2 goes down, I
> need to still be able to access the contents. Writes are normally only
> performed in POP1, so there are normally only reads in POP2.
>
> The situation is worse if I add 2 more chunk servers in POP1 and 1 more
> chunk server in POP2.
>
> Is there a way to somehow tell MooseFS that the 4 chunk servers of POP1 are
> in one group and that there should be at least 1 replica in this group, and
> that the 2 chunk servers of POP2 are in another group and that there should
> be at least 1 replica in this group?
>
> Is there any way to accomplish this?
>
> Regards.

--
Ioannis Aslanidis
System and Network Administrator
Flumotion Services, S.A.
E-Mail: iaslanidis at flumotion dot com
Office Phone: +34 93 508 63 59
Mobile Phone: +34 672 20 45 75

From: Ioannis A. <ias...@fl...> - 2010-12-03 17:44:52

Hello,

Any updates on this feature? Do you think it'll be ready soon?

Best regards.

2010/9/15 Alexander Akhobadze <akh...@ri...>:
>
> Hi all!
>
> As far as I understand, "Location awareness" is not exactly what Ioannis
> expects. In the scenario where the whole of POP1 goes down and the Metadata
> server was located in POP1, we have data ambiguity, because POP1 may just be
> disconnected by a WAN network failure while the real Metadata server is
> still alive. So, in this case, in POP2 we can't promote a Metalogger to the
> Master role.
>
> wbr
> Alexander Akhobadze
>
> ======================================================
> You wrote on 8 September 2010 at 21:44:50:
> ======================================================
>
> According to the roadmap (http://www.moosefs.org/roadmap.html), this is
> slated for the future:
>
> "Location awareness" of chunkserver - optional file mapping
> IP_address->location_number. As a location we understand a rack in which
> the chunkserver is located. The system would then be able to optimize some
> operations (eg. prefer chunk copy which is located in the same rack).
>
> ----- Original Message -----
> From: "Ioannis Aslanidis" <ias...@fl...>
> To: moo...@li...
> Sent: Wednesday, September 8, 2010 11:37:08 AM
> Subject: [Moosefs-users] Grouping chunk servers
>
> Hello,
>
> I am testing out MooseFS for around 50 to 100 terabytes of data.
>
> I have been successful in setting up the whole environment. It was pretty
> quick and easy, actually. I was able to replicate with goal=3 and it worked
> really nicely.
>
> At this point, there is only one requirement that I was not able to
> accomplish. I require 3 copies of a certain chunk, but my storage machines
> are distributed across two points of presence.
>
> I require that each of the points of presence contains at least one copy of
> the chunks. This is fine when you have 3 chunk servers, but it won't work if
> you have 6 chunk servers. The scenario is the following:
>
> POP1: 4 chunk servers (need 2 replicas here)
> POP2: 2 chunk servers (need 1 replica here)
>
> I need this because if the whole of POP1 or the whole of POP2 goes down, I
> need to still be able to access the contents. Writes are normally only
> performed in POP1, so there are normally only reads in POP2.
>
> The situation is worse if I add 2 more chunk servers in POP1 and 1 more
> chunk server in POP2.
>
> Is there a way to somehow tell MooseFS that the 4 chunk servers of POP1 are
> in one group and that there should be at least 1 replica in this group, and
> that the 2 chunk servers of POP2 are in another group and that there should
> be at least 1 replica in this group?
>
> Is there any way to accomplish this?
>
> Regards.

--
Ioannis Aslanidis
System and Network Administrator
Flumotion Services, S.A.
E-Mail: iaslanidis at flumotion dot com
Office Phone: +34 93 508 63 59
Mobile Phone: +34 672 20 45 75
