From: Jakub Kruszona-Z. <ac...@mo...> - 2018-05-20 09:33:40

> On 19 May 2018, at 18:42, Gandalf Corvotempesta <gan...@gm...> wrote:
>
> On Sat, 19 May 2018 at 17:48, Marco Milano <mar...@gm...> wrote:
>> Tea Leaves :-)
>
> Seriously, where did you get this info?
> How does HA work in MooseFS? Is failover automatic?

MooseFS 4.x is now in "closed beta" (or rather "closed release-candidate") stage. We have started using it ourselves. Once we see that there are no obvious bugs, we will release it as an open-source product under the GPL-v2 (or even GPL-v2+) licence. In the meantime, if you want to participate in tests of MFS 4.x, please let us know and we will send you MFS 4.x packages for your OS. Marco Milano is one of our "testers", which is why he knows more about MFS 4.x.

HA in MFS 4.x works fully automatically. You just need to define a group of IP addresses in your DNS for your "mfsmaster" name, then install master servers on the machines with those IP addresses and run them; that's it. One of them will become the "LEADER" of the group and the others "FOLLOWER"s. If for any reason the LEADER goes down, one of the FOLLOWERs is chosen as an "ELECT", and once more than half of the chunkservers connect to it, it automatically switches to the LEADER state and starts working as the main master server. When the previous leader becomes available again, it usually rejoins the group as a FOLLOWER. This is rather a "fire and forget" solution: the masters synchronise metadata between themselves automatically, and chunkservers and clients also reconnect to the current leader automatically.

Regards,
Jakub Kruszona-Zawadzki
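Jakub's description implies a group of A records behind one name. A minimal sketch of what that zone entry might look like in BIND syntax; the zone layout and the 10.0.0.x addresses are illustrative assumptions, not values from the thread:

```
; All master servers share the single name "mfsmaster".
; Chunkservers and clients resolve it and connect; the elected
; LEADER is the one that acts as the active master.
mfsmaster    IN    A    10.0.0.11    ; master server #1
mfsmaster    IN    A    10.0.0.12    ; master server #2
mfsmaster    IN    A    10.0.0.13    ; master server #3
```

With all masters behind one name, no client-side reconfiguration is needed when the leader changes, which is what makes the failover "fire and forget".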
From: Gandalf C. <gan...@gm...> - 2018-05-19 16:42:27

On Sat, 19 May 2018 at 17:48, Marco Milano <mar...@gm...> wrote:
> Tea Leaves :-)

Seriously, where did you get this info?
How does HA work in MooseFS? Is failover automatic?
From: Marco M. <mar...@gm...> - 2018-05-19 15:48:22

Tea Leaves :-)

On 05/19/2018 11:29 AM, Gandalf Corvotempesta wrote:
> Where did you read that v4 will have HA?
> [remainder of quoted thread trimmed; see the messages below]
From: Gandalf C. <gan...@gm...> - 2018-05-19 15:29:23

Where did you read that v4 will have HA?

On Sat, 19 May 2018 at 17:02, Marco Milano <mar...@gm...> wrote:
> Marin,
>
> Relax.
>
> MooseFS v4 will be free and will have all the features (HA, EC).
>
> If you need professional support, you have to pay for that.
>
> Very simple. (Similar to the Ubuntu model.)
>
> -- Marco
> [remainder of quoted thread trimmed; see the messages below]
From: Marco M. <mar...@gm...> - 2018-05-19 15:01:34

Marin,

Relax.

MooseFS v4 will be free and will have all the features (HA, EC).

If you need professional support, you have to pay for that.

Very simple. (Similar to the Ubuntu model.)

-- Marco

On 05/19/2018 06:24 AM, Marin Bernard wrote:
> Hi,
>
> It's been a few weeks since I posted this message which was left unanswered.
> [remainder of quoted message trimmed; see Marin's message below]
From: Marin B. <li...@ol...> - 2018-05-19 10:39:48

Hi,

It's been a few weeks since I posted this message, which was left unanswered. Since then, the MooseFS team published a blog post announcing that MooseFS 4.0 is now production ready. This version was long-expected, and this is really great news!

As of now, it seems version 4.0 is yet to be released: I did not find any source or binary package for it. Could someone provide an estimated release date for this version, and a quick recap of the included/paid features?

I do know that the CE edition is best-effort software, but I'm in a position where I have to decide whether to use MooseFS as a mid/long-term storage solution, and I need to know where it is going.

Thanks,

Marin.

> Hi,
>
> I have been using MooseFS CE for 4 or 5 years now and I'm really happy with it. Now I have a few questions about the project roadmap, and the differences between the MooseFS CE and Pro editions.
>
> As far as I know, MooseFS CE and Pro ship with exactly the same feature set, except for mfsmaster HA, which is only available with a Pro license. However, the moosefs.com website mentions several features which seem not to be included in the current release of MooseFS CE. Here is a short list:
>
> * Computation on Nodes
> * Erasure coding with up to 9 parity sums
> * Support of standard ACLs in addition to Unix file modes
> * SNMP management interface
> * Data compression or data deduplication? (not sure on this; deduced from "MooseFS enables users to save a lot of HDD space maintaining the same data redundancy level.")
>
> I may be wrong, but as far as I know, none of those wonderful features are part of the MooseFS 3.x CE branch. Will these features ship with the next major ("MooseFS 4.0") version? More importantly: will those features require Pro licensing?
>
> Could you please clarify those points for me?
>
> Thank you,
>
> Marin.
From: Wilson, S. M <st...@pu...> - 2018-05-14 17:19:15
|
Thanks, Casper and Raf, for your suggestions! I am using MooseFS in a workgroup setting in an office environment and not in a server room so my options are somewhat limited. Central IT services provide several 1Gbps network ports per office. These are then connected to directly switches found in equpment rooms on each floor of the building. Most of my chunk servers have dual NICs so I thought that I could add a "replication network" between chunk servers (at least those that are located in the same room) using a switch that I install in the room. One NIC in each server would be connected to the network provided by central IT and the other NIC would be connected to my local network. I had given some thought to using bonding/teaming but I don't see a good way to take advantage of this concept in our environment. Thanks again! Steve ________________________________ From: R.C. <mil...@gm...> Sent: Saturday, May 12, 2018 12:43 PM To: moo...@li... Subject: Re: [MooseFS-Users] Separate network for chunk servers Hi Steve, if your concern is about reducing network traffic on your switches and you plan to install MooseFS (master and chuncks) on dedicated HW in a dedicated rack or so, just place a switch between MooseFS units and the rest of your network. The switch will keep the MooseFS internal traffic right behind the switch. You can then connect this switch to the rest of your network with SFP port or a dedicated uplink connection (if your current switches have one of these) instead of a standard cat6 cable, which would of course become in this case a bottleneck. Once MooseFS is "behind" the switch, you can furtherly improve its bandwidth by implementing Bonding (as suggested by Casper) or Teaming. See here for a complete comparison: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-comparison_of_network_teaming_to_bonding Hope it helps Bye Raf Il 12/05/2018 17:51, Casper Langemeijer ha scritto: Hi Steve, As an alternative I'd like to suggest Linux Network Bonding to you. You should at least read up on it if you don't know it already. https://wiki.linuxfoundation.org/networking/bonding Bonding allows you to use the extra network bandwidth effectively, for *both* replication and client traffic. This also possibly eliminates the network as single point of failure in your setup. Choose your bonding mode carefully. Greetings, Casper Op ma 7 mei 2018 om 16:37 schreef Wilson, Steven M <st...@pu...<mailto:st...@pu...>>: Hi, I'm considering implementing a dedicated network for our chunk servers to use soley for replication among themselves. By doing this, I hope to separate the chunk traffic from the clients from the replication traffic that takes place among the chunk servers. If my understanding is correct, this is not achieved by using the REMAP_* options in mfsmaster.cfg which only separates out the traffic to/from the master. If anyone else has done this, I'd be grateful to hear about your experience, especially in these two areas: 1) what level of performance improvement was seen 2) what needed to be done in the MooseFS configuration and OS networking to implement it Thanks! Steve ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot_________________________________________ moosefs-users mailing list moo...@li...<mailto:moo...@li...> https://lists.sourceforge.net/lists/listinfo/moosefs-users ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _________________________________________ moosefs-users mailing list moo...@li...<mailto:moo...@li...> https://lists.sourceforge.net/lists/listinfo/moosefs-users |
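For illustration, the dual-NIC layout Steve describes might look like this in a Debian-style /etc/network/interfaces file; the interface names and subnets are assumptions chosen for the sketch, not values from the thread:

```
# NIC 1: campus network provided by central IT (clients and master)
auto eno1
iface eno1 inet static
    address 192.168.10.21/24
    gateway 192.168.10.1

# NIC 2: private replication network on the in-room switch
# (chunkserver-to-chunkserver traffic only, so no gateway)
auto eno2
iface eno2 inet static
    address 10.20.0.21/24
```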
From: R.C. <mil...@gm...> - 2018-05-12 16:43:23

Hi Steve,

If your concern is reducing network traffic on your switches and you plan to install MooseFS (master and chunkservers) on dedicated hardware in a dedicated rack or so, just place a switch between the MooseFS units and the rest of your network. That switch will keep the MooseFS internal traffic behind it. You can then connect this switch to the rest of your network with an SFP port or a dedicated uplink connection (if your current switches have one of these) instead of a standard Cat6 cable, which would otherwise become a bottleneck.

Once MooseFS is "behind" the switch, you can further improve its bandwidth by implementing bonding (as suggested by Casper) or teaming. See here for a complete comparison:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-comparison_of_network_teaming_to_bonding

Hope it helps.

Bye,
Raf

On 12/05/2018 17:51, Casper Langemeijer wrote:
> [quoted message trimmed; see Casper's message below]
From: Casper L. <cas...@pr...> - 2018-05-12 16:15:29

Hi Steve,

As an alternative, I'd like to suggest Linux network bonding to you. You should at least read up on it if you don't know it already:
https://wiki.linuxfoundation.org/networking/bonding

Bonding allows you to use the extra network bandwidth effectively, for *both* replication and client traffic. It also potentially eliminates the network as a single point of failure in your setup. Choose your bonding mode carefully.

Greetings, Casper

On Mon, 7 May 2018 at 16:37, Wilson, Steven M <st...@pu...> wrote:
> Hi,
>
> I'm considering implementing a dedicated network for our chunk servers to use solely for replication among themselves. By doing this, I hope to separate the chunk traffic from the clients from the replication traffic that takes place among the chunk servers. If my understanding is correct, this is not achieved by using the REMAP_* options in mfsmaster.cfg, which only separate out the traffic to/from the master.
>
> If anyone else has done this, I'd be grateful to hear about your experience, especially in these two areas:
>
> 1) what level of performance improvement was seen
> 2) what needed to be done in the MooseFS configuration and OS networking to implement it
>
> Thanks!
>
> Steve
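For reference, a minimal ifupdown sketch of a two-NIC bond; the interface names, addresses, and the choice of active-backup mode are assumptions for illustration, and, as Casper advises, the mode should be picked to match what your switches support:

```
# /etc/network/interfaces sketch: two NICs bonded (requires the
# ifenslave package). active-backup needs no switch support and gives
# failover only; 802.3ad (LACP) would also aggregate bandwidth but
# requires matching configuration on the switch side.
auto bond0
iface bond0 inet static
    address 192.168.10.21/24
    gateway 192.168.10.1
    bond-slaves eno1 eno2
    bond-mode active-backup
    bond-miimon 100
```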
From: Marco M. <mar...@gm...> - 2018-05-11 16:47:15

On 05/11/2018 12:16 PM, Gandalf Corvotempesta wrote:
> That's true for huge clusters.
> Our current test cluster is about 12 TB and we have 6 x 2 TB disks on each server.
> With EC 8+2 I need 10 servers, right?
>
> Anyway, EC is for MFS 4.0 Pro....

MFS 4.x Pro and MFS 4.x CE have exactly the same code. MFS 4.x Pro has professional support and additional software components, utilities, etc. (you have to pay for MFS 4.x Pro); MFS 4.x CE has only community support. MFS 4.x CE does have EC and HA master servers.

However, as you pointed out, you can't do EC with such a small cluster. If you have 3 chunkservers, you can install a couple of cheap 12 TB drives and set copies=3 and you are done. (A 12 TB SATA drive is down to about $400.)

-- Marco
From: Gandalf C. <gan...@gm...> - 2018-05-11 16:17:07

On Fri, 11 May 2018 at 17:10, Marco Milano <mar...@gm...> wrote:
> And why would EC 8+2 be overkill? I am assuming that you are at least using
> 2 copies now, which means that if you had 1 PB of real data, you would need 2 PB
> of raw disk space. If you use EC 8+2, you will need only 1.25 PB of raw disk space,
> saving you 0.75 PB of raw disk space and giving you the equivalent of 3 copies.
> You pay less, you get more. :-)

That's true for huge clusters.
Our current test cluster is about 12 TB and we have 6 x 2 TB disks on each server.
With EC 8+2 I need 10 servers, right?

Anyway, EC is for MFS 4.0 Pro....
From: Marco M. <mar...@gm...> - 2018-05-11 15:10:23

On 05/11/2018 09:29 AM, Gandalf Corvotempesta wrote:
> On Fri, 11 May 2018 at 11:21, Marco Milano <mar...@gm...> wrote:
>> The scenario you describe is only likely to happen if you have only two copies
>> and lots of unaccessed data. If you keep 3 or 4 or more copies it is not likely
>> to happen.
>
> If the cluster is used as backup or archive, replica 3 or EC 8+2 would be
> overkill and useless. It's only a waste of space and resources.

And why would EC 8+2 be overkill? I am assuming that you are at least using 2 copies now, which means that if you had 1 PB of real data, you would need 2 PB of raw disk space. If you use EC 8+2, you will need only 1.25 PB of raw disk space, saving you 0.75 PB of raw disk space and giving you the equivalent of 3 copies. You pay less, you get more. :-)

How is that, in your own words, a "waste of space and resources"??

-- Marco

>> If you are losing sleep over your data currently, you can write a very simple script
>> that will read your entire filesystem in a loop.
>
> Sure. But I think that a software-defined storage should take care of data
> consistency by itself and not rely on external scripts.
>
> Additionally, MFS already has a scrub feature, but on huge clusters it's
> useless; a scrub won't finish in useful time.
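A small worked example of the space arithmetic in this exchange; a sketch only, using the standard EC relationship where k data parts plus m parity parts give a raw-to-usable ratio of (k+m)/k:

```python
def raw_space_needed(data_tb: float, k: int, m: int) -> float:
    """Raw capacity needed to store data_tb of user data with EC k+m.

    Plain replication with c copies is the special case k=1, m=c-1.
    """
    return data_tb * (k + m) / k

data_tb = 1000.0  # 1 PB of real data, expressed in TB

print(raw_space_needed(data_tb, 1, 1))  # 2 copies -> 2000.0 TB (2 PB)
print(raw_space_needed(data_tb, 1, 2))  # 3 copies -> 3000.0 TB (3 PB)
print(raw_space_needed(data_tb, 8, 2))  # EC 8+2   -> 1250.0 TB (1.25 PB)
print(raw_space_needed(data_tb, 8, 8))  # EC 8+8   -> 2000.0 TB (2 PB)
```

This reproduces Marco's figures: EC 8+2 stores 1 PB in 1.25 PB of raw space while tolerating two lost parts, the same loss tolerance as 3 copies at less than half the raw footprint.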
From: Gandalf C. <gan...@gm...> - 2018-05-11 13:31:28

On Fri, 11 May 2018 at 01:48, Davies Liu <dav...@gm...> wrote:
> If you have more chunk servers, it will be faster.

Why would it be faster? AFAIK, the CRC check is done by each chunk server for each of its chunks, and not split into multiple batches.

For example, with replica 3, 3 chunkservers and 4,000,000 chunks, each chunkserver will check 4,000,000 chunks, not 4,000,000/3.
From: Gandalf C. <gan...@gm...> - 2018-05-11 13:29:37

On Fri, 11 May 2018 at 11:21, Marco Milano <mar...@gm...> wrote:
> The scenario you describe is only likely to happen if you have only two copies
> and lots of unaccessed data. If you keep 3 or 4 or more copies it is not likely
> to happen.

If the cluster is used as backup or archive, replica 3 or EC 8+2 would be overkill and useless. It's only a waste of space and resources.

> If you are losing sleep over your data currently, you can write a very simple script
> that will read your entire filesystem in a loop.

Sure. But I think that a software-defined storage should take care of data consistency by itself and not rely on external scripts.

Additionally, MFS already has a scrub feature, but on huge clusters it's useless; a scrub won't finish in useful time.
From: Marco M. <mar...@gm...> - 2018-05-11 09:20:46

On 05/11/2018 03:30 AM, Gandalf Corvotempesta wrote:
> [quoted discussion of unaccessed data, RAID scrubbing and UREs trimmed; see Gandalf's message below]

The scenario you describe is only likely to happen if you have only two copies and lots of unaccessed data. If you keep 3 or 4 or more copies, it is not likely to happen.

You can use 8+2 in version 4, which has the same redundancy level as 3 copies, but the space overhead is only 25%. If you are using 2 copies now, the space overhead is 50%; 8+2 will give you more protection with much less space overhead. The only catch with the EC introduced in v4 is that you have to have a lot of chunkservers: 12 chunkservers if you use 8+2.

If you are losing sleep over your data currently, you can write a very simple script that will read your entire filesystem in a loop. If you have the disk space, you can also set your copies to 3 for now and convert to 8+2 later. It is possible to do up to 8+8 in v4, which has a space overhead of 50% but is the equivalent of 9 copies.

The bottom line is that there are many ways to safely protect your data, both in version 3 and version 4. If you provide more info about your setup, such as your hardware configuration and your data size and usage, the community can provide better suggestions.

-- Marco
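A minimal sketch of the kind of read-everything script Marco suggests, in Python. It is an assumption about how one might implement it, relying on the behaviour described later in this thread (the chunkserver verifies the CRC on every read and self-heals from a good copy on mismatch); the mount point is hypothetical:

```python
import os
import time

MOUNT = "/mnt/mfs"  # hypothetical MooseFS mount point

def scrub_pass(root: str, bufsize: int = 4 * 1024 * 1024) -> None:
    """Read every file once, forcing a CRC check of each chunk on read."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    while f.read(bufsize):  # discard data; the read is the point
                        pass
            except OSError:
                pass  # file vanished or is unreadable; skip it

while True:  # "read your entire filesystem in a loop"
    scrub_pass(MOUNT)
    time.sleep(24 * 3600)  # pause between passes to limit I/O load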
From: Gandalf C. <gan...@gm...> - 2018-05-11 07:39:20

On Fri, 11 May 2018 at 03:26, Marco Milano <mar...@gm...> wrote:
> It is also checked at every read. If the CRC doesn't match,
> it will self-heal from the good copy.
> So, CRC is not useless. CRC testing is an additional layer of protection
> on top of "check CRC on every read". It is not the "only" protection.

It is not the same. If you have a huge storage system with mostly unaccessed data, like an archive or backup store, most of the stored data won't be accessed for years, and you won't be able to detect whether something is going wrong on your disks.

It's like with any RAID: every RAID is able to detect read errors and act accordingly by reading from a different replica or by reconstructing data from parity. But without a monthly/weekly scrub of the whole array, what happens if a bad sector hits a file that you haven't read for months, and then a disk fails? While rebuilding the failed disk, you'll get a URE on the "good" disk, resulting in data loss (the data can't be reconstructed, because the "good" disk has an unreadable sector) and a punctured RAID (this has happened to me many times in the past).
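To make the rebuild-risk argument concrete, a back-of-the-envelope sketch; the 1-in-10^14 unrecoverable-read-error rate is a commonly quoted spec for consumer SATA drives, assumed here for illustration, and the model naively treats bit errors as independent:

```python
def rebuild_ure_probability(read_tb: float, ure_rate: float = 1e-14) -> float:
    """Probability of hitting at least one URE while reading read_tb of data."""
    bits = read_tb * 1e12 * 8          # TB -> bits (decimal TB)
    p_clean = (1 - ure_rate) ** bits   # chance every bit reads back fine
    return 1 - p_clean

# Rebuilding a mirror by reading one 12 TB "good" disk end to end:
print(f"{rebuild_ure_probability(12):.0%}")  # roughly 62%
```

Under these (pessimistic, vendor-spec) assumptions, a full read of a single large disk has a substantial chance of tripping a URE, which is exactly the punctured-RAID scenario Gandalf describes.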
From: Gandalf C. <gan...@gm...> - 2018-05-11 07:21:51

On Fri, 11 May 2018 at 01:48, Davies Liu <dav...@gm...> wrote:
> It's checked on every chunk server, also configurable (HDD_TEST_FREQ),
> so it's possible to check all the chunks in 2 months
> (HDD_TEST_FREQ=1). If you have more chunk servers, it will be faster.

Setting that to 1 will kill performance. I already tried.
From: Marco M. <mar...@gm...> - 2018-05-11 01:26:14

On 05/10/2018 07:34 PM, Gandalf Corvotempesta wrote:
> Clear.
> So CRC testing is almost useless, because at one chunk every 10 seconds it would
> take years to fully test a cluster.

It is also checked at every read. If the CRC doesn't match, it will self-heal from the good copy. So, CRC is not useless. CRC testing is an additional layer of protection on top of "check CRC on every read". It is not the "only" protection.

-- Marco

> [remainder of quoted thread trimmed; see the messages below]
From: Piotr R. K. <pio...@mo...> - 2018-05-11 00:07:51

Hi guys,

As far as I remember, this is going to change in v. 4.x and will be available as a parameter in MB/s. @acid, please correct me if I'm wrong.

Thanks,
Best regards,
Peter

--
Piotr Robert Konopelko | +48 601 476 440
MooseFS Client Support Team | moosefs.com
// Sent from my phone, sorry for condensed form

> On 11 May 2018, at 8:34 AM, Gandalf Corvotempesta <gan...@gm...> wrote:
> [quoted thread trimmed; see the messages below]
From: Davies L. <dav...@gm...> - 2018-05-10 23:48:58

It's checked on every chunk server, and it is also configurable (HDD_TEST_FREQ), so it's possible to check all the chunks in 2 months (HDD_TEST_FREQ=1). If you have more chunk servers, it will be faster.

On Fri, May 11, 2018 at 7:34 AM, Gandalf Corvotempesta <gan...@gm...> wrote:
> Clear.
> So CRC testing is almost useless, because at one chunk every 10 seconds it would
> take years to fully test a cluster.
> [remainder of quoted thread trimmed]
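A rough sanity check of these numbers; a sketch that assumes HDD_TEST_FREQ is the number of seconds between background chunk tests on each chunkserver (the thread quotes one chunk every 10 seconds as the default) and that each chunkserver tests only the chunk copies it actually holds:

```python
def full_test_days(total_chunks: int, copies: int, servers: int,
                   hdd_test_freq: int) -> float:
    """Days for a chunkserver to cycle through all the chunks it holds."""
    chunks_per_server = total_chunks * copies / servers
    return chunks_per_server * hdd_test_freq / 86400

# The thread's example: 4,000,000 chunks, replica 3, 3 chunkservers,
# so each chunkserver holds (and must test) all 4,000,000 chunks.
print(full_test_days(4_000_000, 3, 3, 10))  # default 10 s/chunk: ~463 days
print(full_test_days(4_000_000, 3, 3, 1))   # HDD_TEST_FREQ=1:    ~46 days
```

Under these assumptions the HDD_TEST_FREQ=1 figure lands in the one-to-two-month range Davies mentions, and adding chunkservers shortens the cycle only insofar as each server then holds fewer chunks.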
From: Gandalf C. <gan...@gm...> - 2018-05-10 23:34:53

Clear.

So CRC testing is almost useless, because at one chunk every 10 seconds it would take years to fully test a cluster.

On Fri, 11 May 2018 at 00:53, Davies Liu <dav...@gm...> wrote:
> This loop only checks the status and goals of chunks, not CRCs. CRCs are
> randomly tested by the chunk server (1 every 10 seconds).
> [remainder of quoted thread trimmed]
From: Davies L. <dav...@gm...> - 2018-05-10 22:53:57

This loop only checks the status and goals of chunks, not CRCs. CRCs are randomly tested by the chunk server (1 every 10 seconds).

On Fri, May 11, 2018 at 1:19 AM, Gandalf Corvotempesta <gan...@gm...> wrote:
> Yes, sorry, my mistake, but it is still impossible to scan 4,000,000 chunks,
> checking CRCs, in 300 seconds.
> [remainder of quoted thread trimmed]
From: Gandalf C. <gan...@gm...> - 2018-05-10 17:20:12

Yes, sorry, my mistake, but it is still impossible to scan 4,000,000 chunks, checking CRCs, in 300 seconds.

On Thu, 10 May 2018 at 18:35, Davies Liu <dav...@gm...> wrote:
> max(40, 300) = 300
> [remainder of quoted thread trimmed]
From: Davies L. <dav...@gm...> - 2018-05-10 16:35:55

max(40, 300) = 300

On Fri, 11 May 2018 at 0:17 GMT+08:00, Gandalf Corvotempesta <gan...@gm...> wrote:
> On Thu, 10 May 2018 at 16:50, Davies Liu <dav...@gm...> wrote:
>> max(N/CHUNKS_LOOP_MAX_CPS, CHUNKS_LOOP_MIN_TIME)
>
> So,
> N = 4000000
> CHUNKS_LOOP_MAX_CPS = 100000
> CHUNKS_LOOP_MIN_TIME = 300
>
> to scan 4,000,000 chunks, I need
>
> max(4000000/100000, 300) = 40 seconds?
>
> That's impossible.
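For clarity, the quantity being computed here is the duration of one master chunk loop, using the formula and the CHUNKS_LOOP_* mfsmaster.cfg parameters quoted in these messages; a small sketch:

```python
def chunks_loop_seconds(n_chunks: int,
                        chunks_loop_max_cps: int = 100_000,
                        chunks_loop_min_time: int = 300) -> float:
    """One master chunk-loop pass: rate-limited to CHUNKS_LOOP_MAX_CPS
    chunks per second, but never shorter than CHUNKS_LOOP_MIN_TIME."""
    return max(n_chunks / chunks_loop_max_cps, chunks_loop_min_time)

print(chunks_loop_seconds(4_000_000))   # max(40, 300)  = 300 seconds
print(chunks_loop_seconds(60_000_000))  # max(600, 300) = 600 seconds
```

The 40 in the question below is 4,000,000 / 100,000; Davies's point is that the 300-second minimum-time floor dominates, so the loop takes 300 seconds, not 40.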
From: Gandalf C. <gan...@gm...> - 2018-05-10 16:17:52

On Thu, 10 May 2018 at 16:50, Davies Liu <dav...@gm...> wrote:
> max(N/CHUNKS_LOOP_MAX_CPS, CHUNKS_LOOP_MIN_TIME)

So,
N = 4000000
CHUNKS_LOOP_MAX_CPS = 100000
CHUNKS_LOOP_MIN_TIME = 300

To scan 4,000,000 chunks, I need

max(4000000/100000, 300) = 40 seconds?

That's impossible.