From: Gandalf C. <gan...@gm...> - 2018-05-20 20:22:05
|
On Sun, 20 May 2018 at 22:09, Marin Bernard <li...@ol...> wrote:

> The chunkserver acknowledges the write while the data are still pending
> commit to disk. If the server dies meanwhile, the data are lost.

Even if the client asked for O_DIRECT or fsync explicitly? If so, I think this
would break POSIX compatibility.

> However, if goal is >= 2 (as it should always be), at least one more
> copy of the data must already be present on another chunkserver before
> the acknowledgment is sent.

Not really. If goal >= 2, the ack is sent once another chunkserver has
committed the data to its cache, so you are acknowledging when all goal copies
have been written to cache, not to disk. This would be fine under normal
conditions (like any write made by any software), but if the client is asking
for O_DIRECT, the acknowledgment *must* be sent *after* the data is stored on
disk.
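For concreteness, here is a minimal sketch (not MooseFS-specific code) of the
kind of write being discussed: an aligned O_DIRECT write followed by an
explicit fsync(). The file path and the 4096-byte alignment are placeholders
for illustration; the alignment O_DIRECT actually requires depends on the
underlying filesystem and device.

/* Minimal sketch of a direct, synchronous write - illustration only.
 * /mnt/mfs/testfile and the 4096-byte alignment are placeholders;
 * the alignment O_DIRECT needs depends on the filesystem and device.
 * Build: cc -o direct_write direct_write.c
 */
#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 4096;
    void *buf;

    /* O_DIRECT requires aligned buffers, offsets and lengths. */
    if (posix_memalign(&buf, 4096, len) != 0) {
        fputs("posix_memalign failed\n", stderr);
        return 1;
    }
    memset(buf, 'x', len);

    /* O_DIRECT asks the kernel to bypass its page cache for this descriptor. */
    int fd = open("/mnt/mfs/testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, buf, len) != (ssize_t)len) {
        perror("write");
        return 1;
    }
    /* POSIX says fsync() may only return success once the data is durable.
     * The question in this thread is whether that guarantee survives the
     * chunkserver's own write cache. */
    if (fsync(fd) != 0) {
        perror("fsync");
        return 1;
    }
    close(fd);
    free(buf);
    return 0;
}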
From: Marin B. <li...@ol...> - 2018-05-20 21:11:15
|
> On Sun, 20 May 2018 at 22:09, Marin Bernard <li...@ol...> wrote:
> > The chunkserver acknowledges the write while the data are still
> > pending commit to disk. If the server dies meanwhile, the data are lost.
>
> Even if the client asked for O_DIRECT or fsync explicitly?
> If so, I think this would break POSIX compatibility.

No, not necessarily. It depends on what you mean by 'the client'. The POSIX
interface is implemented by mfsmount, not the chunkserver. The client is just
another process on the same machine. mfsmount may comply with POSIX and let
chunkservers deal with the data in their own way in the background. The client
process would know nothing about it, as it only speaks to mfsmount.

From a POSIX perspective, the chunkserver is like a physical disk, and POSIX
does not specify how physical disks should work internally. If you O_DIRECT a
file stored on a consumer hard drive and fill it with data, chances are that
you'll experience some kind of data loss if you unplug the box in the middle
of a write: the content of the drive's write cache, which was acknowledged but
not committed, will have vanished. POSIX can't do anything about it.

Since MooseFS extends over both the kernel and device layers, it has the
opportunity to do better than POSIX and break the tiering by leaking useful
data from the mount processes to the chunkservers. I suppose this is why fsync
operations are cascaded from mfsmount to the chunkservers. I do not know
whether the same holds for O_DIRECT, though.

> > However, if goal is >= 2 (as it should always be), at least one more
> > copy of the data must already be present on another chunkserver before
> > the acknowledgment is sent.
>
> Not really. If goal >= 2, the ack is sent once another chunkserver has
> committed the data to its cache, so you are acknowledging when all goal
> copies have been written to cache, not to disk.

Absolutely.

> This would be fine under normal conditions (like any write made by any
> software), but if the client is asking for O_DIRECT, the acknowledgment
> *must* be sent *after* the data is stored on disk.

This is what you would get with the ``mfscachemode=DIRECT`` mount option,
which bypasses the cache completely, at least on the client side. Yet I don't
know whether mfsmount is able to enforce O_DIRECT on a per-file basis or if
the setting must apply to the whole mountpoint.
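To make the layering argument concrete, below is a toy FUSE passthrough
(libfuse 2.x high-level API) showing where a FUSE client such as mfsmount gets
the chance to cascade fsync() down to its backing storage. This is a sketch
under stated assumptions, not MooseFS code: the backing directory /tmp/backing
is made up, only a handful of operations are implemented, and a real mfsmount
forwards the sync over the network to chunkservers rather than to local files.

/* Toy FUSE passthrough illustrating fsync() cascading - NOT MooseFS code.
 * Files are served from an assumed backing directory, /tmp/backing.
 * Build (libfuse 2.x): cc -D_FILE_OFFSET_BITS=64 ptfs.c -lfuse -o ptfs
 * Run:                 ./ptfs /mnt/ptfs
 */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define BACKING "/tmp/backing"          /* assumed backing directory */

static void backing_path(char *out, size_t n, const char *path)
{
    snprintf(out, n, "%s%s", BACKING, path);
}

static int pt_getattr(const char *path, struct stat *st)
{
    char real[4096];
    backing_path(real, sizeof(real), path);
    return lstat(real, st) == 0 ? 0 : -errno;
}

static int pt_open(const char *path, struct fuse_file_info *fi)
{
    char real[4096];
    backing_path(real, sizeof(real), path);
    int fd = open(real, fi->flags);     /* flags include O_DIRECT if requested */
    if (fd < 0)
        return -errno;
    fi->fh = (uint64_t)fd;
    return 0;
}

static int pt_write(const char *path, const char *buf, size_t size,
                    off_t off, struct fuse_file_info *fi)
{
    (void)path;
    ssize_t n = pwrite((int)fi->fh, buf, size, off);
    return n < 0 ? -errno : (int)n;
}

/* The interesting part: when the application calls fsync(), the kernel
 * forwards it here, and we in turn push it down to the backing store.
 * mfsmount sits in the same position, except its "backing store" is a set
 * of remote chunkservers. */
static int pt_fsync(const char *path, int datasync, struct fuse_file_info *fi)
{
    (void)path;
    int rc = datasync ? fdatasync((int)fi->fh) : fsync((int)fi->fh);
    return rc == 0 ? 0 : -errno;
}

static int pt_release(const char *path, struct fuse_file_info *fi)
{
    (void)path;
    close((int)fi->fh);
    return 0;
}

static struct fuse_operations pt_ops = {
    .getattr = pt_getattr,
    .open    = pt_open,
    .write   = pt_write,
    .fsync   = pt_fsync,
    .release = pt_release,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &pt_ops, NULL);
}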
From: Jakub Kruszona-Z. <jak...@ge...> - 2018-05-22 06:05:41
|
> On 20 May 2018, at 16:00, Gandalf Corvotempesta <gan...@gm...> wrote:
>
>> On Sun, 20 May 2018, 15:56, Zlatko Čalušić <zca...@bi...> wrote:
>>
>> Even if LizardFS currently has a feature or two missing in current
>> MooseFS 3.0, it seems that MooseFS 4.0 will be a clear winner, and
>> definitely worth a wait!
>
> Totally agree.
>
> Probably the real missing thing, IMHO, is the qemu driver.

I've checked qemu today - there is no such thing as a "qemu driver". They have
support for glusterfs - but this is part of the qemu code.

I could probably write similar code for qemu to integrate qemu with MFS, but
I'm not sure whether it would be accepted by the qemu programmers.

--
Regards,
Jakub Kruszona-Zawadzki
- - - - - - - - - - - - - - - -
Segmentation fault (core dumped)
Phone: +48 602 212 039
From: Gandalf C. <gan...@gm...> - 2018-05-22 09:07:32
|
On Tue, 22 May 2018 at 08:05, Jakub Kruszona-Zawadzki <jak...@ge...> wrote:

> I've checked qemu today - there is no such thing as a "qemu driver". They
> have support for glusterfs - but this is part of the qemu code.
> I could probably write similar code for qemu to integrate qemu with MFS,
> but I'm not sure whether it would be accepted by the qemu programmers.

I don't see any reason why they wouldn't accept your qemu block driver. They
have tons of block drivers, even for software with far fewer features than
MooseFS, like sheepdog.

Yes, qemu doesn't have "loadable modules" - every block driver is compiled in
directly - but I'm sure that if you make a pull request against the latest
master, they will accept it.
From: Devin A. <lin...@gm...> - 2018-05-22 17:55:40
|
I think it would be really NEAT if you actually did create a MooseFS driver to
use with QEMU. I think the more options available, the better. :)

--
Devin Acosta
RHCA|LFCE
Red Hat Certified Architect
e: de...@li...oud
p: 602-354-1220

On May 21, 2018, 11:06 PM -0700, Jakub Kruszona-Zawadzki <jak...@ge...> wrote:

> I've checked qemu today - there is no such thing as a "qemu driver". They
> have support for glusterfs - but this is part of the qemu code.
>
> I could probably write similar code for qemu to integrate qemu with MFS,
> but I'm not sure whether it would be accepted by the qemu programmers.
From: Alexander A. <ba...@ya...> - 2018-05-25 07:10:44
|
Hi All!

Is it true? I'm asking about "MooseFS Pro also includes a Windows client."

If yes, where can I read more about it?


On 20.05.2018 16:08, Marin Bernard wrote:
>>> This is very interesting
>>> Any official and detailed docs about the HA feature?
>>>
>>> Other than this, without making any flame, which are the differences
>>> between MFS4 and LizardFS?
>>>
>> Hi again,
>>
>> I've been testing both MooseFS 3.0.x and LizardFS 3.1.x in parallel for
>> a few weeks now. Here are the main differences I found while using
>> them. I think most of them will still be relevant with MooseFS 4.0.
>>
>> * High availability
>> In theory, LizardFS provides master high availability with _shadow_
>> instances. The reality is less glorious, as the piece of software
>> actually implementing master autopromotion (based on uraft) is still
>> proprietary. It is expected to be GPL'd, yet nobody knows when. So as
>> of now, if you need HA with LizardFS, you have to write your own set
>> of scripts and use a 3rd-party cluster manager such as corosync.
>>
>> * POSIX ACLs
>> Using POSIX ACLs with LizardFS requires a recent Linux kernel (4.9+),
>> because a version of FUSE with ACL support is needed. This means ACLs
>> are unusable with most LTS distros, whose kernels are too old.
>>
>> With MooseFS, ACLs do work even with older kernels; maybe because they
>> are implemented at the master level and the client does not even try
>> to enforce them?
>>
>> * FreeBSD support
>> According to the LizardFS team, all components do compile on FreeBSD.
>> They do not provide a package repository, though, nor did they succeed
>> in submitting LizardFS to the FreeBSD ports tree (bug #225489 is still
>> open on phabricator).
>>
>> * Storage classes
>> Erasure coding is supported in LizardFS, and I had no special issue
>> with it. So far, it works as expected.
>>
>> The equivalent of MooseFS storage classes in LizardFS are _custom
>> goals_. While MooseFS storage classes may be dealt with interactively,
>> LizardFS goals are statically defined in a dedicated config file.
>> MooseFS storage classes allow the use of different label expressions
>> at each step of a chunk lifecycle (different labels for new, kept and
>> archived chunks). LizardFS has no equivalent.
>>
>> One application of MooseFS storage classes is to transparently delay
>> the geo-replication of a chunk for a given amount of time, to lower
>> the latency of client I/O operations. As far as I know, it is not
>> possible to do the same with LizardFS.
>>
>> * NFS support
>> LizardFS supports NFSv4 ACLs. It may also be used with the NFS Ganesha
>> server to export directories directly through user-space NFS. I did
>> not test this feature myself. According to several people, the
>> feature, which is rather young, does work but performs poorly. Ganesha
>> on top of LizardFS is a multi-tier setup with a lot of moving parts. I
>> think it will take some time for it to reach production quality, if
>> ever.
>>
>> In theory, Ganesha is compatible with kerberized NFS, which would be a
>> far more secure solution than the current mfsmount client, enabling
>> its use in public/hostile environments. I don't know if MooseFS 4.0
>> has improved on this matter.
>>
>> * Tape server
>> LizardFS includes a tape server daemon for tape archiving. That's
>> another way to implement some kind of chunk lifecycle without storage
>> classes.
>>
>> * IO limits
>> LizardFS includes a new config file dedicated to IO limits. It allows
>> IO limits to be assigned to cgroups. The LFS client negotiates its
>> bandwidth limit with the master and is leased a reserved bandwidth for
>> a given amount of time. The big limitation of this feature is that the
>> reserved bandwidth may not be shared with another client while the
>> original one is not using it. In that case, the reserved bandwidth is
>> simply lost.
>>
>> * Windows client
>> The paid version of LizardFS includes a native Windows client. I think
>> it is built upon some kind of FSAL à la Dokan. The client allows a
>> LizardFS export to be mapped to a drive letter. The client supports
>> Windows ACLs (probably stored as NFSv4 ACLs).
>>
>> * Removed features
>> LizardFS removed chunkserver maintenance mode and authentication codes
>> (AUTH_CODE). Several tabs from the Web UI are also gone, including the
>> one showing quotas. The original CLI tools were replaced by their own
>> versions, which I find harder to use (no more tables, and very verbose
>> output).
>>
>>
>> I've been using MooseFS for several years and never had any problem
>> with it, even in very awkward situations. My feeling is that it is
>> really a rock-solid, battle-tested product.
>>
>> I gave LizardFS a try, mainly for erasure coding and high availability.
>> While the former worked as expected, the latter turned out to be a
>> myth: the free version of LizardFS does not provide more HA than
>> MooseFS CE: in both cases, building an HA solution requires writing
>> custom scripts and relying on a cluster manager such as corosync. I
>> see no added value in using LizardFS for HA.
>>
>> On all other aspects, LizardFS does the same or worse than MooseFS. I
>> found performance to be roughly equivalent between the two (provided
>> you disable fsync on LizardFS chunkservers, where it is enabled by
>> default). Both solutions are still similar in many aspects, yet
>> LizardFS is clouded by a few negative points: ACLs are hardly usable,
>> custom goals are less powerful than storage classes and less
>> convenient for geo-replication, FreeBSD support is nonexistent, CLI
>> tools are less efficient, and native NFS support is too young to be
>> really usable.
>>
>> After a few months, I came to the conclusion that migrating to
>> LizardFS was not worth the single erasure coding feature, especially
>> now that MooseFS 4.0 CE with EC is officially announced. I'd rather
>> buy a few more drives and cope with standard copies for a while than
>> ditch MooseFS reliability for LizardFS.
>>
>> Hope it helps,
>>
>> Marin
>
> A few corrections:
>
> 1. MooseFS Pro also includes a Windows client.
>
> 2. LizardFS did not "remove" tabs from the web UI: these tabs were
> added by MooseFS after LizardFS had forked the code base.
From: <li...@ol...> - 2018-05-25 07:26:16
|
It seems so: https://moosefs.com/blog/moosefs-and-high-performance-computing-isc-2017-exhibition/

Marin.

From: Alexander AKHOBADZE
Sent: Friday, 25 May 2018 09:10
To: MooseFS-Users
Subject: Re: [MooseFS-Users] MooseFS 4.0 and new features

> Hi All!
>
> Is it true? I'm asking about "MooseFS Pro also includes a Windows client."
>
> If yes, where can I read more about it?
>
> On 20.05.2018 16:08, Marin Bernard wrote:
> [...]
From: Marco M. <mar...@gm...> - 2018-05-25 12:18:29
|
On 05/25/2018 03:10 AM, Alexander AKHOBADZE wrote:
>
> Hi All!
>
> Is it true? I'm asking about "MooseFS Pro also includes a Windows client."
>
> If yes, where can I read more about it?

MooseFS Pro is not free. I have to pay to get it.

--
Marco