From: Jun C. P. <jun...@gm...> - 2011-02-08 16:36:27
> If you'd be eager to experiment, let us know, we will prepare a special
> version with this option for tests.

We are very keen to use MFS in our production environment because it
seems to fit most of our needs well. However, the fdatasync() problem is
a critical issue that I have to clarify before we make our final
decision. It would be greatly appreciated if you could provide the
special version with the option of bypassing fdatasync() as soon as
possible. (A minimal reproduction of the write pattern and a possible
interim workaround are sketched at the end of this message.)

Thanks,

-Jun

2011/2/8 Michal Borychowski <mic...@ge...>:
> Hi Jun and Thomas!
>
> We are aware of the "non-persistent connections" issue and we will
> improve this behavior soon.
>
> But unfortunately persistent connections would not help much in this
> scenario. fdatasync() forces an immediate dispatch of all data from the
> cache to the disks - a costly operation (even if the connection were
> sustained) which, generally speaking, eliminates all the benefit of
> having the cache.
>
> We can add to mfsmount an option like "mfsignorefsync" which would make
> fdatasync() do nothing, so the data would be sent at its own pace and a
> single connection would be used for the whole group of data.
>
> If you'd be eager to experiment, let us know, we will prepare a special
> version with this option for tests.
>
> Regards
> -Michal
>
>
> From: Thomas S Hatch [mailto:tha...@gm...]
> Sent: Tuesday, February 08, 2011 12:32 AM
> To: Jun Cheol Park
> Cc: moosefs-users
> Subject: Re: [Moosefs-users] Why too slow in calling fdatasync()? (or futex())
>
> On Mon, Feb 7, 2011 at 4:25 PM, Jun Cheol Park <jun...@gm...> wrote:
>
> Hi,
>
> I found a more specific case of the performance slowness issue that I
> experienced before.
>
> One of the important commands when using KVM (Kernel-based Virtual
> Machine) is qemu-img, which generates a special file format (i.e.,
> qcow2), as follows:
>
> # strace qemu-img convert -f qcow2 -O qcow2 ........
>
> read(4, "\f\0\0\0\0\0\0\0\0\0\0\0\260r\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
> pwrite(5, "\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1"..., 512, 196608) = 512
> fdatasync(5) = 0
> futex(0x63eec4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x63eec0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> select(5, [4], [], NULL, NULL) = 1 (in [4])
> read(4, "\f\0\0\0\0\0\0\0\0\0\0\0\260r\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
> pwrite(5, "\200\0\0\0\0\27\0\0\200\0\0\0\0\30\0\0\200\0\0\0\0\31\0\0\200\0\0\0\0\32\0\0"..., 512, 263168) = 512
> fdatasync(5)
> futex(0x63eec4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x63eec0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>
> The problem with this command (unlike 'cp', which never calls
> fdatasync()) is that fdatasync() and futex() are called for every
> pwrite() operation. As a result, MFS write throughput drops
> significantly, from 60 MB/s (with 'cp') to 1 MB/s (with qemu-img). I
> also noticed via netstat (many sockets in TIME_WAIT) that, while the
> qemu-img command is running, there are a lot of non-persistent TCP
> connections.
>
> Is there any way to improve this situation?
>
> Thanks,
>
> -Jun
> Thanks Jun, that explains a lot. I usually prepare my qcow images on a
> build machine with local disks, and then my datacenters running MooseFS
> pull from the build machine over the internet. But I have never been
> able to get qemu-img convert to create a qcow image on a MooseFS mount;
> this answers a lot of questions here, and it raises my curiosity about
> the connection handling.
>
> As I said before, I trust the MooseFS devs, but I wonder if I can ask,
> from an academic perspective: why has it been designed this way, and
> could this be an opportunity for performance improvement?
>
> -Thomas S Hatch
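
For reference, the pattern in the strace output quoted above is easy to
reproduce without qemu-img. The following is a minimal sketch (not part
of the original report; the mount path, block size and iteration count
are arbitrary placeholders) that imitates it: a small pwrite() followed
by an fdatasync() on every iteration. Running it on an MFS mount once
with the sync enabled and once with it disabled should show the same
kind of gap as the 60 MB/s vs. 1 MB/s difference reported above.

/* syncbench.c - imitate qemu-img's pwrite()+fdatasync() pattern.
 * Hypothetical file; the default path and counts are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "/mnt/mfs/syncbench.dat";
    int do_sync = (argc > 2) ? atoi(argv[2]) : 1;  /* 1 = fdatasync() after each pwrite() */
    char buf[512];
    memset(buf, 0xAA, sizeof(buf));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (int i = 0; i < 2000; i++) {
        /* 512-byte write at an increasing offset, like the trace above */
        if (pwrite(fd, buf, sizeof(buf), (off_t)i * 512) != (ssize_t)sizeof(buf)) {
            perror("pwrite"); return 1;
        }
        if (do_sync && fdatasync(fd) != 0) {
            perror("fdatasync"); return 1;
        }
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("2000 x 512 B, sync=%d: %.2f s (%.1f KB/s)\n",
           do_sync, secs, 2000 * 512 / 1024.0 / secs);
    close(fd);
    return 0;
}

Compile with something like "gcc -O2 -o syncbench syncbench.c" and compare
"./syncbench /mnt/mfs/test.dat 1" against "./syncbench /mnt/mfs/test.dat 0".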
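
Until an option like the "mfsignorefsync" mentioned above exists, one
possible stop-gap (not something proposed in this thread, and it throws
away the durability guarantee the syncs provide, so it is only suitable
for data you can regenerate) is to neutralise fdatasync()/fsync() just
for the offending process with an LD_PRELOAD shim, in the same spirit as
tools such as libeatmydata. The file names and paths below are
placeholders.

/* nosync.c - LD_PRELOAD shim that turns fdatasync()/fsync() into no-ops.
 * A crash or power loss can lose data the application believed was
 * already safe on disk. */
#include <unistd.h>

int fdatasync(int fd) { (void)fd; return 0; }
int fsync(int fd)     { (void)fd; return 0; }

Build and use roughly like this:

  gcc -shared -fPIC -o nosync.so nosync.c
  LD_PRELOAD=$PWD/nosync.so qemu-img convert -f qcow2 -O qcow2 src.qcow2 /mnt/mfs/dst.qcow2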