From: Michal B. <mic...@ge...> - 2011-02-08 08:59:31
Hi Jun and Thomas!

We are aware of the "non-persistent connections" issue and will improve this behavior soon. Unfortunately, persistent connections would not help much in this scenario. fdatasync() forces an immediate dispatch of all data from the cache to the external disks. This is a costly operation (even if the connection were sustained) and, generally speaking, it eliminates all the benefits of having the cache. We could add an option to mfsmount, something like "mfsignorefsync", which would make fdatasync() do nothing; data would then be sent at its own pace, over a single connection for the whole group of data.
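Roughly, the idea in mfsmount's FUSE fsync callback would look like this (an illustrative sketch only; the option name, flag, and helper are assumed for the example, not the actual mfsmount source):

  /* Sketch: with a hypothetical "mfsignorefsync" mount option set,
   * the FUSE fsync callback simply reports success without flushing. */
  #define FUSE_USE_VERSION 26
  #include <fuse.h>
  #include <stdint.h>

  static int ignore_fsync = 0;       /* 1 when "mfsignorefsync" is given */
  static int do_flush(uint64_t fh);  /* placeholder for the real flush logic */

  static int mfs_fsync(const char *path, int isdatasync,
                       struct fuse_file_info *fi)
  {
      (void)path; (void)isdatasync;
      if (ignore_fsync) {
          /* No-op: cached data keeps draining to the chunkservers
           * at its own pace over the existing connection. */
          return 0;
      }
      /* Normal path: push all cached data for this file out now
       * (the costly operation described above). */
      return do_flush(fi->fh);
  }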
If you are eager to experiment, let us know and we will prepare a special version with this option for tests.

Regards
-Michal

From: Thomas S Hatch [mailto:tha...@gm...]
Sent: Tuesday, February 08, 2011 12:32 AM
To: Jun Cheol Park
Cc: moosefs-users
Subject: Re: [Moosefs-users] Why too slow in calling fdatasync()? (or futex())

On Mon, Feb 7, 2011 at 4:25 PM, Jun Cheol Park <jun...@gm...> wrote:

Hi,

I found a more specific case of the performance slowness issue that I experienced before. One of the important commands when using KVM (Kernel-based Virtual Machine) is qemu-img, which generates a special file format (qcow2), as follows:

  # strace qemu-img convert -f qcow2 -O qcow2 ........
  read(4, "\f\0\0\0\0\0\0\0\0\0\0\0\260r\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
  pwrite(5, "\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1\0\1"..., 512, 196608) = 512
  fdatasync(5) = 0
  futex(0x63eec4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x63eec0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
  select(5, [4], [], NULL, NULL) = 1 (in [4])
  read(4, "\f\0\0\0\0\0\0\0\0\0\0\0\260r\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
  pwrite(5, "\200\0\0\0\0\27\0\0\200\0\0\0\0\30\0\0\200\0\0\0\0\31\0\0\200\0\0\0\0\32\0\0"..., 512, 263168) = 512
  fdatasync(5) = 0
  futex(0x63eec4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x63eec0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1

The problem with this command (unlike 'cp', which never calls fdatasync) is that fdatasync() and futex() are called for every pwrite() operation. As a result, MFS write performance drops significantly, from 60 MB/s (with 'cp') to 1 MB/s (with qemu-img). I also noticed via netstat that, while the qemu-img command is running, there are a lot of non-persistent TCP connections (many sockets in TIME_WAIT).
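The pattern is easy to reproduce outside of qemu-img with a minimal test program that calls fdatasync() after every small pwrite() (a sketch; the mount path and sizes are arbitrary):

  /* Sketch: reproduce the qemu-img I/O pattern - one fdatasync()
   * after every 512-byte pwrite(). Compare wall-clock time with
   * and without the fdatasync() call. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/types.h>
  #include <unistd.h>

  int main(void)
  {
      char buf[512];
      memset(buf, 0xAA, sizeof(buf));

      int fd = open("/mnt/mfs/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
      if (fd < 0) { perror("open"); return 1; }

      /* Write 2 MiB in 512-byte chunks, syncing after each one. */
      for (off_t off = 0; off < 4096L * 512; off += 512) {
          if (pwrite(fd, buf, sizeof(buf), off) != (ssize_t)sizeof(buf)) {
              perror("pwrite"); return 1;
          }
          if (fdatasync(fd) != 0) {   /* comment out for 'cp'-like behavior */
              perror("fdatasync"); return 1;
          }
      }
      close(fd);
      return 0;
  }

Timing it with the fdatasync() line commented out should show something close to the 'cp' speed again.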
Is there any way to improve this situation?

Thanks,

-Jun

Thanks Jun, that explains a lot. I usually prepare my qcow images on a build machine with local disks, and then my datacenters running MooseFS pull from the build machine over the internet. I have never been able to get qemu-img convert to create a qcow image on a MooseFS mount, so this answers a lot of questions, and it raises my curiosity about the connection handling.

As I said before, I trust the MooseFS devs, but I wonder if I can ask, from an academic perspective: why has it been designed this way, and could this be an opportunity for performance improvement?

-Thomas S Hatch