#559 Solaris 11 and 10.9.x locking problem

None
closed
nobody
1
2014-09-18
2014-04-15
No

I have encountered a problem with file locking. It appears to affect only Solaris servers serving Mac OS X 10.8 and later clients (including 10.9 Mavericks). The combinations below are server OS + Mac OS X client version:

Linux + any version - not affected
Solaris 11 + 10.6 - not affected
Solaris 11 + 10.7 - not affected

Solaris 11 + 10.8 - PROBLEM
Solaris 11 + 10.9.x - PROBLEM

I tested all of the cases above with netatalk 3.0.5 and 3.1.1, and both behave exactly the same. Apparently the problem is connected to the AFP 3.3 and 3.4 protocol changes that Apple introduced in the new 10.9 OS.

The lock disappears after 10 or 15 seconds because the same process opens and closes the file a second time, which causes the kernel to remove all of its flocks and fshares on that file.
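
The effect can be sketched outside of netatalk. This is only an illustration, assuming a POSIX system where Python's fcntl.lockf() is backed by fcntl() record locks; the helper names lock_is_held and demo are made up for this example. A child process probes whether the lock is still held before and after the second open/close.

```python
import fcntl
import os
import tempfile


def lock_is_held(path):
    """Return True if a child process is refused a POSIX write lock on path."""
    pid = os.fork()
    if pid == 0:
        status = 2  # unexpected failure in the child
        try:
            fd = os.open(path, os.O_RDWR)
            try:
                # Non-blocking attempt to take an exclusive record lock.
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0  # acquired: the parent holds no lock
            except OSError:
                status = 1  # refused: the parent still holds its lock
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 1


def demo():
    """Lock a temp file, then open and close it again in the same process."""
    fd, path = tempfile.mkstemp()
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)    # fcntl()-style lock on the first descriptor
        held_before = lock_is_held(path)
        fd2 = os.open(path, os.O_RDONLY)  # second open by the same process
        os.close(fd2)                     # closing it releases the process's locks on the file
        held_after = lock_is_held(path)
    finally:
        os.close(fd)
        os.unlink(path)
    return held_before, held_after
```

Under POSIX fcntl() semantics, demo() should report the lock as held before the second close and gone afterwards, even though the first descriptor is never closed.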

Below is the output from LD_PRELOAD'ed wrappers for fcntl(), open() and close():

open64[14034]: blah with flags 131074 (O_RDWR) and ret 17
fcntl[14034]: 17 with cmd 40 F_SHARE (3 , 0)
fcntl[14034]: 17 with cmd 33 F_GETLK and RET 0 (l_type 3)
fcntl[14034]: 17 with cmd 33 F_GETLK and RET 0 (l_type 3)
fcntl[14034]: 17 with cmd 34 F_SETLK (F_RDLCK) and ret 0

fcntl[14034]: 17 with cmd 33 F_GETLK and RET 0 (l_type 3)
fcntl[14034]: 17 with cmd 33 F_GETLK and RET 0 (l_type 3)
fcntl[14034]: 17 with cmd 34 F_SETLK (F_RDLCK) and ret 0

( ... )

fcntl[14034]: 17 with cmd 33 F_GETLK and RET 0 (l_type 3)
open64[14034]: blah with flags 131072 () and ret 18
close[14034]: 18

Discussion

  • Ralph Böhme

    Ralph Böhme - 2014-04-16

    Have you tried setting "solaris share reservations = no"? It might be a workaround.

    Also, take advantage of the help on the mailing list, and perhaps bring in someone
    to assist with your problem.

     
  • Peter Baranski

    Peter Baranski - 2014-04-17

    The "solaris share reservations" and "afp read locks" settings were the first things I checked.

    I've tried "solaris share reservations = no" with and without the zfs nbmand attribute. I also compiled the source without HAVE_FSHARE_T, as this was the obvious difference between the Solaris and Linux config.h files, but it made no difference.

    Solaris F_SHAREs are set and unset correctly, and they act as an additional layer of locks (cross-protocol locks). They do not exclude POSIX locks.

    The origin of the problem, and of the difference between Solaris and Linux behaviour, is the implementation of the POSIX locking mechanism.

    The script below gives different results on the two systems.

    import os,sys,fcntl
    f = os.open("blah",os.O_RDWR|os.O_CREAT)
    fcntl.flock(f,fcntl.LOCK_EX)
    f1 = os.open("blah",os.O_RDWR)
    os.close(f1)
    # use lsof to check lock state

    At this point on Solaris, all locks are removed.

    The same thing happens when we run a similar test over AFP. Everything is fine until some kind of periodic "testlock" routine opens and closes our previously locked file.

     
    • Ralph Böhme

      Ralph Böhme - 2014-04-17

      I've tried "solaris share reservations = no" with and without zfs nbmand
      attribute. I also compiled source without HAVE_FSHARE_T as this was the
      obvious difference between solaris and linux config.h file but it made
      no difference. Solaris F_SHARE's are set and unset correctly and they
      play the role of an additional layer of locks (cross protocol locks)
      They do not exclude POSIX locks.

      I know, I wrote that code. :)

      The origin of the problem and difference between solaris and linux
      behaviour, is the implementation of POSIX locking mechanism. script
      below will give different results.

      import os,sys,fcntl
      f = os.open("blah",os.O_RDWR|os.O_CREAT)
      fcntl.flock(f,fcntl.LOCK_EX)
      f1 = os.open("blah",os.O_RDWR)
      os.close(f1)
      # use lsof to check lock state

      at this point on Solaris OS all locks are removed.

      This behaviour is mandated by POSIX and would be the same on Linux.

      Afair I've changed locking semantics in Netatalk 3 compared to 2. It's a long story, but it boils down to "keep it simple and don't try to be overly clever". Therefore Netatalk 3 doesn't maintain its own per-file lock state. This simplifies things greatly, but might expose some corner cases. Looks like you may be running into one. :) Still no idea why you only have it on Solaris, not on Linux.

       
    • Ralph Böhme

      Ralph Böhme - 2014-04-26

      I've tried "solaris share reservations = no" with and without zfs nbmand attribute.

      As I'm just working with another OEM who has a customer where saving from Adobe AE fails and setting "solaris share reservations = no" actually fixes the issue, did you double check you had set the option right? You have to place it in the global section and restart Netatalk.

       
      • Peter Baranski

        Peter Baranski - 2014-05-22

        Hello Ralph

        My apologies for the late response. I was unavailable for a few weeks.
        I will double-check all related options and let you know.

         
  • Peter Baranski

    Peter Baranski - 2014-04-17

    I know, I wrote that code. :)

    ... this is more than excellent !!! now I'm sure that my problem will be fixed in no time :)

    import os,sys,fcntl
    f = os.open("blah",os.O_RDWR|os.O_CREAT)
    fcntl.flock(f,fcntl.LOCK_EX)
    f1 = os.open("blah",os.O_RDWR)
    os.close(f1)
    # use lsof to check lock state

    at this point on Solaris OS all locks are removed.

    This behaviour is mandated by POSIX and would be the same on Linux.

    You are right. The code above gives different results on Solaris and Linux because of how Python implements the exclusive lock: on Linux it uses flock(), but on Solaris it is mapped to fcntl(). My mistake. I chose the wrong method for testing.
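
For comparison, here is a rough sketch of the flock() side of that difference. It assumes a Linux host, where Python's fcntl.flock() calls flock(2) directly; on a system where flock is emulated with fcntl(), the result would be expected to differ. The names refused_elsewhere and flock_survives_second_close are illustrative only.

```python
import fcntl
import os
import tempfile


def refused_elsewhere(path):
    """Return True if a child process cannot take a flock() lock on path."""
    pid = os.fork()
    if pid == 0:
        status = 2  # unexpected failure in the child
        try:
            fd = os.open(path, os.O_RDWR)  # new open file description
            try:
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0  # acquired: nobody else holds the lock
            except OSError:
                status = 1  # refused: the original lock is still in place
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 1


def flock_survives_second_close():
    """Take a flock() lock, open and close the file again, then probe the lock."""
    fd, path = tempfile.mkstemp()
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # lock tied to this open file description
        fd2 = os.open(path, os.O_RDWR)   # second, unrelated descriptor
        os.close(fd2)                    # does not touch the flock() lock on Linux
        return refused_elsewhere(path)
    finally:
        os.close(fd)
        os.unlink(path)
```

On Linux the lock should still be held after the second close, which is why a flock()-based test hides the problem that fcntl() locks show.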

    The equivalent C code, which calls fcntl() directly, shows the same result on both systems.

    (...)
    fl.l_pid = getpid();
    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    (...)
    fcntl(fd, F_SETLKW, &fl);

    Still no idea why you only have it on Solaris, not on Linux.

    On Linux I only tested the following scenario:
    (any version of Mac OS X) + netatalk 3.0.5

    so 10.9.x falls back to the AFP 3.3 protocol.

    However, forcing the AFP 3.3 protocol between 10.9.x and Solaris didn't help.

    I will test netatalk 3.1.1 on Linux and let you know.

     
  • Peter Baranski

    Peter Baranski - 2014-04-18

    To summarise: I can confirm that on Linux every version of OS X works with every version of netatalk. On Solaris there is a problem only with Mac OS X 10.8, 10.9.1, 10.9.2 and 10.9.3.

    The Solaris strace output shows periodic checks
    (F_GETLK, open(), close()) on the locked file.

    On Linux this mechanism is absent.

     
    • Ralph Böhme

      Ralph Böhme - 2014-04-18

      Interesting, without further analysis I can't tell why behaviour differs between Solaris and Linux. Should be the same if share reservations are disabled.

       
  • Peter Baranski

    Peter Baranski - 2014-06-13

    Hello Ralph

    I managed to find a little time yesterday to analyse the problem again.

    Setting "solaris share reservations = no" does not change anything, because the problem lies in the EA handling.

    I found out that the system calls are correlated with retrieving EAs. I confirmed this by running "zfs set xattr=off": Netatalk fell back to EA_AD and the problem went away.

    The problem probably affects all versions of Netatalk and Mac OS X when running on Solaris with SYS EAs. You can easily reproduce it in the following way:

    python
    import os,sys,fcntl
    f = os.open("blah1",os.O_RDWR|os.O_CREAT)

    In another terminal, run ls -la or xattr and your lock is gone.

    This problem started to manifest itself because newer versions of Mac OS X became more active in reading EAs (even just after file creation).

    Solaris keeps EAs in files inside a hidden folder related to the file with the same name. To read any EA, the file has to be opened first so that the hidden folder can be opened. This is the second open I saw during debugging.

    It is exactly "solaris_attropen" from extattr.c, which does the following:

    opening the file
    filedes = open(path, O_RDONLY | (oflag & O_NOFOLLOW), mode);

    opening the EA folder
    eafd = openat(filedes, attrpath, oflag | O_XATTR, mode);

    and here is the close() of the second FD, which releases the locks and shares
    if (filedes != -1)
        close(filedes);

    Probably, somewhere in the calling code, instead of:

    sys_getxattr() -> solaris_attropen()

    the following should be used:

    sys_getxattrfd() -> solaris_openat()

    whenever there is a lock on the file in question.
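
The dispatch idea can be sketched roughly as follows. This is not netatalk code: OpenFiles is a hypothetical stand-in for afpd's table of open forks, and read_by_fd / read_by_path are placeholder callables for the descriptor-based and path-based EA readers.

```python
import os


class OpenFiles:
    """Hypothetical table of files this process already has open (path -> fd)."""

    def __init__(self):
        self._by_path = {}

    def register(self, path, fd):
        self._by_path[os.path.realpath(path)] = fd

    def find(self, path):
        # Returns the descriptor, or None if the file is not open.
        return self._by_path.get(os.path.realpath(path))


def read_metadata(path, open_files, read_by_fd, read_by_path):
    """Prefer the descriptor-based reader when the file is already open."""
    fd = open_files.find(path)
    if fd is not None:
        # Reuse the existing descriptor: no extra open()/close() on the
        # locked file, so the process's locks are left alone.
        return read_by_fd(fd)
    # Path-based access opens and closes the file; only safe when this
    # process holds no lock on it.
    return read_by_path(path)
```

The point of the design is simply that the path-based branch is never taken for a file the process already holds open, and possibly locked.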

     
  • Ralph Böhme

    Ralph Böhme - 2014-06-17

    Well done, I'll see if I can fix it based on your findings.

     
  • Ralph Böhme

    Ralph Böhme - 2014-06-17

    Hm, in fact the relevant code in libatalk/adouble/ad_open.c that deals with reading the metadata from the xattr should already use the appropriate function sys_fgetxattr():

    https://github.com/Netatalk/Netatalk/blob/branch-netatalk-3-1/libatalk/adouble/ad_open.c#L740

    Iirc this depends on all functions in etc/afpd/*.c checking whether a file is already open and passing the struct adouble handle referenced in the open fork handle. Looks like this needs more investigation.

     
  • Peter Baranski

    Peter Baranski - 2014-06-18

    The order of the function calls is indeed correct.

    I have added some log entries to the code, and I can see that the second run of
    ad_header_read_ea reports the file as closed.

    The log below corresponds to an "ls" command on Mac OS X. This time I used 10.6.3.
    I am also attaching the full max debug log.

    afpd[6058] {ad_open.c:738} (D5:Default): ad_header_read_ea("/data/part1/blah4"): BEGIN | pbaranski
    afpd[6058] {ad_open.c:741} (D5:Default): ad_header_read_ea("/data/part1/blah4"): file is already opened, calling sys_fgetxattr | pbaranski
    afpd[6058] {extattr.c:220} (D5:Default): sys_fgetxattr: org.netatalk.Metadata HAVE_ATTROPEN before solaris_openat | pbaranski
    afpd[6058] {extattr.c:921} (D5:Default): solaris_openat: ("/data/part1/org.netatalk.Metadata") BEGIN | pbaranski
    afpd[6058] {extattr.c:223} (D5:Default): sys_fgetxattr: org.netatalk.Metadata HAVE_ATTROPEN before solaris_read_xattr | pbaranski
    afpd[6058] {extattr.c:220} (D5:Default): sys_fgetxattr: org.netatalk.ResourceFork HAVE_ATTROPEN before solaris_openat | pbaranski
    afpd[6058] {extattr.c:921} (D5:Default): solaris_openat: ("/data/part1/org.netatalk.ResourceFork") BEGIN | pbaranski
    afpd[6058] {ad_open.c:738} (D5:Default): ad_header_read_ea("/data/part1/blah4"): BEGIN | pbaranski
    afpd[6058] {ad_open.c:744} (D5:Default): ad_header_read_ea("/data/part1/blah4"): file is closed, calling sys_getxattr | pbaranski
    afpd[6058] {extattr.c:163} (D5:Default): sys_getxattr: /data/part1/blah4 HAVE_ATTROPEN before solaris_attropen | pbaranski
    afpd[6058] {extattr.c:880} (D5:Default): solaris_attropen: ("/data/part1/blah4") BEGIN | pbaranski
    afpd[6058] {extattr.c:166} (D5:Default): sys_getxattr: /data/part1/blah4 HAVE_ATTROPEN before solaris_read_xattr | pbaranski
    afpd[6058] {extattr.c:270} (D5:Default): sys_lgetxattr: /data/part1/blah4 HAVE_ATTROPEN - before solaris_attropen | pbaranski
    afpd[6058] {extattr.c:880} (D5:Default): solaris_attropen: ("/data/part1/blah4") BEGIN | pbaranski
    afpd[6058] {extattr.c:524} (D5:Default): sys_llistxattr: /data/part1/blah4 HAVE_ATTROPEN - before solaris_attropen | pbaranski
    afpd[6058] {extattr.c:880} (D5:Default): solaris_attropen: ("/data/part1/blah4") BEGIN | pbaranski
    afpd[6058] {extattr.c:527} (D5:Default): sys_llistxattr: /data/part1/blah4 HAVE_ATTROPEN - before solaris_list_xattr | pbaranski

     
  • Peter Baranski

    Peter Baranski - 2014-08-13

    Hello Ralph

    I finally found time to investigate the problem a bit further and found out that there are actually 2 bugs. I also prepared a patch (for netatalk-3.1.3), which I am including in this post. On Monday I will move these changes to production. As a side effect, I had to extend VFS_FUNC_ARGS_EA_GETCONTENT and VFS_FUNC_ARGS_EA_LIST to be able to pass a file descriptor. I tried to use ad_open() instead of of_findname() in the lower-level functions, but the adouble structure was sometimes not filled in correctly (maybe wrong flags).

    1

    Function chain: afp_getextattr() -> vol->vfs->vfs_geteacontent -> ea_getcontent -> sys_get_eacontent -> sys_getxattr()
    How to reproduce: Using any version of Mac OS X, use the attached Python script to open a file with an exclusive lock. Click on the file in Finder. The lock is gone.

    2

    Function chain: afp_listextattr() -> vol->vfs->vfs_ea_list -> ea_list -> sys_list_eas -> sys_listxattr()
    How to reproduce: Using any version of Mac OS X, use the attached Python script to open a file with an exclusive lock. On the Mac side, run "ls -la blah" or "ls -l@". The lock is gone.

     
    Last edit: Peter Baranski 2014-08-13
  • Ralph Böhme

    Ralph Böhme - 2014-08-15

    I don't have time for an in-depth review, but I've pushed it to autobuild, which runs the test-suite. If it passes I'll push it to branch 3-1.

     
  • Peter Baranski

    Peter Baranski - 2014-08-15

    Here is the latest version. The previous one is buggy.

    This version adds an exclusion for directories and also fixes for:
    - setextattr
    - remextattr
    - getextattr_size

    I will be testing this version in "production" during the next week and will let you know the results.

    PS. The patch also applies to version 3.1.5 with no problems.

     
  • Ralph Böhme

    Ralph Böhme - 2014-08-15

    This looks much better and passes autobuild and the test-suite; the previous patch errored out.
    Again, I haven't done a thorough review or testing, but hey, if we break something subtle elsewhere, I'll assign the bug reports to your account. ;)

    Thanks for contributing! Let me know when you're done with testing in production; I'll then push the patch to branch 3-1.

     
  • Peter Baranski

    Peter Baranski - 2014-09-09

    Hello Ralph

    Everything is working well. I cannot see any side effects with 34 Solaris zones running in the production environment.

    Best Regards

     
  • Ralph Böhme

    Ralph Böhme - 2014-09-18
    • status: open --> closed
    • Group: -->
     
