#243 Linux futex support for locks

trunk
closed-accepted
core (47)
5
2013-01-22
2012-09-04
Ryan Bullock
No

This patch adds the option of using futexes under Linux with FAST_LOCK.

The implementation is taken from http://people.redhat.com/drepper/futex.pdf and modified to add an adaptive wait loop for the case of a single waiter.

It uses a modified version of the tsl() function from fastlock.h as the atomic xchg operation, and gcc builtins for the atomic cmpxchg.
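
For readers who have not seen the scheme from the paper, here is a minimal sketch of a three-state futex lock with a bounded spin before sleeping. It is not the code from the patch: the names futex_lock_t, futex_lock_get, futex_lock_release, sys_futex and SPIN_LIMIT are hypothetical, and gcc's __sync_lock_test_and_set builtin stands in for the tsl()-based xchg. It assumes Linux and gcc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Lock states as in Drepper's paper. */
enum { UNLOCKED = 0, LOCKED = 1, CONTENDED = 2 };

/* Hypothetical spin budget before sleeping; not taken from the patch. */
#define SPIN_LIMIT 1024

typedef volatile int futex_lock_t;

/* glibc has no futex() wrapper, so go through syscall(). The non-PRIVATE
 * ops are used so the lock would also work in memory shared by processes. */
static long sys_futex(futex_lock_t *addr, int op, int val)
{
    return syscall(SYS_futex, (int *)addr, op, val, NULL, NULL, 0);
}

static void futex_lock_init(futex_lock_t *l)
{
    *l = UNLOCKED;
}

static void futex_lock_get(futex_lock_t *l)
{
    int c, spins;

    /* Fast path: uncontended acquire, no system call. */
    c = __sync_val_compare_and_swap(l, UNLOCKED, LOCKED);
    if (c == UNLOCKED)
        return;

    /* Adaptive wait (assumed shape): while the lock is held but nobody
     * is sleeping on it, retry a bounded number of times. */
    for (spins = 0; c == LOCKED && spins < SPIN_LIMIT; spins++) {
        c = __sync_val_compare_and_swap(l, UNLOCKED, LOCKED);
        if (c == UNLOCKED)
            return;
    }

    /* Slow path: mark the lock contended and sleep until it is free.
     * The xchg returns the previous state; UNLOCKED means we now own it. */
    if (c != CONTENDED)
        c = __sync_lock_test_and_set(l, CONTENDED);
    while (c != UNLOCKED) {
        sys_futex(l, FUTEX_WAIT, CONTENDED);
        c = __sync_lock_test_and_set(l, CONTENDED);
    }
}

static void futex_lock_release(futex_lock_t *l)
{
    /* Full-barrier decrement: LOCKED -> UNLOCKED needs no system call. */
    int prev = __sync_fetch_and_sub(l, 1);

    assert(prev != UNLOCKED); /* releasing a lock that was not held */
    if (prev != LOCKED) {
        /* There may be sleepers: reset the state and wake one of them. */
        *l = UNLOCKED;
        sys_futex(l, FUTEX_WAKE, 1);
    }
}
```

The point of the design is that the uncontended acquire and release never enter the kernel, and a waiter only sleeps after the spin budget is exhausted or another waiter is already asleep.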

To use futexes, FAST_LOCK and USE_FUTEX must both be defined. Since the logic for using the futex is notably different from FAST_LOCK, it is implemented separately in futex_lock.h; however, FAST_LOCK must still be defined to ensure support for the atomic_xchg.
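
As an illustration of the requirement above, the compile-time selection could look roughly like the sketch below. The LOCK_BACKEND_NAME macro, the lock_backend_name() helper and the #error guard are hypothetical and only show the intended dependency between the two defines. In a real build both macros would come from the compiler flags (for example -DFAST_LOCK -DUSE_FUTEX); they are defined inline here so the fragment stands alone.

```c
/* Normally passed on the compiler command line; defined here only so
 * that this sketch is self-contained. */
#define FAST_LOCK
#define USE_FUTEX

/* Hypothetical guard: USE_FUTEX alone is not a supported combination,
 * because the futex code relies on the atomic xchg from FAST_LOCK. */
#if defined(USE_FUTEX) && !defined(FAST_LOCK)
#error "USE_FUTEX requires FAST_LOCK to be defined as well"
#endif

#if defined(FAST_LOCK) && defined(USE_FUTEX)
/* This branch is where futex_lock.h would be included. */
#define LOCK_BACKEND_NAME "futex"
#elif defined(FAST_LOCK)
#define LOCK_BACKEND_NAME "fastlock"
#else
#define LOCK_BACKEND_NAME "other"
#endif

/* Report which locking backend was selected at compile time. */
static const char *lock_backend_name(void)
{
    return LOCK_BACKEND_NAME;
}
```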

In my testing, running opensips in a virtual machine allocated 2 processors, with 16 children in a fairly pathological locking scenario, this shows a large performance boost over plain FAST_LOCK.

Some benchmark numbers after 5 minutes of runtime, with sipp attempting 1000 CPS with 6-second durations:
USE_FUTEX
Sep 4 15:06:00 labproxy /usr/local/sbin/opensips[17196]: benchmark (timer lock_test [0]): 309 [ msgs/total/min/max/avg - LR: 100/14797/21/2049/147.970000 | GB: 599600/98709250/0/20102/987092.500000]
Sep 4 15:06:00 labproxy /usr/local/sbin/opensips[17202]: benchmark (timer lock_test [0]): 120 [ msgs/total/min/max/avg - LR: 100/15584/85/317/155.840000 | GB: 599700/98724834/0/20102/987248.340000]
Sep 4 15:06:00 labproxy /usr/local/sbin/opensips[17192]: benchmark (timer lock_test [0]): 150 [ msgs/total/min/max/avg - LR: 100/19241/86/4076/192.410000 | GB: 599800/98744075/0/20102/987440.750000]
Sep 4 15:06:00 labproxy /usr/local/sbin/opensips[17205]: benchmark (timer lock_test [0]): 124 [ msgs/total/min/max/avg - LR: 100/22005/74/6166/220.050000 | GB: 599900/98766080/0/20102/987660.800000]
Sep 4 15:06:00 labproxy /usr/local/sbin/opensips[17206]: benchmark (timer lock_test [0]): 105 [ msgs/total/min/max/avg - LR: 100/19022/21/848/190.220000 | GB: 600000/98785102/0/20102/987851.020000]
Sep 4 15:06:01 labproxy /usr/local/sbin/opensips[17203]: benchmark (timer lock_test [0]): 155 [ msgs/total/min/max/avg - LR: 100/18633/85/1386/186.330000 | GB: 600100/98803735/0/20102/988037.350000]
Sep 4 15:06:01 labproxy /usr/local/sbin/opensips[17203]: benchmark (timer lock_test [0]): 107 [ msgs/total/min/max/avg - LR: 100/18284/81/2873/182.840000 | GB: 600200/98822019/0/20102/988220.190000]
Sep 4 15:06:01 labproxy /usr/local/sbin/opensips[17207]: benchmark (timer lock_test [0]): 106 [ msgs/total/min/max/avg - LR: 100/16327/83/747/163.270000 | GB: 600300/98838346/0/20102/988383.460000]
Sep 4 15:06:01 labproxy /usr/local/sbin/opensips[17204]: benchmark (timer lock_test [0]): 40 [ msgs/total/min/max/avg - LR: 100/16664/27/5093/166.640000 | GB: 600400/98855010/0/20102/988550.100000]
Sep 4 15:06:01 labproxy /usr/local/sbin/opensips[17205]: benchmark (timer lock_test [0]): 43 [ msgs/total/min/max/avg - LR: 100/4938/26/222/49.380000 | GB: 600500/98859948/0/20102/988599.480000]

FAST_LOCK
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2360]: benchmark (timer lock_test [0]): 159 [ msgs/total/min/max/avg - LR: 100/16685/78/380/166.850000 | GB: 600600/347519878/0/137058/3475198.780000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2361]: benchmark (timer lock_test [0]): 153 [ msgs/total/min/max/avg - LR: 100/24353/0/2216/243.530000 | GB: 600700/347544231/0/137058/3475442.310000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2355]: benchmark (timer lock_test [0]): 95 [ msgs/total/min/max/avg - LR: 100/16816/75/465/168.160000 | GB: 600800/347561047/0/137058/3475610.470000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2359]: benchmark (timer lock_test [0]): 131 [ msgs/total/min/max/avg - LR: 100/20270/81/3535/202.700000 | GB: 600900/347581317/0/137058/3475813.170000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2346]: benchmark (timer lock_test [0]): 87 [ msgs/total/min/max/avg - LR: 100/17892/87/872/178.920000 | GB: 601000/347599209/0/137058/3475992.090000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2361]: benchmark (timer lock_test [0]): 178 [ msgs/total/min/max/avg - LR: 100/16425/81/354/164.250000 | GB: 601100/347615634/0/137058/3476156.340000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2351]: benchmark (timer lock_test [0]): 45 [ msgs/total/min/max/avg - LR: 100/14298/0/1367/142.980000 | GB: 601200/347629932/0/137058/3476299.320000]
Sep 4 15:15:44 labproxy /usr/local/sbin/opensips[2350]: benchmark (timer lock_test [0]): 0 [ msgs/total/min/max/avg - LR: 100/119421/0/60350/1194.210000 | GB: 601300/347749353/0/137058/3477493.530000]
Sep 4 15:15:45 labproxy /usr/local/sbin/opensips[2355]: benchmark (timer lock_test [0]): 29 [ msgs/total/min/max/avg - LR: 100/554685/0/64421/5546.850000 | GB: 601400/348304038/0/137058/3483040.380000]
Sep 4 15:15:45 labproxy /usr/local/sbin/opensips[2358]: benchmark (timer lock_test [0]): 38 [ msgs/total/min/max/avg - LR: 100/206816/0/28935/2068.160000 | GB: 601500/348510854/0/137058/3485108.540000]

While running the benchmark, vmstat reported drastically reduced context switching for USE_FUTEX compared to FAST_LOCK.
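
To make the logs easier to compare: in each line the LR group always covers 100 messages while the GB group's message count keeps growing, so GB appears to be the cumulative total. The avg printed for GB equals total/100 (for example 98859948/100 = 988599.48), so it does not look like a per-message mean. The sketch below recomputes total/msgs from the GB fields; bm_stats, bm_parse and bm_mean are hypothetical helper names, and the time unit is whatever the benchmark reports.

```c
#include <stdio.h>

/* The first four fields of a "msgs/total/min/max/avg" group. */
struct bm_stats {
    unsigned long long msgs, total, min, max;
};

/* Parse one group; returns 0 on success, -1 if the text does not match. */
static int bm_parse(const char *field, struct bm_stats *s)
{
    int n = sscanf(field, "%llu/%llu/%llu/%llu",
                   &s->msgs, &s->total, &s->min, &s->max);
    return n == 4 ? 0 : -1;
}

/* Per-message mean, recomputed from total and msgs. */
static double bm_mean(const struct bm_stats *s)
{
    return s->msgs ? (double)s->total / (double)s->msgs : 0.0;
}
```

Applied to the GB group of the last line of each run, "600500/98859948/0/20102/..." for USE_FUTEX and "601500/348510854/0/137058/..." for FAST_LOCK, this gives a mean of about 164.6 versus 579.4 per message, roughly a 3.5x difference, and the maximum drops from 137058 to 20102.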

Tested on Scientific Linux 6.2 with gcc 4.4.6. It currently produces a couple of warnings, but otherwise seems to work fine.
It compiles with gcc 4.7 on Debian unstable without warnings.

Comments and criticisms are welcome. I don't normally write C, so if anything looks incorrect or is bad style, please let me know.

I'm hoping others will be able to test and benchmark this as well.

Discussion

  • Ryan Bullock
    2012-10-16

    Linux futex support for locks - fixes warnings

     
  • Ryan Bullock
    2012-10-16

    Uploaded a new revision of the patch that fixes the warnings on older compilers.

     
    • assigned_to: nobody --> vladut-paiu
    • status: open --> open-accepted
     
    • status: open-accepted --> closed-accepted
     
  • Hello Ryan,

    Committed the patch on OpenSIPS trunk.
    I tested it out, and it does indeed bring a significant performance improvement over the FAST_LOCK mechanism in a lock-heavy environment.

    Thank you!

    Best Regards,
    Vlad