This is a conservative workaround to prevent missed wakeups on
EPOLLONESHOT-based systems running old Linux kernels.  Unfortunately
some users cannot upgrade kernels as easily as they can userland
software, so we'll provide this workaround for them.

Newer Linux kernels are immune to the race and do not need the
workaround:
  3.8+, 2.6.32.61+, 2.6.34.15+, 3.0.59+, 3.2.37+, 3.4.26+, 3.5.7.3+, 3.7.3+

ref: Linux kernel commit 128dd1759d96ad36c379240f8b9463e8acfd37a1
     ("epoll: prevent missed events on EPOLL_CTL_MOD")

This kernel bug was probably never triggered by common
level-triggered epoll users, and may not have been exposed on
older, less-scalable kernel versions or on systems with few CPUs.
---
 I'm undecided on whether I want to push this to yahns master, but this
 message will at least serve as documentation in case anybody encounters
 this rare, tiny race condition.  When I first heard of this bug on
 Linux 3.7, I tried for hours to reproduce this on a 4-core machine
 with no luck...

 My personal preference is to push people to newer kernels;
 but that's not always practical, unfortunately :<
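
 For anybody who wants to check a release string by hand, here is a
 minimal standalone sketch of the version comparison the patch relies
 on.  pack_version and parse_release are hypothetical names used only
 for illustration; they are not part of yahns.  It assumes, like the
 patch, that no version component exceeds 255.

```ruby
# Hypothetical helper (not part of the patch): packs up to four version
# components into one Integer, one byte per component, so that ordinary
# integer comparison orders versions correctly.
def pack_version(*v)
  (v[0] || 0) << 24 | (v[1] || 0) << 16 | (v[2] || 0) << 8 | (v[3] || 0)
end

# Hypothetical helper: "3.2.0-4-amd64" => [3, 2, 0]
# String#to_i stops at the first non-digit, so distro suffixes are dropped.
def parse_release(release)
  release.split('.').map(&:to_i)
end

cur   = pack_version(*parse_release("3.4.25"))
fixed = pack_version(3, 4, 26) # first 3.4.y release carrying the backport

# A backport only applies within its own stable series, so compare the
# X.Y prefix (top 16 bits) before comparing the full version.
same_series = (cur >> 16) == (fixed >> 16)
puts "same series: #{same_series}, has fix: #{same_series && cur >= fixed}"
```

 The prefix check is what keeps a release such as 3.6.11 from being
 treated as fixed merely because it is numerically above 3.4.26.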

 lib/yahns/queue_epoll.rb | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
 test/test_queue.rb       | 25 ++++++++++++++++++++++
 2 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 test/test_queue.rb

diff --git a/lib/yahns/queue_epoll.rb b/lib/yahns/queue_epoll.rb
index 4a10ce0..208167a 100644
--- a/lib/yahns/queue_epoll.rb
+++ b/lib/yahns/queue_epoll.rb
@@ -55,9 +55,9 @@ class Yahns::Queue < SleepyPenguin::Epoll::IO # :nodoc:
           # thread only until epoll_ctl is called on it.
           case rv = io.yahns_step
           when :wait_readable
-            epoll_ctl(Epoll::CTL_MOD, io, QEV_RD)
+            epoll_ctl_mod(io, QEV_RD)
           when :wait_writable
-            epoll_ctl(Epoll::CTL_MOD, io, QEV_WR)
+            epoll_ctl_mod(io, QEV_WR)
           when :ignore # only used by rack.hijack
             # we cannot call Epoll::CTL_DEL after hijacking, the hijacker
             # may have already closed it  Likewise, io.fileno is not
@@ -76,4 +76,54 @@ class Yahns::Queue < SleepyPenguin::Epoll::IO # :nodoc:
       end while true
     end
   end
+
+  # work around EPOLL_CTL_MOD raciness with EPOLLONESHOT on SMP systems.
+  # ref: Linux commit 128dd1759d96ad36c379240f8b9463e8acfd37a1
+  # ("epoll: prevent missed events on EPOLL_CTL_MOD")
+  # We'll be conservative and assume bugginess on older kernels.
+  def self.epoll_ctl_mod_buggy?(uname)
+    # maybe somebody ported epoll to non-Linux, assume it works:
+    uname[:sysname] == "Linux" or return false
+
+    # converts a version array (e.g. %w(2 6 32 61)) into an integer,
+    # no official Linux kernel version component exceeds 255 currently.
+    ver = lambda { |*v|
+      v[0] << 24 | (v[1] || 0) << 16 | (v[2] || 0) << 8 | (v[3] || 0)
+    }
+    release = uname[:release].split(/\./).map(&:to_i)
+    cur = ver[*release]
+
+    # all 3.8+ kernels are good (not buggy)
+    return false if cur >= ver[3,8]
+
+    # some stable versions have the relevant patch backported,
+    # most of these are on kernel.org:
+    # git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
+    # 3.5.7 stable is from: git://kernel.ubuntu.com/ubuntu/linux.git
+
+    curpfx = cur >> 16 # X.Y
+    [ ver[3,7,3], ver[3,4,26], ver[3,2,37], ver[3,0,59] ].each do |minver|
+      minpfx = minver >> 16
+      return false if minpfx == curpfx && cur >= minver
+    end
+
+    curpfx = cur >> 8 # X.Y.Z
+    [ ver[3,5,7,3], ver[2,6,34,15], ver[2,6,32,61] ].each do |minver|
+      minpfx = minver >> 8
+      return false if minpfx == curpfx && cur >= minver
+    end
+    true
+  end
+
+  if epoll_ctl_mod_buggy?(Etc.uname)
+    # slow and safe for systems missing the necessary memory barrier
+    def epoll_ctl_mod(io, flag)
+      epoll_ctl(Epoll::CTL_DEL, io, 0)
+      epoll_ctl(Epoll::CTL_ADD, io, flag)
+    end
+  else
+    def epoll_ctl_mod(io, flag)
+      epoll_ctl(Epoll::CTL_MOD, io, flag)
+    end
+  end
 end
diff --git a/test/test_queue.rb b/test/test_queue.rb
new file mode 100644
index 0000000..32b8b76
--- /dev/null
+++ b/test/test_queue.rb
@@ -0,0 +1,25 @@
+# Copyright (C) 2014, all contributors (see git://yhbt.net/yahns.git history)
+# License: GPLv3 or later (https://www.gnu.org/licenses/gpl-3.0.txt)
+require_relative 'helper'
+
+class TestQueue < Testcase
+  def test_ep_buggy
+    uname = Etc.uname
+
+    if uname[:sysname] == "Linux"
+      %w(2.4.19 3.7.2 3.4.25 3.2.36 3.0.58 2.6.0 2.6.32 2.6.32.60).each do |v|
+        uname[:release] = v
+        assert Yahns::Queue.epoll_ctl_mod_buggy?(uname), "#{v} is buggy"
+      end
+
+      %w(3.8 3.8.1 3.16.2 4.0 2.6.32.61 3.5.7.3 3.7.3).each do |v|
+        uname[:release] = v
+        refute Yahns::Queue.epoll_ctl_mod_buggy?(uname), "#{v} is not buggy"
+      end
+    end
+
+    uname[:sysname] = "Hurd"
+    uname[:release] = "0.1.0"
+    refute Yahns::Queue.epoll_ctl_mod_buggy?(uname), "Hurd is never buggy :)"
+  end if Yahns::Queue.respond_to?(:epoll_ctl_mod_buggy?)
+end
-- 
EW
