"Lin Jen-Shin (godfat)" <god...@godfat.org> wrote:
> On Sat, May 9, 2015 at 1:03 AM, Eric Wong <e...@80x24.org> wrote:
> > It's unfortunately difficult to detect thread death from ruby (no
> > SIGCHLD handler unlike for processes) besides polling Thread#join
> >
> > We had this issue in ruby-core a few years back, but apparently
> > it was forgotten/ignored by matz.  Care to chime in?
> > https://bugs.ruby-lang.org/issues/6647
> 
> I just sent a few characters, hope that would speed up the process.

Thanks for reminding us of this, care to examine/fix some of the MRI
test failures in the patch I posted to MRI? :)

> >> I am aware that yahns is *extremely sensitive to fatal bugs in the
> >> applications it hosts*, so I am just curious.
> >>
> >> For reference, Puma would immediately close the socket without
> >> sending anything, and Unicorn would log the error backtrace and
> >> kill the worker (If I read it correctly).

Actually, unicorn doesn't do that explicitly, it's standard Ruby
behavior for the main thread.

> >> In this case, Unicorn helped me figure out what's happened.
> >
> > yahns can probably rescue Exception (or Object(!) like puma)
> > and then log + abort the entire process.
> 
> I think rescuing Object is misleading. AFAIK, we cannot raise
> an instance which is not a kind of Exception.

I guess, there's some internal non-object interrupts in MRI for threads
(eKillSignal, eTerminateSignal) but I don't think those get exposed to
Ruby-land...

> I learned that rescuing Exception is a bad idea because like
> signal handling and some other stuffs are also using Exception
> to communicate, and of course we won't want to interfere.

Right.

> However for a worker thread, I guess that might be ok?

Maybe limiting it to the common types {Standard,Load,Syntax}Error
is sufficient.

Below, I'm choosing to both leave the socket open and keep the worker
running to slow down a potentially malicious client if this happens and
to hopefully prevent an evil client from taking others down with it.

The process may be in bad state from Load/SyntaxErrors anyways with
partially loaded code, though.

yahns cannot be made error-tolerant when given buggy code, but it should
at least allow users to find problems since the Ruby default behavior
sucks right now:

diff --git a/lib/yahns/queue_epoll.rb b/lib/yahns/queue_epoll.rb
index 4f3289e..2875920 100644
--- a/lib/yahns/queue_epoll.rb
+++ b/lib/yahns/queue_epoll.rb
@@ -64,7 +64,7 @@ class Yahns::Queue < SleepyPenguin::Epoll::IO # :nodoc:
             raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
           end
         end
-      rescue => e
+      rescue StandardError, LoadError, SyntaxError => e
         break if closed? # can still happen due to shutdown_timeout
         Yahns::Log.exception(logger, 'queue loop', e)
       end while true
diff --git a/lib/yahns/queue_kqueue.rb b/lib/yahns/queue_kqueue.rb
index 4176f7a..33f5f8b 100644
--- a/lib/yahns/queue_kqueue.rb
+++ b/lib/yahns/queue_kqueue.rb
@@ -72,7 +72,7 @@ class Yahns::Queue < SleepyPenguin::Kqueue::IO # :nodoc:
             raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
           end
         end
-      rescue => e
+      rescue StandardError, LoadError, SyntaxError => e
         break if closed? # can still happen due to shutdown_timeout
         Yahns::Log.exception(logger, 'queue loop', e)
       end while true

Thoughts?

Reply via email to