On Sat, May 9, 2015 at 9:03 AM, Eric Wong <e...@80x24.org> wrote:
> "Lin Jen-Shin (godfat)" <god...@godfat.org> wrote:
>> On Sat, May 9, 2015 at 1:03 AM, Eric Wong <e...@80x24.org> wrote:
>> > It's unfortunately difficult to detect thread death from ruby (no
>> > SIGCHLD handler unlike for processes) besides polling Thread#join
>> >
>> > We had this issue in ruby-core a few years back, but apparently
>> > it was forgotten/ignored by matz. Care to chime in?
>> > https://bugs.ruby-lang.org/issues/6647
>>
>> I just sent a few characters, hope that would speed up the process.
>
> Thanks for reminding us of this, care to examine/fix some of the MRI
> test failures in the patch I posted to MRI? :)

Haha, cool. Probably not now though. I just took a quick look and,
ignoring warnings, I guess some of the tests were trying to capture
stdout or stderr and assert on messages. Combined with abort_on_exception
and using join to peek at the exception, this probably breaks those
tests. So I assume most of them were bugs in the tests, not in MRI
itself. Testing error messages is hard :(

>> I think rescuing Object is misleading. AFAIK, we cannot raise
>> an instance which is not a kind of Exception.
>
> I guess, there's some internal non-object interrupts in MRI for threads
> (eKillSignal, eTerminateSignal) but I don't think those get exposed to
> Ruby-land...

Got it, makes sense.

>> However for a worker thread, I guess that might be ok?
>
> Maybe limiting it to the common types {Standard,Load,Syntax}Error
> is sufficient.

Those are the ones I can think of right now, too.

> Below, I'm choosing to both leave the socket open and keep the worker
> running to slow down a potentially malicious client if this happens and
> to hopefully prevent an evil client from taking others down with it.

I am curious how this could slow down a malicious client. Is it because
it might make them think the worker is still working?

> The process may be in bad state from Load/SyntaxErrors anyways with
> partially loaded code, though.
>
> yahns cannot be made error-tolerant when given buggy code, but it should
> at least allow users to find problems since the Ruby default behavior
> sucks right now:
>
> diff --git a/lib/yahns/queue_epoll.rb b/lib/yahns/queue_epoll.rb
> index 4f3289e..2875920 100644
> --- a/lib/yahns/queue_epoll.rb
> +++ b/lib/yahns/queue_epoll.rb
> @@ -64,7 +64,7 @@ class Yahns::Queue < SleepyPenguin::Epoll::IO # :nodoc:
>              raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
>            end
>          end
> -      rescue => e
> +      rescue StandardError, LoadError, SyntaxError => e
>          break if closed? # can still happen due to shutdown_timeout
>          Yahns::Log.exception(logger, 'queue loop', e)
>        end while true
> diff --git a/lib/yahns/queue_kqueue.rb b/lib/yahns/queue_kqueue.rb
> index 4176f7a..33f5f8b 100644
> --- a/lib/yahns/queue_kqueue.rb
> +++ b/lib/yahns/queue_kqueue.rb
> @@ -72,7 +72,7 @@ class Yahns::Queue < SleepyPenguin::Kqueue::IO # :nodoc:
>              raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
>            end
>          end
> -      rescue => e
> +      rescue StandardError, LoadError, SyntaxError => e
>          break if closed? # can still happen due to shutdown_timeout
>          Yahns::Log.exception(logger, 'queue loop', e)
>        end while true
>
> Thoughts?

A backtrace that shows what's happening is, I think, quite enough for
me for now. Still curious though: could this worker do anything else if
this happened? I am guessing that if the application no longer does
anything, then this worker would not do anything either. Or would the
socket time out eventually?
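
For reference, here is a minimal sketch of what I understand the wider rescue list to change. It is not the yahns code: run_worker, thread_error and JOBS are hypothetical names made up for this illustration. A bare `rescue => e` only catches StandardError, while LoadError and SyntaxError descend from ScriptError, so with the old clause the worker thread dies on them, and nobody notices until someone joins the thread.

```ruby
# Hypothetical stand-in for a queue loop; NOT the actual yahns code.
# Each job plays the role of one application call made by the worker.
def run_worker(jobs, log, rescue_script_errors:)
  caught = [StandardError]
  caught += [LoadError, SyntaxError] if rescue_script_errors

  Thread.new do
    # Keep the demo quiet; a dying thread would otherwise print to stderr
    # on recent Rubies (2.5 and later).
    Thread.current.report_on_exception = false
    jobs.each do |job|
      begin
        job.call
      rescue *caught => e
        # Log and keep the worker running, as the patch does.
        log << "#{e.class}: #{e.message}"
      end
    end
    :finished
  end
end

# Thread#join re-raises whatever exception killed the thread, so joining
# is how we "peek" at it. Returns nil if the thread ended normally.
def thread_error(thread)
  thread.join
  nil
rescue StandardError, ScriptError => e
  e
end

JOBS = [
  -> { raise ArgumentError, "bad input" },       # StandardError subclass
  -> { require "no_such_file_for_this_demo" },   # raises LoadError
  -> { eval("1 +") },                            # raises SyntaxError
  -> { :ok },
].freeze
```

With `rescue_script_errors: true` the thread works through all four jobs and logs three errors. With `false` it logs only the ArgumentError and then dies on the LoadError, which thread_error returns once the thread is joined.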