On Sat, May 9, 2015 at 9:03 AM, Eric Wong <[email protected]> wrote:
> "Lin Jen-Shin (godfat)" <[email protected]> wrote:
>> On Sat, May 9, 2015 at 1:03 AM, Eric Wong <[email protected]> wrote:
>> > It's unfortunately difficult to detect thread death from ruby (no
>> > SIGCHLD handler unlike for processes) besides polling Thread#join
>> >
>> > We had this issue in ruby-core a few years back, but apparently
>> > it was forgotten/ignored by matz. Care to chime in?
>> > https://bugs.ruby-lang.org/issues/6647
>>
>> I just sent a few characters, hope that would speed up the process.
>
> Thanks for reminding us of this, care to examine/fix some of the MRI
> test failures in the patch I posted to MRI? :)
Haha, cool. Probably not now, though. I took a quick look and, ignoring
the warnings, I guess some of the tests capture stdout or stderr and
assert on the messages. Combined with abort_on_exception and using join
to peek at the exception, the patch probably breaks those tests.
So I assume most of the failures are bugs in the tests, not in MRI itself.
Testing error messages is hard :(
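To make the "polling Thread#join" part concrete, here is a minimal sketch
of how join with a timeout can be used to notice a dead thread. The
poll_thread helper is a name I made up for illustration; it is not from
MRI's test suite or from yahns.

```ruby
# Toy illustration only; poll_thread is a hypothetical helper.
# Silence the per-thread error report on Rubies that have this setting.
Thread.report_on_exception = false if Thread.respond_to?(:report_on_exception=)

# Returns :running, :finished, or the exception that killed the thread.
def poll_thread(thread, timeout = 0.1)
  thread.join(timeout) ? :finished : :running
rescue StandardError => e
  e # Thread#join re-raises the thread's exception in the caller
end

ok   = Thread.new { :done }
dead = Thread.new { raise ArgumentError, "boom" }
busy = Thread.new { sleep }

# Wait until the two short-lived threads have actually ended.
sleep 0.01 while ok.alive? || dead.alive?

p poll_thread(ok)
p poll_thread(dead)
p poll_thread(busy, 0.01)
busy.kill
```

Nothing tells the parent that "dead" is gone until somebody calls join,
which is the part that makes failures so easy to miss.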
>> I think rescuing Object is misleading. AFAIK, we cannot raise
>> an instance which is not a kind of Exception.
>
> I guess, there's some internal non-object interrupts in MRI for threads
> (eKillSignal, eTerminateSignal) but I don't think those get exposed to
> Ruby-land...
Got it, makes sense.
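For the record, a tiny sketch of what I meant by not being able to raise
a non-Exception from Ruby code. The try_raise helper is only there for
illustration.

```ruby
# Hypothetical helper: raise whatever is passed in and return the
# exception object that actually resulted.
def try_raise(obj)
  raise obj
rescue Exception => e
  e
end

err = try_raise(Object.new)
puts err.class    # TypeError, since a plain Object is not an exception
puts err.message

# A String is accepted, but it gets wrapped in a RuntimeError.
puts try_raise("oops").class
```

So whatever reaches a rescue clause in Ruby-land should always be a kind
of Exception.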
>> However for a worker thread, I guess that might be ok?
>
> Maybe limiting it to the common types {Standard,Load,Syntax}Error
> is sufficient.
Those are what I can think of right now, too.
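A quick sketch of the difference between a bare rescue and the wider
list, under the assumption that we only care about these few exception
types. The method names (bare_rescue, wide_rescue, outcome) and the
feature name passed to require are made up for the example.

```ruby
# Each method returns the class of the exception it caught.
def bare_rescue
  yield
rescue => e # same as `rescue StandardError => e`
  e.class
end

def wide_rescue
  yield
rescue StandardError, LoadError, SyntaxError => e
  e.class
end

# Reports :escaped when the exception passed straight through the handler.
def outcome(style, &blk)
  send(style, &blk)
rescue Exception
  :escaped
end

cases = {
  "RuntimeError" => -> { raise "oops" },
  "LoadError"    => -> { require "no_such_feature_for_this_sketch" },
  "SyntaxError"  => -> { eval("1 +") },
  "Interrupt"    => -> { raise Interrupt },
}

cases.each do |name, blk|
  puts "#{name}: bare=#{outcome(:bare_rescue, &blk)} wide=#{outcome(:wide_rescue, &blk)}"
end
```

LoadError and SyntaxError are ScriptErrors rather than StandardErrors,
so the bare rescue never sees them, while Interrupt still gets past both.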
> Below, I'm choosing to both leave the socket open and keep the worker
> running to slow down a potentially malicious client if this happens and
> to hopefully prevent an evil client from taking others down with it.
I am curious how this could slow down a malicious client. Is it because
the client might be fooled into thinking the worker is still working?
> The process may be in bad state from Load/SyntaxErrors anyways with
> partially loaded code, though.
>
> yahns cannot be made error-tolerant when given buggy code, but it should
> at least allow users to find problems since the Ruby default behavior
> sucks right now:
>
> diff --git a/lib/yahns/queue_epoll.rb b/lib/yahns/queue_epoll.rb
> index 4f3289e..2875920 100644
> --- a/lib/yahns/queue_epoll.rb
> +++ b/lib/yahns/queue_epoll.rb
> @@ -64,7 +64,7 @@ class Yahns::Queue < SleepyPenguin::Epoll::IO # :nodoc:
> raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
> end
> end
> - rescue => e
> + rescue StandardError, LoadError, SyntaxError => e
> break if closed? # can still happen due to shutdown_timeout
> Yahns::Log.exception(logger, 'queue loop', e)
> end while true
> diff --git a/lib/yahns/queue_kqueue.rb b/lib/yahns/queue_kqueue.rb
> index 4176f7a..33f5f8b 100644
> --- a/lib/yahns/queue_kqueue.rb
> +++ b/lib/yahns/queue_kqueue.rb
> @@ -72,7 +72,7 @@ class Yahns::Queue < SleepyPenguin::Kqueue::IO # :nodoc:
> raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
> end
> end
> - rescue => e
> + rescue StandardError, LoadError, SyntaxError => e
> break if closed? # can still happen due to shutdown_timeout
> Yahns::Log.exception(logger, 'queue loop', e)
> end while true
>
> Thoughts?
A backtrace that tells me what's happening is quite enough for me for now.
Still curious, though: could this worker do anything else if this happened?
I am guessing that if the application no longer does anything, then this worker
would not do anything either. Or might the socket eventually time out?
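To check my own understanding of the loop shape in the patch, here is a
toy model of it. This is not yahns code: worker_loop, the :shutdown
marker and the lambda jobs are all made up, and a plain Queue stands in
for epoll/kqueue. It only shows that a loop which logs and continues can
still pick up later work after one job blows up.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for the queue loop; every name here is mine.
def worker_loop(queue, logger)
  begin
    job = queue.pop
    break if job == :shutdown
    job.call
  rescue StandardError, LoadError, SyntaxError => e
    # Log and keep going instead of letting the thread die silently.
    logger.error("queue loop: #{e.message} (#{e.class})")
  end while true
end

log     = StringIO.new
queue   = Queue.new
results = []

queue << -> { results << 1 }
queue << -> { raise "buggy app" }
queue << -> { results << 2 }
queue << :shutdown

Thread.new { worker_loop(queue, Logger.new(log)) }.join

p results
puts log.string
```

In this toy the worker survives the bad job and runs the next one. What
happens to the client whose request raised is a separate matter that the
sketch does not model.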