On Wed, 2004-08-04 at 19:56, Mark Hammond wrote:
> On Windows however, "runzope" is executed as a *child* of the service. If
> the child fatally fails, the service itself is still reporting success, and
> hopelessly attempting a restart. The service needs to know if the child
> fatally failed. This doesn't apply on Unix as "runzope" *is* the 'service',
> not the child of the service.
FWIW, the equivalent of the service manager code under UNIX is "zopectl"
rather than runzope. zopectl has its own failure detection and backoff
algorithm that's a bit more complex than the Windows service code of the
Actually that's not entirely true: "zopectl" is a client that attempts
to communicate with a separate daemonizing process via a UNIX domain
socket. The daemonizing process is really the parent of the Zope
process (it just invokes runzope).
The majority of code for this is in lib/python/zdaemon/zdrun.py. The
mainloop that implements the backoff algorithm is in the "runforever"
method of the Daemonizer class in that file, and the thing that decides
not to restart it if it exits with a "known" error code is in the
"reportstatus" method of the same class. You probably care about none
of this, but it's there if you do. ;-)
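To give a rough idea of the shape of that logic, here is a minimal sketch. It is not the zdrun.py code: the function name, the set of "known" fatal exit codes, the doubling delay and the restart cap are all made up for illustration.

```python
import subprocess
import time


def run_forever(cmd, fatal_codes=frozenset({2}), max_backoff=10,
                max_restarts=None, run=subprocess.call, sleep=time.sleep):
    """Restart cmd with a growing delay until it exits cleanly or fatally.

    Hypothetical sketch, not the zdaemon implementation. fatal_codes is an
    assumed set of "known" exit codes that mean "do not restart".
    """
    delay = 1
    restarts = 0
    while True:
        status = run(cmd)  # blocks until the child exits
        if status == 0 or status in fatal_codes:
            # Clean exit or a known fatal error: stop restarting.
            return status
        if max_restarts is not None and restarts >= max_restarts:
            # Give up after the configured number of restarts.
            return status
        sleep(delay)
        # Double the delay on each failure, capped at max_backoff seconds.
        delay = min(delay * 2, max_backoff)
        restarts += 1
```

The run and sleep parameters are there only so the loop can be exercised without spawning real processes or actually waiting.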
> By adding a layer around run.py, I believe we could arrange for these fatal
> errors to be handled with a special return code. Alternatively, if Zope
> itself never returns an error code of 1 (one), then we could use that -
> Python itself returns this for unhandled exceptions. That seems dangerous
When you say "these errors" above, do you mean any unhandled exception?
If so (and any nonzero exit code indicated a startup failure), would we
really need cooperation from run.py for this? It seems like it could be
done entirely inside of SvcDoRun.
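What I have in mind is roughly the following, written as a platform-neutral sketch rather than real service code. The helper name and the grace period are assumptions on my part, and nothing here uses the win32 service APIs: the parent starts the child, waits a short while, and treats any exit inside that window as a startup failure.

```python
import subprocess


def start_child(cmd, grace_seconds=5.0):
    """Start cmd and watch it for grace_seconds.

    Returns (proc, None) if the child is still running after the grace
    period, or (None, exit_code) if it exited within it. Hypothetical
    helper for illustration; the grace period is an assumed value.
    """
    proc = subprocess.Popen(cmd)
    try:
        status = proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still alive after the grace period: treat startup as successful.
        return proc, None
    # The child exited during startup; the caller decides what a given
    # code (for example, any nonzero value) should mean.
    return None, status
```

The parent would report a successful start only when the first element of the result is not None.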
OTOH, if the "real" problem is that you can't stop the service from
fruitlessly restarting itself in the face of an insoluble error because
of the blocking sleep, it seems like you already solved that. Would it
be a reasonable strategy to leave the backoff stupidity as-is if you were
able to stop the service from flailing via the service manager CP applet
and if it didn't report successful startup until the child actually
> Can you offer some advice here?
> * Is an exit-code of 1 suitable for a fatal error? If so, this requires no
> changes to the child process. However, I assume it is not suitable.
> * Is a special exit code, generated by a wrapper to run.py (eg, run_svc.py)
> suitable? If so, what value do you recommend?
> * Should some Win32 specific, robust IPC mechanism be investigated? This
> would cut deeper into a run.py wrapper, and obviously is not a general
I guess I'll need to wait for the confusion evidenced by my last
question to clear up before I'd be able to venture an answer to that.
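Just so we're talking about the same thing, here is how I read the second option, as a rough sketch only. The exit code value and the function name are invented for illustration, and treating every unhandled exception as "fatal" is an assumption, not something we've settled.

```python
import traceback

# Hypothetical value; nothing in this thread has settled on a number.
FATAL_STARTUP_EXIT_CODE = 99


def run_wrapped(main, fatal_code=FATAL_STARTUP_EXIT_CODE):
    """Call main() and map any unhandled exception to fatal_code.

    SystemExit is left alone so that an explicit sys.exit() in the wrapped
    program keeps its own exit code.
    """
    try:
        main()
    except Exception:
        # Log the traceback, then signal "fatal" to the parent process.
        traceback.print_exc()
        return fatal_code
    return 0
```

A wrapper script along the lines of run_svc.py would then end with something like sys.exit(run_wrapped(main)), where main is whatever entry point run.py exposes.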
> > On another note, I'd really prefer to work out a general facility that
> > can be used with any Python program, including both Zope 2 and Zope 3.
> The problem at the moment is that our facilities are *too* general - ie,
> without some coordination between the parent and child, the parent must
> guess. The simplest coordination does seem to be process exit code, but
> that seems fragile. But whatever coordination is chosen, any Python program
> that was willing to play the coordinate game could use it. The "simpler"
> this game to play, the more fragile the system is (ie, just using exit codes
> is simple, but fragile; using other IPC mechanisms could be made robust, but
> is not simple - especially not in a platform independent way.)
> Zope-Dev maillist - [EMAIL PROTECTED]
> ** No cross posts or HTML encoding! **
> (Related lists -
> http://mail.zope.org/mailman/listinfo/zope )