On Tue, 2015-09-22 at 16:12 +0100, Ian Jackson wrote:
> When a build job finishes, the same flight may well want to do a
> subsequent build that depended on the first.  When this happens, we
> have a race:
> 
> On the one hand, we have the flight: after sg-run-job exits,
> sg-execute-flight needs to double-check the job status, and search the
> flight for more jobs to run; it will spawn ts-hosts-allocate-Executive
> for the new job, which needs to get its head together, parse its
> arguments, become a client of the queue daemon, and ask to be put in
> the queue.
> 
> On the other hand, we have the planning system: currently, as soon as
> sg-run-job exits, the connection to the ownerdaemon closes.  The
> ownerdaemon tells the queue daemon, and the planning queue is
> restarted.  It might even happen that coincidentally the planning
> queue is about to start.
> 
> If the planning system wins the race, another job will pick up the
> newly-freed resource.  Often this will mean unsharing the build host,
> which is very wasteful if the releasing flight hasn't finished its
> builds for that architecture: it means that the next build job needs
> to regroove a host for builds.
> 
> Add a bodge to try to make the race go the other way: after a build
> job completes successfully, do not give up the share for a further 90
> seconds.  (We have to use setsid because sg-execute-flight kills the
> process group to clean up stray processes, which this sleep definitely
> is.)
> 
> A better solution would be to move the wait-for-referenced-job logic
> from sg-execute-flight to ts-hosts-allocate-*.  But that would be much
> more complicated.
> 
> Signed-off-by: Ian Jackson <ian.jack...@eu.citrix.com>

Acked-by: Ian Campbell <ian.campb...@citrix.com>
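
For readers unfamiliar with the setsid trick described in the quoted
message, here is a minimal sketch of the idea in Python. It is not the
code from the patch: hold_share and is_detached are hypothetical names
invented for this illustration, and the placeholder child only sleeps
rather than holding a real osstest resource. It shows a lingering
process that is started in its own session, so that a kill aimed at the
parent's process group does not reach it.

```python
import os
import subprocess
import sys


def hold_share(seconds: float) -> subprocess.Popen:
    """Start a detached placeholder process that lingers for `seconds`.

    Hypothetical helper for illustration only. start_new_session=True
    makes the child call setsid() before exec, so it leaves the caller's
    process group and is not hit by a kill sent to that group.
    """
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds!r})"],
        start_new_session=True,
    )


def is_detached(proc: subprocess.Popen) -> bool:
    """Return True if `proc` is in a different process group from us."""
    return os.getpgid(proc.pid) != os.getpgid(0)
```

In this sketch, a wrapper would call hold_share(90) after a successful
build and then exit without waiting for the child. The assumption is
that whatever the lingering process keeps open (in the real system, the
connection that holds the share) stays open until the sleep ends.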
