On Tue, 2015-09-22 at 16:12 +0100, Ian Jackson wrote:
> When a build job finishes, the same flight may well want to do a
> subsequent build that depended on the first.  When this happens, we
> have a race:
>
> On the one hand, we have the flight: after sg-run-job exits,
> sg-execute-flight needs to double-check the job status, and search the
> flight for more jobs to run; it will spawn ts-allocate-hosts-Executive
> for the new job, which needs to get its head together, parse its
> arguments, become a client of the queue daemon, and ask to be put in
> the queue.
>
> On the other hand, we have the planning system: currently, as soon as
> sg-run-job exits, the connection to the ownerdaemon closes.  The
> ownerdaemon tells the queue daemon, and the planning queue is
> restarted.  It might even happen that coincidentally the planning
> queue is about to start.
>
> If the planning system wins the race, another job will pick up the
> newly-freed resource.  Often this will mean unsharing the build host,
> which is very wasteful if the releasing flight hasn't finished its
> builds for that architecture: it means that the next build job needs
> to regroove a host for builds.
>
> Add a bodge to try to make the race go the other way: after a build
> job completes successfully, do not give up the share for a further 90
> seconds.  (We have to use setsid because sg-execute-flight kills the
> process group to clean up stray processes, which this sleep definitely
> is.)
>
> A better solution would be to move the wait-for-referenced-job logic
> from sg-execute-flight to ts-hosts-allocate-*.  But that would be much
> more complicated.
>
> Signed-off-by: Ian Jackson <ian.jack...@eu.citrix.com>

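For anyone following along, the setsid point can be sketched roughly as
below. This is not the code from the patch, only a minimal Python
illustration under stated assumptions: the helper name
spawn_detached_hold and the keep_fds parameter are hypothetical, and
keep_fds stands in for whatever open descriptor represents the held
share. The idea is that a sleep started in its own session sits outside
the caller's process group, so a kill aimed at that group does not reach
it, and anything it inherited stays open until the sleep ends.

```python
import os
import subprocess


def spawn_detached_hold(seconds, keep_fds=()):
    """Start a sleep in its own session so a process-group kill misses it.

    keep_fds is a hypothetical stand-in for the descriptor(s) tied to the
    held share; the child keeps them open until it exits.
    """
    return subprocess.Popen(
        ["sleep", str(seconds)],
        start_new_session=True,     # child calls setsid() before exec
        pass_fds=tuple(keep_fds),   # these descriptors stay open in the child
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Calling spawn_detached_hold(90) after a successful build would mirror
the 90 second grace period described above. The caller does not wait
for the child; it is deliberately left to expire on its own.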
Acked-by: Ian Campbell <ian.campb...@citrix.com>