https://bugzilla.wikimedia.org/show_bug.cgi?id=72113

Antoine "hashar" Musso (WMF) <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Highest                     |Normal
             Status|ASSIGNED                    |NEW
            Summary|Jenkins: Zuul queue stuck   |Zuul prepareRef does not
                   |indefinitely if dependent   |handle failure to connect
                   |pipeline change has merge   |to Gearman
                   |conflict?                   |
           Severity|critical                    |normal

--- Comment #1 from Antoine "hashar" Musso (WMF) <[email protected]> ---
Looking at debug.log.2014-10-15


23:21:17
- receives a code review +2 event for change 166678,3
- triggers a merger:merge function

23:21:47 (30 seconds later):

 ERROR gear.Client.unknown:
   Connection <gear.Connection 0x1ca7f90 host: 208.80.154.135 port: 4730>
   timed out waiting for a response to a submit job request:
      <gear.Job 0x7f1809f1dd50
          handle: None
          name: merger:merge
          unique: d928090c54474fa59b51d1661263750b>
 NoConnectedServersError: No connected Gearman servers

Zuul forks to spawn a Gearman server, and both processes communicate over TCP
port 4730 via the host's public IP.

The Zuul code that handles the creation of the reference
(zuul.scheduler.BasePipelineManager.prepareRef()) does not handle exceptions
raised when submitting the job to Gearman, so the exception propagates up to
the main loop and leaves the enqueued change in an inconsistent state.


I am not familiar with that part of the code.  I guess exceptions should be
caught and some internal status (there is a `ready` variable) be set
accordingly.
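
A minimal sketch of that idea, not the actual Zuul code: the class and
function names below (QueueItem, prepare_ref, submit_merge_job, merge_failed)
are hypothetical, the exception class is a local stand-in for the error seen
in the log, and `ready` is assumed to mean "the reference has been prepared".
The point is only that the submission error is caught and recorded on the
item instead of bubbling up to the main loop.

```python
class NoConnectedServersError(Exception):
    """Local stand-in for the Gearman client error shown in the log."""


class QueueItem:
    """Hypothetical, simplified stand-in for an enqueued change."""

    def __init__(self, change):
        self.change = change
        self.ready = False         # assumed meaning: ref has been prepared
        self.merge_failed = False  # hypothetical flag recording the failure


def prepare_ref(item, submit_merge_job):
    """Submit the merge job and record a failure instead of raising.

    submit_merge_job is any callable taking the change; it stands in for
    whatever submits the merger:merge job to Gearman.
    """
    try:
        submit_merge_job(item.change)
    except NoConnectedServersError:
        # Leave the item in a well-defined state so the caller can
        # retry or report it, rather than crashing the main loop.
        item.ready = False
        item.merge_failed = True
        return False
    # Simplification: treat a successful submission as "ready".
    item.ready = True
    item.merge_failed = False
    return True
```

With something along these lines the caller could decide to retry or dequeue
a change whose merge_failed flag is set, rather than leaving it stuck.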


To unblock such an issue, either:

A) potentially a new patchset could be sent (that would remove the previous one
from the Zuul queues) then +2ed again to reenter the gate-and-submit pipeline

B) use the Zuul client to move the change ahead in the queue (an action known
as 'promote'), which reorders the changes in the pipeline and retriggers the
merge jobs, i.e.:

 zuul promote --pipeline gate-and-submit --changes 166678,3

See doc at http://ci.openstack.org/zuul/client.html#usage


Adjusting priority, we can't have everything highest/critical.
