hashar added a comment.

  Next fun, digging into Jenkins!
  
  I looked at 
https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php72-docker/64820/
 which got affected by the same issue (all times are UTC):
  
  counterexample
    18:08:34 [wmf-quibble-selenium-php72-docker] $ /bin/bash 
/tmp/jenkins5993728317798275711.sh
    // Quibble output
    18:34:14 INFO:backend.MySQL:Terminating MySQL
    18:35:12 [wmf-quibble-selenium-php72-docker] $ /bin/bash -xe 
/tmp/jenkins5970126818379489927.sh
  
  It ran on integration-agent-docker-1009 for which I took a couple thread dump 
while the job was waiting using 
https://integration.wikimedia.org/ci/computer/integration-agent-docker-1009/systemInfo
  
  A couple thread dumps for the agent:
  
  F32409486: A.txt <https://phabricator.wikimedia.org/F32409486>
  
  F32409489: B.txt <https://phabricator.wikimedia.org/F32409489>
  
  See 
https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMTAvMjAvLS1CLnR4dC0tMjAtMzQtMTM7Oy0tQS50eHQtLTIwLTM0LTEz
  
  Not much standing out? There is a stdout copier thread for the script that 
runs Quibble (/tmp/jenkins5993728317798275711.sh). So I definitely took the 
thread dump while it was blocked and there is some Groovy scripts running.
  
  I also took two thread dumps of the Jenkins server:
  
  F32409499: server1.txt <https://phabricator.wikimedia.org/F32409499>
  
  F32409500: server2.txt <https://phabricator.wikimedia.org/F32409500>
  
  See also: 
https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMTAvMjAvLS1zZXJ2ZXIyLnR4dC0tMjAtMjctNTI7Oy0tc2VydmVyMS50eHQtLTIwLTI3LTUy&;
  
  The later second dump has a thread:
  
    line=5
    "Running 
CpsFlowExecution[Owner[maintenance-sample-workspace-sizes/123736:maintenance-sample-workspace-sizes
 #123736]] / waiting for integration-agent-docker-1009 id=6114779" daemon 
prio=5 TIMED_WAITING
        java.lang.Object.wait(Native Method)
        hudson.remoting.Request.call(Request.java:176)
        hudson.remoting.Channel.call(Channel.java:1000)
        
hudson.util.RemotingDiagnostics.executeGroovy(RemotingDiagnostics.java:111)
        hudson.util.RemotingDiagnostics$executeGroovy.call(Unknown Source)
        
org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
        
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
        
com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
        
com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
        
com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
        
com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
        sun.reflect.GeneratedMethodAccessor268.invoke(Unknown Source)
        
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:498)
        
com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
        
com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:89)
        
com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
        
com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixName(FunctionCallBlock.java:78)
        sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
        
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:498)
        
com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
        com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
        com.cloudbees.groovy.cps.Next.step(Next.java:83)
        com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
        com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
        
org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
        
org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
        com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
        
org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
        
org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:400)
        
org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
        
org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:312)
        
org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:276)
        
org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
        java.util.concurrent.FutureTask.run(FutureTask.java:266)
        
hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:136)
        
jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
        
jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
        java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        java.util.concurrent.FutureTask.run(FutureTask.java:266)
        
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        java.lang.Thread.run(Thread.java:748)
  
  That leads me to 
https://integration.wikimedia.org/ci/job/maintenance-sample-workspace-sizes/123736/console
 which executes a Groovy script hitting all agents in order to look at the jobs 
workspace disk usage and report back to statsd.
  
  According to the console output it started at 18:34:00 and finished after 
18:35:04. Looking at the build XML files, the success event was at 18:35:07.  
Could it be that the groovy script is holding some lock that prevents the other 
job from finishing the shell task?
  
  I looked at the job in the task description, but it got hold for 4 minutes so 
I am not sure how that matches.

TASK DETAIL
  https://phabricator.wikimedia.org/T265615

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hashar
Cc: hashar, awight, kostajh, Lucas_Werkmeister_WMDE, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, darthmon_wmde, Kent7301, joker88john, CucyNoiD, 
Nandana, NebulousIris, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, 
Bsandipan, GoranSMilovanovic, QZanden, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Liudvikas, Scott_WUaS, thcipriani, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331, Jay8g
_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs

Reply via email to