[ https://issues.apache.org/jira/browse/YARN-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046374#comment-15046374 ]
Rohith Sharma K S commented on YARN-4424:
-----------------------------------------

Sorry if I was not clear in my previous comment. Correct me if my understanding is wrong.

From the 3 thread traces, I see that all 3 threads are BLOCKED waiting to acquire the RMAppImpl lock. {{Thread-1}} and {{Thread-2}} are waiting for the *ReadLock* and {{Thread-3}} is waiting for the *WriteLock*.

Thread-1
{quote}
Thread 53785: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=834 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(int) @bci=83, line=964 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(int) @bci=10, line=1282 (Interpreted frame)
 - java.util.concurrent.locks.*ReentrantReadWriteLock$ReadLock.lock()* @bci=5, line=731 (Interpreted frame)
 - org.apache.hadoop.yarn.server.resourcemanager.rmapp.*RMAppImpl.getFinalApplicationStatus()* @bci=4, line=478 (Interpreted frame)
 - org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptFinished
{quote}

Thread-2
{quote}
Thread 25723: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=834 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(int) @bci=83, line=964 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(int) @bci=10, line=1282 (Compiled frame)
 - java.util.concurrent.locks.*ReentrantReadWriteLock$ReadLock.lock()* @bci=5, line=731 (Compiled frame)
 - org.apache.hadoop.yarn.server.resourcemanager.rmapp.*RMAppImpl.createAndGetApplicationReport*(java.lang.String, boolean) @bci=4, line=598 (Interpreted frame)
{quote}

Thread-3
{quote}
Thread 53696: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=834 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=867 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1197 (Interpreted frame)
 - java.util.concurrent.locks.*ReentrantReadWriteLock$WriteLock.lock()* @bci=5, line=945 (Interpreted frame)
 - org.apache.hadoop.yarn.server.resourcemanager.rmapp.*RMAppImpl.pullRMNodeUpdates*(java.util.Collection) @bci=4, line=584 (Interpreted frame)
{quote}

Thread-2 is NOT holding the read lock of RMAppImpl; it is itself blocked waiting for the read lock. So my question is: which thread is actually holding the RMAppImpl lock?
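
To illustrate why the three traces alone may not answer this, here is a minimal standalone sketch (not RMAppImpl code). It assumes OpenJDK's default non-fair {{ReentrantReadWriteLock}}, where, as far as I understand the implementation, a new reader queues behind a writer that is already waiting. The class name {{QueuedWriterDemo}}, the thread names, and the 200 ms timeout are hypothetical. One thread holds the read lock, a writer parks behind it, and a later reader then fails to get the read lock even though no write lock is held, which is the same shape as Thread-1/Thread-2 waiting behind Thread-3.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class QueuedWriterDemo {

    /**
     * Returns true if a new reader could NOT get the read lock while another
     * thread held the read lock and a writer was already queued.
     */
    public static boolean newReaderBlocksBehindQueuedWriter() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // non-fair (default)
        CountDownLatch holderHasLock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Hypothetical "hidden" holder: takes the read lock and keeps it.
        Thread holder = new Thread(() -> {
            lock.readLock().lock();
            try {
                holderHasLock.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "holder");

        // Writer: parks in the lock's wait queue (like Thread-3 in the traces).
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            lock.writeLock().unlock();
        }, "writer");

        try {
            holder.start();
            holderHasLock.await();
            writer.start();

            // Wait until the writer is queued and parked.
            while (!lock.hasQueuedThread(writer)
                    || writer.getState() != Thread.State.WAITING) {
                Thread.sleep(10);
            }

            // New reader (like Thread-1/Thread-2): timed attempt on the read lock.
            boolean acquired = lock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.readLock().unlock();
            }

            release.countDown();
            holder.join();
            writer.join();
            return !acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("new reader blocked: " + newReaderBlocksBehindQueuedWriter());
    }
}
```

If the same thing is happening here, the holder could be a fourth thread that took the read (or write) lock of RMAppImpl and is itself stuck elsewhere, so it would not show up as waiting on this lock in any of the three traces. Note also that only a write-lock owner is recorded as the exclusive owner of the synchronizer; read-lock holders are not tracked per thread in a way a thread dump can show, so a full dump of all RM threads would be needed to find it.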

> YARN CLI command hangs
> ----------------------
>
> Key: YARN-4424
> URL: https://issues.apache.org/jira/browse/YARN-4424
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yesha Vora
> Assignee: Jian He
> Priority: Blocker
> Attachments: YARN-4424.1.patch
>
>
> {code}
> yarn@XXX:/mnt/hadoopqe$ /usr/hdp/current/hadoop-yarn-client/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
> 15/12/04 21:59:54 INFO impl.TimelineClientImpl: Timeline service address: http://XXX:8188/ws/v1/timeline/
> 15/12/04 21:59:54 INFO client.RMProxy: Connecting to ResourceManager at XXX/0.0.0.0:8050
> 15/12/04 21:59:55 INFO client.AHSProxy: Connecting to Application History server at XXX/0.0.0.0:10200
> {code}
> {code:title=RM log}
> 2015-12-04 21:59:19,744 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 237000
> 2015-12-04 22:00:50,945 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 238000
> 2015-12-04 22:02:22,416 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 239000
> 2015-12-04 22:03:53,593 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 240000
> 2015-12-04 22:05:24,856 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 241000
> 2015-12-04 22:06:56,235 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 242000
> 2015-12-04 22:08:27,510 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 243000
> 2015-12-04 22:09:58,786 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 244000
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)