[
https://issues.apache.org/jira/browse/YARN-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024901#comment-16024901
]
Jason Lowe commented on YARN-6647:
----------------------------------
Sample test output showing the mishandling of InterruptedException and the
resulting forced exit of the RM. In this case the tests error out because the
JVM exits without notifying the test framework.
{noformat}
2017-05-25 10:23:45,835 INFO [Thread-50] zookeeper.JUnit4ZKTestRunner
(JUnit4ZKTestRunner.java:evaluate(78)) - FINISHED TEST METHOD
testKillAppWhenFailoverHappensAtNewState
2017-05-25 10:23:45,835 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: ResourceManager entered state
STOPPED
2017-05-25 10:23:45,835 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) - ResourceManager: stopping services,
size=3
2017-05-25 10:23:45,835 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #2: Service Dispatcher in
state Dispatcher: STARTED
2017-05-25 10:23:45,835 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: Dispatcher entered state
STOPPED
2017-05-25 10:23:45,835 INFO
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@233aac83] util.JvmPauseMonitor
(JvmPauseMonitor.java:run(188)) - Starting JVM pause monitor
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #1: Service
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter in
state
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter:
STARTED
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter
entered state STOPPED
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) -
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter:
stopping services, size=0
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #0: Service
org.apache.hadoop.yarn.server.resourcemanager.AdminService in state
org.apache.hadoop.yarn.server.resourcemanager.AdminService: STARTED
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.AdminService entered state STOPPED
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) -
org.apache.hadoop.yarn.server.resourcemanager.AdminService: stopping services,
size=0
2017-05-25 10:23:45,836 INFO [main] resourcemanager.ResourceManager
(ResourceManager.java:transitionToStandby(1191)) - Already in standby state
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: ResourceManager entered state
STOPPED
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) - ResourceManager: stopping services,
size=3
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #2: Service
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter in
state
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter:
STARTED
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter
entered state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) -
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter:
stopping services, size=0
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #1: Service
org.apache.hadoop.yarn.server.resourcemanager.AdminService in state
org.apache.hadoop.yarn.server.resourcemanager.AdminService: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.AdminService entered state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) -
org.apache.hadoop.yarn.server.resourcemanager.AdminService: stopping services,
size=0
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #0: Service Dispatcher in
state Dispatcher: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: Dispatcher entered state
STOPPED
2017-05-25 10:23:45,837 INFO [main] resourcemanager.ResourceManager
(ResourceManager.java:transitionToStandby(1195)) - Transitioning to standby
state
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: RMActiveServices entered
state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:serviceStop(129)) - RMActiveServices: stopping services,
size=14
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #13: Service
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher
in state
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher:
STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher
entered state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #12: Service
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService in state
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService entered state
STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #11: Service
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService in state
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService entered
state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #10: Service
org.apache.hadoop.util.JvmPauseMonitor in state
org.apache.hadoop.util.JvmPauseMonitor: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.util.JvmPauseMonitor entered state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #9: Service
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService in state
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService entered
state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #8: Service
NMLivelinessMonitor in state NMLivelinessMonitor: STARTED
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: NMLivelinessMonitor entered
state STOPPED
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #7: Service
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
in state
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
STARTED
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
entered state STOPPED
2017-05-25 10:23:45,838 INFO [Ping Checker] util.AbstractLivelinessMonitor
(AbstractLivelinessMonitor.java:run(156)) - NMLivelinessMonitor thread
interrupted
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #6: Service
org.apache.hadoop.yarn.server.resourcemanager.NodesListManager in state
org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: STARTED
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.NodesListManager entered state
STOPPED
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #5: Service
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager in state
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: STARTED
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager entered state STOPPED
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #4: Service
org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor
in state
org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor:
STARTED
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor
entered state STOPPED
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #3: Service
AMLivelinessMonitor in state AMLivelinessMonitor: STARTED
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: AMLivelinessMonitor entered
state STOPPED
2017-05-25 10:23:45,838 INFO [Ping Checker] util.AbstractLivelinessMonitor
(AbstractLivelinessMonitor.java:run(156)) -
org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor
thread interrupted
2017-05-25 10:23:45,838 INFO [Ping Checker] util.AbstractLivelinessMonitor
(AbstractLivelinessMonitor.java:run(156)) - AMLivelinessMonitor thread
interrupted
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #2: Service
AMLivelinessMonitor in state AMLivelinessMonitor: STARTED
2017-05-25 10:23:45,838 DEBUG [Thread-50-SendThread(127.0.0.1:24578)]
zookeeper.ClientCnxn (ClientCnxn.java:readResponse(843)) - Reading reply
sessionid:0x15c4034de420001, packet:: clientPath:null serverPath:null
finished:false header:: 39,3 replyHeader:: 39,28,-101 request::
'/rmstore/ZKRMStateRoot/RMDTSecretManagerRoot/RMDTMasterKeysRoot/DelegationKey_4,F
response::
2017-05-25 10:23:45,838 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service: AMLivelinessMonitor entered
state STOPPED
2017-05-25 10:23:45,838 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #1: Service
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer
in state
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer:
STARTED
2017-05-25 10:23:45,839 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer
entered state STOPPED
2017-05-25 10:23:45,839 DEBUG [main] service.CompositeService
(CompositeService.java:stop(151)) - Stopping service #0: Service
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService in state
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService: STARTED
2017-05-25 10:23:45,839 DEBUG [main] service.AbstractService
(AbstractService.java:enterState(452)) - Service:
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService entered
state STOPPED
2017-05-25 10:23:45,839 INFO [Ping Checker] util.AbstractLivelinessMonitor
(AbstractLivelinessMonitor.java:run(156)) - AMLivelinessMonitor thread
interrupted
2017-05-25 10:23:45,839 DEBUG [main]
delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:stopThreads(638)) - Stopping expired
delegation token remover thread
2017-05-25 10:23:45,839 ERROR [Thread[Thread-85,5,main]] recovery.RMStateStore
(RMStateStore.java:transition(456)) - Error While Storing RMDTMasterKey.
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1406)
at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:990)
at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
at
org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
at
org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
at
org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
at
org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
at
org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$SafeTransaction.commit(ZKRMStateStore.java:1305)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.safeCreate(ZKRMStateStore.java:1261)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeRMDTMasterKeyState(ZKRMStateStore.java:1021)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTMasterKeyTransition.transition(RMStateStore.java:454)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTMasterKeyTransition.transition(RMStateStore.java:438)
at
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:1099)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeRMDTMasterKey(RMStateStore.java:931)
at
org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewMasterKey(RMDelegationTokenSecretManager.java:88)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.storeDelegationKey(AbstractDelegationTokenSecretManager.java:261)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:355)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:375)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:676)
at java.lang.Thread.run(Thread.java:745)
2017-05-25 10:23:45,839 INFO [Ping Checker] util.AbstractLivelinessMonitor
(AbstractLivelinessMonitor.java:run(156)) -
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer
thread interrupted
2017-05-25 10:23:45,839 ERROR [Thread[Thread-85,5,main]] recovery.RMStateStore
(RMStateStore.java:notifyStoreOperationFailedInternal(1131)) - State store
operation failed
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1406)
at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:990)
at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
at
org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
at
org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
at
org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
at
org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
at
org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$SafeTransaction.commit(ZKRMStateStore.java:1305)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.safeCreate(ZKRMStateStore.java:1261)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeRMDTMasterKeyState(ZKRMStateStore.java:1021)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTMasterKeyTransition.transition(RMStateStore.java:454)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTMasterKeyTransition.transition(RMStateStore.java:438)
at
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:1099)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeRMDTMasterKey(RMStateStore.java:931)
at
org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewMasterKey(RMDelegationTokenSecretManager.java:88)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.storeDelegationKey(AbstractDelegationTokenSecretManager.java:261)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:355)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:375)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:676)
at java.lang.Thread.run(Thread.java:745)
2017-05-25 10:23:45,839 DEBUG [SyncThread:0] server.FinalRequestProcessor
(FinalRequestProcessor.java:processRequest(88)) - Processing request::
sessionid:0x15c4034de420001 type:multi cxid:0x28 zxid:0x1d txntype:14
reqpath:n/a
2017-05-25 10:23:45,840 DEBUG [SyncThread:0] server.FinalRequestProcessor
(FinalRequestProcessor.java:processRequest(160)) - sessionid:0x15c4034de420001
type:multi cxid:0x28 zxid:0x1d txntype:14 reqpath:n/a
2017-05-25 10:23:45,840 DEBUG [Thread-50-SendThread(127.0.0.1:24578)]
zookeeper.ClientCnxn (ClientCnxn.java:readResponse(843)) - Reading reply
sessionid:0x15c4034de420001, packet:: clientPath:null serverPath:null
finished:false header:: 40,14 replyHeader:: 40,29,0 request::
org.apache.zookeeper.MultiTransactionRecord@f92aa7c8 response::
org.apache.zookeeper.MultiResponse@fda6e9e
2017-05-25 10:23:45,840 ERROR [Thread[Thread-85,5,main]]
security.RMDelegationTokenSecretManager
(RMDelegationTokenSecretManager.java:storeNewMasterKey(90)) - Error in storing
master key with KeyID: 4
2017-05-25 10:23:45,841 DEBUG [Thread[Thread-85,5,main]] util.ExitUtil
(ExitUtil.java:terminate(209)) - Exiting with status 1:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.lang.InterruptedException
1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.lang.InterruptedException
at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265)
at
org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewMasterKey(RMDelegationTokenSecretManager.java:91)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.storeDelegationKey(AbstractDelegationTokenSecretManager.java:261)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:355)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:375)
at
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:676)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.lang.InterruptedException
at
org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:273)
at
org.apache.hadoop.yarn.event.DrainDispatcher$2.handle(DrainDispatcher.java:91)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.notifyStoreOperationFailedInternal(RMStateStore.java:1134)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.access$1500(RMStateStore.java:86)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTMasterKeyTransition.transition(RMStateStore.java:457)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTMasterKeyTransition.transition(RMStateStore.java:438)
at
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:1099)
at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeRMDTMasterKey(RMStateStore.java:931)
at
org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewMasterKey(RMDelegationTokenSecretManager.java:88)
... 5 more
Caused by: java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
at
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
at
java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
at
org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:265)
... 17 more
2017-05-25 10:23:45,841 INFO [Thread[Thread-85,5,main]] util.ExitUtil
(ExitUtil.java:terminate(210)) - Exiting with status 1:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.lang.InterruptedException
{noformat}
It looks like the master key was rolling just as we were shutting down, and the
InterruptedException ended up bubbling all the way up to the dispatcher, which
caused the JVM to exit. When an InterruptedException occurs, the state store
needs to check whether it is in the process of shutting down and, if so, not
report the interrupt as an error.
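A rough sketch of the kind of check I have in mind, not the actual patch. The
class and every name in it (ShutdownAwareStore, StoreOp, stopping,
notifyFailure) are made up for illustration and are not taken from
RMStateStore or ZKRMStateStore. The assumption is that stop() sets a flag
before the worker threads get interrupted, so an interrupt seen while the flag
is set can be treated as part of normal shutdown instead of a store failure.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a state store; not the real RMStateStore API.
class ShutdownAwareStore {
    // Assumed to be set by stop() before any worker thread is interrupted.
    private final AtomicBoolean stopping = new AtomicBoolean(false);
    // Counts failures that would be escalated (e.g. to a fatal event handler).
    private int fatalErrors = 0;

    // Hypothetical store operation that may block and be interrupted.
    public interface StoreOp {
        void run() throws Exception;
    }

    public void stop() {
        stopping.set(true);
    }

    // Returns true if the operation completed, false if it failed or was abandoned.
    public boolean store(StoreOp op) {
        try {
            op.run();
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt status so callers can still observe it.
            Thread.currentThread().interrupt();
            if (stopping.get()) {
                // Interrupted as part of shutdown: expected, do not escalate.
                return false;
            }
            notifyFailure(e);
            return false;
        } catch (Exception e) {
            notifyFailure(e);
            return false;
        }
    }

    // In this sketch, escalation is just a counter instead of a JVM exit.
    private void notifyFailure(Exception e) {
        fatalErrors++;
    }

    public int fatalErrors() {
        return fatalErrors;
    }
}
```

With something like this, an interrupt that arrives while the store is running
is still reported as a failure, but one that arrives after stop() has been
called is dropped quietly, so the shutdown path would no longer reach the
code that terminates the JVM.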
> ZKRMStateStore can crash during shutdown due to InterruptedException
> --------------------------------------------------------------------
>
> Key: YARN-6647
> URL: https://issues.apache.org/jira/browse/YARN-6647
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Jason Lowe
>
> Noticed some tests were failing due to the JVM shutting down early. I was
> able to reproduce this occasionally with TestKillApplicationWithRMHA.
> Stacktrace to follow.