[ https://issues.apache.org/jira/browse/YARN-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745946#comment-16745946 ]
Feng Yuan commented on YARN-6607:
---------------------------------

Hi [~gouri shankar],

I am sorry for the long delay in replying; this issue got lost among my routine work over the past two years.

If the cause is that the thread pool quota was reached, I have submitted a patch above. However, I still suspect there is some other FATAL problem, along the lines of: XXX happens -> service stops -> thread pool stops -> executor.submit is rejected. So does such an XXX exist? I hope to hear your thoughts.

> YARN Resource Manager quits with the exception java.util.concurrent.RejectedExecutionException:
> ------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6607
>                 URL: https://issues.apache.org/jira/browse/YARN-6607
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Anandhaprabhu
>            Priority: Major
>
> ResourceManager goes down frequently with the below exception
> 2017-05-16 03:32:36,897 FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(189)) - Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@9efeac9 rejected from java.util.concurrent.ThreadPoolExecutor@42ab30[Shutting down, pool size = 16, active threads = 0, queued tasks = 0, completed tasks = 223337]
>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>         at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>         at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
>         at org.apache.hadoop.registry.server.services.RegistryAdminService.submit(RegistryAdminService.java:176)
>         at org.apache.hadoop.registry.server.integration.RMRegistryOperationsService.purgeRecordsAsync(RMRegistryOperationsService.java:200)
>         at org.apache.hadoop.registry.server.integration.RMRegistryOperationsService.purgeRecordsAsync(RMRegistryOperationsService.java:170)
>         at org.apache.hadoop.registry.server.integration.RMRegistryOperationsService.onContainerFinished(RMRegistryOperationsService.java:146)
>         at org.apache.hadoop.yarn.server.resourcemanager.registry.RMRegistryService.handleAppAttemptEvent(RMRegistryService.java:151)
>         at org.apache.hadoop.yarn.server.resourcemanager.registry.RMRegistryService$AppEventHandler.handle(RMRegistryService.java:183)
>         at org.apache.hadoop.yarn.server.resourcemanager.registry.RMRegistryService$AppEventHandler.handle(RMRegistryService.java:177)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:276)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>         at java.lang.Thread.run(Thread.java:745)
> 2017-05-16 03:32:36,898 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
> 2017-05-16 03:32:36,898 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x15b8703e986b750 closed
> 2017-05-16 03:32:36,898 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(623)) - completedContainer queue=high usedCapacity=0.41496983 absoluteUsedCapacity=0.29047886 used=<memory:3469312, vCores:847> cluster=<memory:11943424, vCores:6064>
> 2017-05-16 03:32:36,905 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(640)) - Re-sorting completed queue: root.high.lawful stats: lawful: capacity=0.3, absoluteCapacity=0.21000001, usedResources=<memory:417792, vCores:102>, usedCapacity=0.16657583, absoluteUsedCapacity=0.034980923, numApps=19, numContainers=102
> 2017-05-16 03:32:36,905 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(623)) - completedContainer queue=root usedCapacity=0.41565567 absoluteUsedCapacity=0.41565567 used=<memory:4964352, vCores:1212> cluster=<memory:11943424, vCores:6064>
> 2017-05-16 03:32:36,906 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(640)) - Re-sorting completed queue: root.high stats: high: numChildQueue= 4, capacity=0.7, absoluteCapacity=0.7, usedResources=<memory:3469312, vCores:847>usedCapacity=0.41496983, numApps=61, numContainers=847
> 2017-05-16 03:32:36,906 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1562)) - Application attempt appattempt_1494886223429_7023_000001 released container container_e43_1494886223429_7023_01_000043 on node: host: r13d8.hadoop.log10.blackberry:45454 #containers=1 available=<memory:36864, vCores:23> used=<memory:4096, vCores:1> with event: FINISHED
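
The chain suspected in the comment (service stops -> thread pool stops -> executor.submit is rejected) can be reproduced in isolation with plain java.util.concurrent, which may help when checking whether the pool was shut down rather than saturated. The "Shutting down" state with 0 active threads and 0 queued tasks in the reported exception message appears consistent with that reading, though it does not identify what triggered the shutdown. The sketch below is not Hadoop code: the class name `RejectedAfterShutdownDemo` and the helper `submitIfRunning` are hypothetical, and the guard is only one possible way a caller could avoid the exception escaping into a dispatcher thread; it is not the patch referred to above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-alone demo; not part of Hadoop or the YARN-6607 patch.
public class RejectedAfterShutdownDemo {

    /**
     * Shuts a pool down first, then submits a task.
     * Returns true if the submit was rejected.
     */
    public static boolean submitAfterShutdownIsRejected() {
        // newFixedThreadPool uses the default AbortPolicy handler,
        // the same handler that appears in the reported stack trace.
        ExecutorService pool = Executors.newFixedThreadPool(16);
        pool.shutdown(); // stands in for "service stops -> thread pool stops"
        try {
            pool.submit(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    /**
     * Hypothetical guard: submits only while the pool is running and
     * swallows a rejection caused by a concurrent shutdown.
     * Returns true if the task was accepted.
     */
    public static boolean submitIfRunning(ExecutorService pool, Runnable task) {
        if (pool.isShutdown()) {
            return false;
        }
        try {
            pool.submit(task);
            return true;
        } catch (RejectedExecutionException e) {
            // The pool was shut down between the check and the submit.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: "
                + submitAfterShutdownIsRejected());
    }
}
```

The `isShutdown()` check alone is racy, which is why the sketch still catches `RejectedExecutionException`; whether dropping the task silently is acceptable would depend on the caller.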