[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907544#comment-15907544 ]
sandflee commented on YARN-4051:
--------------------------------

Thanks [~jlowe],

bq. I'm also wondering about the scenario where the kill event is coming in from an AM and not the RM.

The patch simply throws a YarnException when the AM stops a recovering container, but it seems NMClientAsyncImpl couldn't try stopContainer again. Could we fix this in a new issue?
{code}
.addTransition(ContainerState.RUNNING,
    EnumSet.of(ContainerState.DONE, ContainerState.FAILED),
    ContainerEventType.STOP_CONTAINER,
    new StopContainerTransition())
{code}

The patch makes two other changes:
1. Use app.handle(new ApplicationContainerInitEvent(container)) when recovering containers, because there is a race condition: when a finish event arrives, ApplicationContainerInitEvent may not have been processed yet, so the containers have not been added to the app.
2. Use a ConcurrentHashMap to store containers in the app, because I encountered a ConcurrentModificationException when iterating app.getContainers(), and I also see the web UI and AppLogAggregator using app.getContainers() without protection.

> ContainerKillEvent is lost when container is In New State and is recovering
> ----------------------------------------------------------------------------
>
>                 Key: YARN-4051
>                 URL: https://issues.apache.org/jira/browse/YARN-4051
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: sandflee
>            Assignee: sandflee
>            Priority: Critical
>         Attachments: YARN-4051.01.patch, YARN-4051.02.patch,
> YARN-4051.03.patch, YARN-4051.04.patch, YARN-4051.05.patch,
> YARN-4051.06.patch, YARN-4051.07.patch
>
>
> As in YARN-4050, the NM event dispatcher is blocked and the container is in
> the New state; when we finish the application, the container is still alive
> even after the NM event dispatcher is unblocked.
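
For readers unfamiliar with the ConcurrentHashMap change described in the comment, here is a minimal, self-contained sketch of the failure mode. It is not the NodeManager code: the class ContainerMapSketch, the method failsOnConcurrentUpdate, and the String keys and values are hypothetical stand-ins for the app's container map. It only shows that a plain HashMap iterator fails fast when the map is structurally modified during iteration, while a ConcurrentHashMap iterator is weakly consistent and does not throw.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the app's container map; not the actual YARN classes.
class ContainerMapSketch {

    // Iterates over the map while adding a new entry mid-iteration.
    // Returns true if the iterator fails fast with ConcurrentModificationException.
    static boolean failsOnConcurrentUpdate(Map<String, String> containers) {
        containers.put("container_1", "RUNNING");
        containers.put("container_2", "RUNNING");
        try {
            for (String id : containers.keySet()) {
                // Simulates a recovered container being registered while
                // another code path (web UI, log aggregator) is iterating.
                containers.put("container_recovered", "NEW");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: "
            + failsOnConcurrentUpdate(new HashMap<>()));
        System.out.println("ConcurrentHashMap fails: "
            + failsOnConcurrentUpdate(new ConcurrentHashMap<>()));
    }
}
```

The sketch triggers the exception from a single thread to keep it deterministic; in the NodeManager the modification would come from a different thread than the iteration, but the fail-fast check involved is the same.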