[
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907544#comment-15907544
]
sandflee commented on YARN-4051:
--------------------------------
Thanks [~jlowe],
bq. I'm also wondering about the scenario where the kill event is coming in
from an AM and not the RM.
We can simply throw a YarnException when the AM stops a recovering container, but
it seems NMClientAsyncImpl can't try stopContainer again afterwards. Could we fix
that in a new issue?
{code}
.addTransition(ContainerState.RUNNING,
EnumSet.of(ContainerState.DONE, ContainerState.FAILED),
ContainerEventType.STOP_CONTAINER,
new StopContainerTransition())
{code}
The patch also makes two other changes:
1. Use app.handle(new ApplicationContainerInitEvent(container)) when recovering
containers. There is a race condition: if a finish event arrives while the
ApplicationContainerInitEvent has not been processed, the containers have not
yet been added to the app.
2. Use a ConcurrentHashMap to store the containers in the app, because I
encountered a ConcurrentModificationException when iterating over
app.getContainers(). I also see the web code and AppLogAggregator using
app.getContainers() without protection.
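
The second change can be illustrated with a minimal, single-threaded sketch. It is not NodeManager code: the class name ContainerMapSketch, the method failsOnRemovalDuringIteration and the container IDs are hypothetical, and the in-loop removal only stands in for another component modifying the map while it is being iterated. Under those assumptions, a HashMap iterator fails fast, while a ConcurrentHashMap iterator is weakly consistent and keeps going.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the app's container map; not actual YARN code.
class ContainerMapSketch {

    // Fills the map with three entries, then removes one entry while
    // iterating over the key set. Returns true if the iterator threw
    // ConcurrentModificationException, false if iteration completed.
    static boolean failsOnRemovalDuringIteration(Map<String, String> containers) {
        containers.put("container_1", "RUNNING");
        containers.put("container_2", "RUNNING");
        containers.put("container_3", "RUNNING");
        try {
            for (String id : containers.keySet()) {
                // Simulates a finished container being removed elsewhere
                // while this loop is still walking the map.
                containers.remove("container_2");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: "
            + failsOnRemovalDuringIteration(new HashMap<>()));
        System.out.println("ConcurrentHashMap fails: "
            + failsOnRemovalDuringIteration(new ConcurrentHashMap<>()));
    }
}
```

Note that ConcurrentHashMap only makes iteration safe. A reader may still see, or miss, entries added or removed during the traversal, so callers that need a consistent snapshot would have to copy the map first.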
> ContainerKillEvent is lost when container is In New State and is recovering
> ----------------------------------------------------------------------------
>
> Key: YARN-4051
> URL: https://issues.apache.org/jira/browse/YARN-4051
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: sandflee
> Assignee: sandflee
> Priority: Critical
> Attachments: YARN-4051.01.patch, YARN-4051.02.patch,
> YARN-4051.03.patch, YARN-4051.04.patch, YARN-4051.05.patch,
> YARN-4051.06.patch, YARN-4051.07.patch
>
>
> As in YARN-4050, the NM event dispatcher is blocked and the container is in
> the New state. When we finish the application, the container is still alive
> even after the NM event dispatcher is unblocked.