[ https://issues.apache.org/jira/browse/YARN-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285561#comment-16285561 ]
Chandni Singh commented on YARN-7565:
-------------------------------------
[~jianhe] Thanks for looking at the patch.
Regarding the last race condition, I assumed that the events in a component
are processed sequentially rather than in parallel.
If that's not the case, then we will have to synchronize on the entire
unRecoveredInstance map.
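
For illustration only, here is a minimal sketch of what synchronizing on the
whole map could look like if component events did turn out to run in parallel.
The class and method names are hypothetical and are not taken from the patch;
the point is just that "recovered" and "timed out" both do their
check-then-remove under the same lock, so only one of them can win for a given
container.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a component's map of instances awaiting recovery. */
class UnrecoveredInstances {
    // Plain HashMap; every access goes through synchronized (pending).
    private final Map<String, String> pending = new HashMap<>();

    /** Record that a container is expected to be recovered for an instance. */
    void expect(String containerId, String instanceName) {
        synchronized (pending) {
            pending.put(containerId, instanceName);
        }
    }

    /** RM reported the container: returns the instance name, or null if it was already removed. */
    String claimRecovered(String containerId) {
        synchronized (pending) {
            return pending.remove(containerId);
        }
    }

    /** Timeout fired: returns true only if this call removed the entry. */
    boolean releaseIfStillPending(String containerId) {
        synchronized (pending) {
            if (!pending.containsKey(containerId)) {
                return false; // already claimed by a recovery event
            }
            pending.remove(containerId);
            return true;
        }
    }
}
```

If events are in fact processed one at a time per component, the lock is
uncontended and only adds a small cost.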
> Yarn service pre-maturely releases the container after AM restart
> ------------------------------------------------------------------
>
> Key: YARN-7565
> URL: https://issues.apache.org/jira/browse/YARN-7565
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Chandni Singh
> Assignee: Chandni Singh
> Fix For: yarn-native-services
>
> Attachments: YARN-7565.001.patch, YARN-7565.002.patch,
> YARN-7565.003.patch
>
>
> With YARN-6168, recovered containers can be reported to AM in response to the
> AM heartbeat.
> Currently, the Service Master immediately releases any containers that are
> not reported in the AM registration response.
> Instead, the master can wait for a configured amount of time for the
> containers to be recovered by RM. These containers are sent to AM in the
> heartbeat response. If a container is not reported within the configured
> interval, the master can then release it.
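
The wait-then-release behaviour described in the issue can be sketched roughly
as follows. This is an assumption-laden illustration, not the code in the
attached patches: the class name, the method names, and the idea of passing
the current time in explicitly are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical tracker for containers missing from the AM registration response. */
class RecoveryWaitTracker {
    private final long waitMillis;                                  // configured wait interval
    private final Map<String, Long> deadlines = new HashMap<>();    // containerId -> deadline

    RecoveryWaitTracker(long waitMillis) {
        this.waitMillis = waitMillis;
    }

    /** Start waiting for a container that was not reported at registration. */
    synchronized void awaitRecovery(String containerId, long nowMillis) {
        deadlines.put(containerId, nowMillis + waitMillis);
    }

    /** Container arrived in a heartbeat response; returns true if we were still waiting for it. */
    synchronized boolean onRecovered(String containerId) {
        return deadlines.remove(containerId) != null;
    }

    /** Returns the containers whose wait interval has elapsed and stops tracking them. */
    synchronized List<String> expire(long nowMillis) {
        List<String> toRelease = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                toRelease.add(entry.getKey());
                it.remove();
            }
        }
        return toRelease;
    }
}
```

A caller would invoke expire(...) periodically and release whatever it
returns; containers reported by RM before their deadline never appear in that
list.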
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)