[
https://issues.apache.org/jira/browse/YARN-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585300#comment-14585300
]
zhihai xu commented on YARN-3802:
---------------------------------
I uploaded a patch, YARN-3802.000.patch, for review. The patch fixes the issue
by passing the old RMNode in the NodeAddedSchedulerEvent and updating the old
RMNode's capability to match the new RMNode's capability.
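
To illustrate the idea, here is a minimal sketch of the approach described
above. It is not the code from the patch: Node, contextNodes, schedulerNodes,
register and reconnect are hypothetical stand-ins for RMNode, the RMContext
node map, the scheduler's node map and the reconnect handling, and the
capability is reduced to a single memory value.

```java
import java.util.HashMap;
import java.util.Map;

public class ReconnectSketch {
    // Hypothetical stand-in for RMNode; not the real Hadoop class.
    static final class Node {
        final String nodeId;
        int memoryMb; // stand-in for the node's capability

        Node(String nodeId, int memoryMb) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
        }
    }

    // Hypothetical stand-ins for the RMContext node map and the scheduler's view.
    static final Map<String, Node> contextNodes = new HashMap<>();
    static final Map<String, Node> schedulerNodes = new HashMap<>();

    // First registration: both sides receive the same reference.
    static void register(Node node) {
        contextNodes.put(node.nodeId, node);
        schedulerNodes.put(node.nodeId, node);
    }

    // Reconnect: keep the old node, copy the new capability onto it, and
    // hand the old node (not the new one) to the scheduler.
    static void reconnect(Node newNode) {
        Node oldNode = contextNodes.get(newNode.nodeId);
        if (oldNode == null) {
            register(newNode);
            return;
        }
        oldNode.memoryMb = newNode.memoryMb;
        // Stand-in for sending a NodeAddedSchedulerEvent that carries oldNode.
        schedulerNodes.put(oldNode.nodeId, oldNode);
    }

    public static void main(String[] args) {
        Node first = new Node("host1:8041", 8192);
        register(first);
        reconnect(new Node("host1:8041", 16384));
        // Both lookups should yield the same object after the reconnect.
        System.out.println(contextNodes.get("host1:8041") == schedulerNodes.get("host1:8041"));
        System.out.println(first.memoryMb);
    }
}
```

Because the scheduler is handed the old node, both maps keep pointing at one
object per NodeId, and only its capability changes on reconnect.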
> Two RMNodes for the same NodeId are used in RM sometimes after NM is
> reconnected.
> ---------------------------------------------------------------------------------
>
> Key: YARN-3802
> URL: https://issues.apache.org/jira/browse/YARN-3802
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.7.0
> Reporter: zhihai xu
> Assignee: zhihai xu
> Attachments: YARN-3802.000.patch
>
>
> After an NM reconnects, the RM sometimes ends up with two RMNodes for the
> same NodeId: the scheduler and RMContext hold different RMNode references
> for that NodeId, which is incorrect. The scheduler and RMContext should
> always use the same RMNode reference for a given NodeId.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)