[
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235166#comment-15235166
]
Jason Lowe commented on YARN-4311:
----------------------------------
Thanks for updating the patch! Couple of comments:
The patch has a lot of diffs from the trunk patch. I would expect them to be
somewhat similar, but there's a lot of unnecessary diffs here, like final
modifiers in one but not the other, inconsistent line breaks for essentially
the same change, etc. Please examine the diff of the two patches and keep in
mind these should be the minimal necessary. Otherwise we're just making other
backports to branch-2.7 that much more difficult. If we really need the final
modifiers, etc, then why didn't we need them in the trunk patch?
When we walk the inactive nodes list to remove nodes, what if the node isn't in
the DECOMMISSIONED state (e.g.: LOST, REBOOTED)? Wondering if the
decommissioned metric could go negative in those cases and we could end up with
wrong lost/rebooted metrics. If so, I believe this could happen in the trunk
patch as well.
> Removing nodes from include and exclude lists will not remove them from
> decommissioned nodes list
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-4311
> URL: https://issues.apache.org/jira/browse/YARN-4311
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.6.1
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Fix For: 2.8.0
>
> Attachments: YARN-4311-branch-2.7.001.patch,
> YARN-4311-branch-2.7.002.patch, YARN-4311-branch-2.7.003.patch,
> YARN-4311-branch-2.7.004.patch, YARN-4311-v1.patch, YARN-4311-v10.patch,
> YARN-4311-v11.patch, YARN-4311-v11.patch, YARN-4311-v12.patch,
> YARN-4311-v13.patch, YARN-4311-v13.patch, YARN-4311-v14.patch,
> YARN-4311-v2.patch, YARN-4311-v3.patch, YARN-4311-v4.patch,
> YARN-4311-v5.patch, YARN-4311-v6.patch, YARN-4311-v7.patch,
> YARN-4311-v8.patch, YARN-4311-v9.patch
>
>
> In order to fully forget about a node, removing the node from include and
> exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The
> tricky part that [~jlowe] pointed out was the case when include lists are not
> used, in that case we don't want the nodes to fall off if they are not active.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)