[
https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
caozhiqiang updated YARN-10501:
-------------------------------
Attachment: YARN-10501.003.patch
> Can't remove all node labels after add node label without nodemanager port
> --------------------------------------------------------------------------
>
> Key: YARN-10501
> URL: https://issues.apache.org/jira/browse/YARN-10501
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 3.4.0
> Reporter: caozhiqiang
> Assignee: caozhiqiang
> Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10501.002.patch, YARN-10501.003.patch
>
>
> When a label is added to a node without specifying the nodemanager port, or
> with WILDCARD_PORT (0), the label info cannot be fully removed from that
> node afterwards.
> Steps to reproduce:
> {code:java}
> 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)"
> 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode"
> 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
> {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}}}}}}
> 4.yarn rmadmin -replaceLabelsOnNode "server001"
> 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
> {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}}}}}}
> {code}
> After step 4, which should remove the node's labels, the label mapping
> still lists server001:45454 under cpunode.
>
> {code:java}
> 641 case REPLACE:
> 642 replaceNodeForLabels(nodeId, host.labels, labels);
> 643 replaceLabelsForNode(nodeId, host.labels, labels);
> 644 host.labels.clear();
> 645 host.labels.addAll(labels);
> 646 for (Node node : host.nms.values()) {
> 647 replaceNodeForLabels(node.nodeId, node.labels, labels);
> 649 node.labels = null;
> 650 }
> 651 break;{code}
>
> The cause is at line 647. When labels are added to a node without a port,
> both the port 0 entry and the real NM port entry are added to the node
> info. When the labels are removed, the node.labels argument at line 647 is
> null, so the old label is not removed for the real NM port entry.
> I did not add a UT because there is already a test case,
> testGetNodeLabelsInfo(). It calls the getLabelsByNode method, which uses
> node.labels or host.labels for the check.
> {code:java}
> 1083 protected Set<String> getLabelsByNode(NodeId nodeId, Map<String, Host>
> map) {
> 1084 Host host = map.get(nodeId.getHost());
> 1085 if (null == host) {
> 1086 return EMPTY_STRING_SET;
> 1087 }
> 1088 Node nm = host.nms.get(nodeId);
> 1089 if (null != nm && null != nm.labels) {
> 1090 return nm.labels;
> 1091 } else {
> 1092 return host.labels;
> 1093 }
> 1094 }{code}
> At line 1092, host.labels is used when nm.labels is null, so the
> testGetNodeLabelsInfo UT already covers this case.
>
>
>
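The cause described above can be reproduced in isolation. The following is a
minimal sketch, not the YARN code and not necessarily what the attached patch
does: the class LabelRemovalSketch, its fields, and the fallBackToHostLabels
switch are hypothetical names introduced for illustration. Only the shape of
replaceNodeForLabels (skip removal when the old label set is null) and the
host/NM entries follow the issue text. The fallback shown is one possible
remedy under those assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model of one host with a single NM entry.
class LabelRemovalSketch {
    static final String WILDCARD_ID = "server001:0";   // port-less request maps to port 0
    static final String NM_ID = "server001:45454";     // real NM port from the issue

    // label -> node ids carrying that label (stand-in for the label-mappings view)
    private final Map<String, Set<String>> labelToNodes = new HashMap<>();
    // labels stored on the host entry
    private final Set<String> hostLabels = new HashSet<>();
    // labels stored on the NM entry; null means "inherit from the host"
    private Set<String> nmLabels = null;
    // assumption: toggles the illustrative remedy, not a real YARN setting
    private final boolean fallBackToHostLabels;

    LabelRemovalSketch(boolean fallBackToHostLabels) {
        this.fallBackToHostLabels = fallBackToHostLabels;
    }

    // Same shape as in the issue: nothing is removed when oldLabels is null.
    private void replaceNodeForLabels(String nodeId, Set<String> oldLabels,
                                      Set<String> newLabels) {
        if (oldLabels != null) {
            for (String label : oldLabels) {
                Set<String> nodes = labelToNodes.get(label);
                if (nodes != null) {
                    nodes.remove(nodeId);
                }
            }
        }
        for (String label : newLabels) {
            labelToNodes.computeIfAbsent(label, k -> new TreeSet<>()).add(nodeId);
        }
    }

    // Models the REPLACE case for a request that names the host without a port.
    void replaceLabelsOnHost(Set<String> labels) {
        Set<String> previousHostLabels = new HashSet<>(hostLabels);
        replaceNodeForLabels(WILDCARD_ID, hostLabels, labels);
        hostLabels.clear();
        hostLabels.addAll(labels);
        // The NM entry inherits from the host, so its own label set is null here.
        Set<String> oldNmLabels = (nmLabels == null && fallBackToHostLabels)
                ? previousHostLabels
                : nmLabels;
        replaceNodeForLabels(NM_ID, oldNmLabels, labels);
        nmLabels = null;
    }

    Set<String> nodesWith(String label) {
        return labelToNodes.getOrDefault(label, new TreeSet<>());
    }

    // Add cpunode, then replace with an empty label set, and report what is left.
    static Set<String> run(boolean fallBackToHostLabels) {
        LabelRemovalSketch sketch = new LabelRemovalSketch(fallBackToHostLabels);
        sketch.replaceLabelsOnHost(Collections.singleton("cpunode"));
        sketch.replaceLabelsOnHost(Collections.<String>emptySet());
        return sketch.nodesWith("cpunode");
    }

    public static void main(String[] args) {
        System.out.println("null old labels:  " + run(false)); // [server001:45454]
        System.out.println("host fallback:    " + run(true));  // []
    }
}
```

With the null old-label set, the NM entry stays under cpunode, which matches
the response in step 5. Passing the host's previous labels as the old set
clears it in this model.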