[
https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
caozhiqiang updated YARN-10501:
-------------------------------
Description:
When a label is added to a node without specifying the NodeManager port, or
with WILDCARD_PORT (0) as the port, the label info cannot be fully removed
from that node afterwards.
Steps to reproduce:
{code:java}
1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)"
2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode"
3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
{"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}}}}}}
4.yarn rmadmin -replaceLabelsOnNode "server001"
5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
{"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}}}}}}
{code}
After step 4, which should remove the NodeManager's labels, the label info is
still present in the node info.
{code:java}
641     case REPLACE:
642       replaceNodeForLabels(nodeId, host.labels, labels);
643       replaceLabelsForNode(nodeId, host.labels, labels);
644       host.labels.clear();
645       host.labels.addAll(labels);
646       for (Node node : host.nms.values()) {
647         replaceNodeForLabels(node.nodeId, node.labels, labels);
649         node.labels = null;
650       }
651       break;
{code}
The cause is at line 647. When labels are added to a node without a port, both
the port-0 entry and the real NM port are added to the node info. When the
labels are later removed, the node.labels argument at line 647 is null, so the
old label is not removed.
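To illustrate the cause described above, here is a minimal, self-contained sketch. It is a simplified model and not the actual ResourceManager code: the class LabelReplaceSketch, the method replaceOnHost, the string node ids, and the fallBackToHostLabels switch are hypothetical names introduced only for illustration. The fallback shown is just one possible remedy under these assumptions, not necessarily what the attached patch does.

```java
import java.util.*;

public class LabelReplaceSketch {
  // label -> node ids ("host:port"); stands in for the label-to-nodes mapping.
  private final Map<String, Set<String>> labelToNodes = new HashMap<>();
  // Labels stored at host level (what a port-less or port-0 update writes).
  private final Set<String> hostLabels = new HashSet<>();
  // Per-NodeManager labels; null means "inherit from the host".
  private final Map<String, Set<String>> nmLabels = new HashMap<>();

  // Models the behaviour described in the issue: old labels are only
  // removed from the mapping when oldLabels is non-null.
  private void replaceNodeForLabels(String nodeId, Set<String> oldLabels,
      Set<String> newLabels) {
    if (oldLabels != null) {
      for (String label : oldLabels) {
        Set<String> nodes = labelToNodes.get(label);
        if (nodes != null) {
          nodes.remove(nodeId);
        }
      }
    }
    for (String label : newLabels) {
      labelToNodes.computeIfAbsent(label, k -> new TreeSet<>()).add(nodeId);
    }
  }

  // Simplified REPLACE on a host given without a port (treated as port 0).
  private void replaceOnHost(String host, Set<String> labels,
      boolean fallBackToHostLabels) {
    // Snapshot taken before hostLabels is cleared below.
    Set<String> oldHostLabels = new HashSet<>(hostLabels);
    replaceNodeForLabels(host + ":0", hostLabels, labels);
    hostLabels.clear();
    hostLabels.addAll(labels);
    for (Map.Entry<String, Set<String>> nm : nmLabels.entrySet()) {
      Set<String> old = nm.getValue();
      if (old == null && fallBackToHostLabels) {
        old = oldHostLabels; // hypothetical remedy, an assumption of this sketch
      }
      replaceNodeForLabels(nm.getKey(), old, labels);
      nm.setValue(null);
    }
  }

  // Replays steps 2 and 4 of the reproduction and returns the nodes that
  // are still mapped to "cpunode" afterwards.
  public static Set<String> run(boolean fallBackToHostLabels) {
    LabelReplaceSketch s = new LabelReplaceSketch();
    s.nmLabels.put("server001:45454", null); // running NM, no labels of its own
    s.replaceOnHost("server001", Collections.singleton("cpunode"),
        fallBackToHostLabels);
    s.replaceOnHost("server001", Collections.emptySet(), fallBackToHostLabels);
    return s.labelToNodes.getOrDefault("cpunode", Collections.emptySet());
  }

  public static void main(String[] args) {
    System.out.println("without fallback: " + run(false));
    System.out.println("with fallback:    " + run(true));
  }
}
```

In this model, run(false) leaves server001:45454 mapped to cpunode, which matches the output of step 5 above, while run(true) leaves no node mapped to the label.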
By the way, I did not add a UT because there is already a test case,
testGetNodeLabelsInfo(). It calls the getLabelsByNode method, which uses
node.labels or host.labels for the check.
{code:java}
1083   protected Set<String> getLabelsByNode(NodeId nodeId, Map<String, Host> map) {
1084     Host host = map.get(nodeId.getHost());
1085     if (null == host) {
1086       return EMPTY_STRING_SET;
1087     }
1088     Node nm = host.nms.get(nodeId);
1089     if (null != nm && null != nm.labels) {
1090       return nm.labels;
1091     } else {
1092       return host.labels;
1093     }
1094   }
{code}
At line 1092, host.labels is used if nm.labels is null, so the
testGetNodeLabelsInfo UT already covers this case.
was:
When a label is added to a node without specifying the NodeManager port, or
with WILDCARD_PORT (0) as the port, the label info cannot be fully removed
from that node afterwards.
Steps to reproduce:
{code:java}
1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)"
2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode"
3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
{"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}}}}}}
4.yarn rmadmin -replaceLabelsOnNode "server001"
5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
{"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}}}}}}
{code}
After step 4, which should remove the NodeManager's labels, the label info is
still present in the node info.
> Can't remove all node labels after add node label without nodemanager port
> --------------------------------------------------------------------------
>
> Key: YARN-10501
> URL: https://issues.apache.org/jira/browse/YARN-10501
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 3.4.0
> Reporter: caozhiqiang
> Assignee: caozhiqiang
> Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10501.002.patch
>
>