[
https://issues.apache.org/jira/browse/YARN-9332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780150#comment-16780150
]
Weiwei Yang commented on YARN-9332:
-----------------------------------
Hi [~cltlfcjin]
Thanks, that sounds good to me.
Regarding the patch, please see my comments below.
1. RackResolver#coreResolve
Currently it only handles the case where the returned list is empty. This might
not be enough. Since {{DNSToSwitchMapping#resolve}} is pluggable, if one host
out of many cannot be resolved, we cannot be certain what the returned value
will be. It could be a default rack, or null. So I think that even when the
returned list is not empty, we still need to check every value:
{code:java}
for (int i = 0; i < hostNames.size(); i++) {
  if (Strings.isNullOrEmpty(rNameList.get(i))) {
    // fall back to the default rack
  }
  ...
}
{code}
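To make the per-entry check concrete, here is a minimal, self-contained sketch. It is not the patch itself: {{SwitchMapping}} is a hypothetical stand-in for the pluggable resolver so the example compiles without Hadoop on the classpath, and the {{DEFAULT_RACK}} value and the method name {{resolveWithFallback}} are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a pluggable resolver such as DNSToSwitchMapping;
// declared here only so the sketch compiles on its own.
interface SwitchMapping {
  List<String> resolve(List<String> hostNames);
}

final class RackFallbackSketch {
  // Assumed default rack name for this sketch.
  static final String DEFAULT_RACK = "/default-rack";

  // Returns exactly one rack per host. Any entry the mapping left null,
  // empty, or missing (list shorter than the input) becomes DEFAULT_RACK.
  static List<String> resolveWithFallback(SwitchMapping mapping,
                                          List<String> hostNames) {
    List<String> resolved = mapping.resolve(hostNames);
    List<String> racks = new ArrayList<>(hostNames.size());
    for (int i = 0; i < hostNames.size(); i++) {
      String rack = (resolved != null && i < resolved.size())
          ? resolved.get(i)
          : null;
      racks.add(rack == null || rack.isEmpty() ? DEFAULT_RACK : rack);
    }
    return racks;
  }
}
```

The sketch also treats a null or too-short list as unresolved entries, which goes slightly beyond the null-or-empty check above; whether that is desirable depends on what the real resolver implementations are allowed to return.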
2. TestRackResolver#testCaching
This UT was meant to verify that the result is properly cached. {{MyResolver}}
records the number of hosts it resolved in the variable {{numHost1}}, and
{{testCaching}} runs resolve twice on host1 and then verifies that the
resolution happens only once. But the modified version of your UT changes
that. Can you revise the UT to make sure the cache behavior is still tested?
You can add another one to verify resolution of multiple hosts at a time.
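As a rough illustration of the two checks I have in mind, here is a self-contained sketch. It does not use the real test classes: {{CountingResolver}} and {{CachingResolver}} are hypothetical stand-ins for the counting resolver and for whatever caching layer sits in front of it, and the rack name is made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical counting resolver: records how many times host1 reaches
// the underlying (uncached) resolution.
final class CountingResolver {
  int numHost1 = 0;

  List<String> resolve(List<String> hostNames) {
    List<String> racks = new ArrayList<>();
    for (String host : hostNames) {
      if (host.equals("host1")) {
        numHost1++;
      }
      racks.add("/rack1"); // made-up rack for every host
    }
    return racks;
  }
}

// Hypothetical caching wrapper standing in for the real cache.
final class CachingResolver {
  private final CountingResolver raw;
  private final Map<String, String> cache = new HashMap<>();

  CachingResolver(CountingResolver raw) {
    this.raw = raw;
  }

  List<String> resolve(List<String> hostNames) {
    // Forward only the hosts that are not cached yet.
    List<String> uncached = new ArrayList<>();
    for (String host : hostNames) {
      if (!cache.containsKey(host) && !uncached.contains(host)) {
        uncached.add(host);
      }
    }
    if (!uncached.isEmpty()) {
      List<String> resolved = raw.resolve(uncached);
      for (int i = 0; i < uncached.size(); i++) {
        cache.put(uncached.get(i), resolved.get(i));
      }
    }
    // Answer every requested host from the cache, in input order.
    List<String> result = new ArrayList<>();
    for (String host : hostNames) {
      result.add(cache.get(host));
    }
    return result;
  }
}
```

One test would resolve host1 twice through the caching layer and assert that {{numHost1}} is 1. A separate test would pass several hosts in one call and assert that one rack comes back per host, without touching the caching assertion.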
Thanks
> RackResolver tool should accept multiple hosts
> ----------------------------------------------
>
> Key: YARN-9332
> URL: https://issues.apache.org/jira/browse/YARN-9332
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
> Reporter: Lantao Jin
> Assignee: Lantao Jin
> Priority: Minor
> Attachments: YARN-9332.001.patch
>
>
> RackResolver, as a public rack resolver tool, only offers the method {{public
> static Node resolve(String hostName)}}, which accepts only one host at a time.
> However, the internal implementation class {{DNSToSwitchMapping}} always
> accepts a host list as its input and returns a list of resolved racks. As a
> result, an invoker such as Spark takes a long time to resolve rack info when
> handling a large number of tasks (it loops many times, executing the script
> to resolve rack info on each iteration).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)