Hong Zhiguo commented on YARN-4024:

That's a good reason to have this cache.
[~leftnoteasy], in earlier comments you said:
1) If host_a has IP=IP1 and IP1 is on the whitelist, and we then change the IP 
of host_a to IP2, which is on the blacklist, we won't re-resolve because the 
cached IP1 is on the whitelist.
2) If host_a has IP=IP1 and IP1 is on the blacklist, we may need to re-resolve 
every time the node heartbeats, since its IP may change to one that is not on 
the blacklist.
I think that's too complicated. The cache lookup is part of resolving (name 
to address), and the IP whitelist/blacklist check is just the stage that 
follows. I think a cache with configurable expiration is enough. We'd better 
keep the 2 stages orthogonal and not mix them up.
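
The separation described above could look roughly like the following. This is 
only a sketch under assumptions: ExpiringResolver and AddressFilter are 
hypothetical names, not existing YARN classes, the lookup function stands in 
for the real DNS call, and treating an empty include list as "allow all" is an 
assumption made for the example.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Stage 1 (hypothetical): resolve name to address, cached with an expiry. */
final class ExpiringResolver {
  private static final class Entry {
    final String address;
    final long resolvedAtMs;

    Entry(String address, long resolvedAtMs) {
      this.address = address;
      this.resolvedAtMs = resolvedAtMs;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();
  private final Function<String, String> lookup; // stand-in for the DNS call
  private final long expiryMs;                   // configurable expiration
  private final LongSupplier clock;              // injectable for testing

  ExpiringResolver(Function<String, String> lookup, long expiryMs,
      LongSupplier clock) {
    this.lookup = lookup;
    this.expiryMs = expiryMs;
    this.clock = clock;
  }

  /** Returns the cached address, re-resolving only after the entry expires. */
  String resolve(String hostName) {
    long now = clock.getAsLong();
    Entry e = cache.get(hostName);
    if (e == null || now - e.resolvedAtMs >= expiryMs) {
      e = new Entry(lookup.apply(hostName), now);
      cache.put(hostName, e);
    }
    return e.address;
  }
}

/** Stage 2 (hypothetical): list check on an address; unaware of the cache. */
final class AddressFilter {
  private final Set<String> include;
  private final Set<String> exclude;

  AddressFilter(Set<String> include, Set<String> exclude) {
    this.include = include;
    this.exclude = exclude;
  }

  /** Assumption: an empty include list means every address is allowed. */
  boolean isAllowed(String address) {
    return (include.isEmpty() || include.contains(address))
        && !exclude.contains(address);
  }
}
```

With this shape, a heartbeat would call resolve() and then isAllowed() on the 
result. A changed IP is picked up once the cached entry expires, whichever 
list the old address was on, so neither stage needs to know about the other.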

BTW, I think it's not good to have "Name" in NodeId, but "Address" in 
whitelist/blacklist. Different layers of abstraction are mixed up. We wouldn't 
have this issue if either "Name" or "Address" were used for both NodeId and 
whitelist/blacklist. I think a better way is to have "Name" in 
whitelist/blacklist, instead of "Address". 

> YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
> ----------------------------------------------------------------------
>                 Key: YARN-4024
>                 URL: https://issues.apache.org/jira/browse/YARN-4024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Wangda Tan
>            Assignee: Hong Zhiguo
> Currently, YARN RM NodesListManager resolves the IP address every time a 
> node heartbeats. When the DNS server becomes slow, NM heartbeats will be 
> blocked and cannot make progress.

This message was sent by Atlassian JIRA
