Wangda Tan created YARN-4113:
--------------------------------
Summary: RM should respect retry-interval when uses
RetryPolicies.RETRY_FOREVER
Key: YARN-4113
URL: https://issues.apache.org/jira/browse/YARN-4113
Project: Hadoop YARN
Issue Type: Bug
Reporter: Wangda Tan
Priority: Critical
Found an issue in how RMProxy initializes its RetryPolicy, in
RMProxy#createRetryPolicy: when rmConnectWaitMS is set to -1 (wait forever), it
uses RetryPolicies.RETRY_FOREVER, which doesn't respect the
{{yarn.resourcemanager.connect.retry-interval.ms}} setting.
RetryPolicies.RETRY_FOREVER uses 0 as the interval, so it retries immediately
with no pause between attempts. When I ran
{{TestYarnClient#testShouldNotRetryForeverForNonNetworkExceptions}} without the
localhost name set up properly, it wrote 14G of DEBUG exception messages to the
system before it died. This would be very bad if the same thing happened in a
production cluster.
We should fix two places:
- Make RETRY_FOREVER able to take the retry interval as a constructor parameter.
- Respect retry-interval when we use the RETRY_FOREVER policy.
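The two fixes above could look roughly like the following. This is only a
minimal standalone sketch, not the actual Hadoop code: the types
RetryDecision and SimpleRetryPolicy, the factory retryForeverWithFixedSleep,
and the two-argument createRetryPolicy are hypothetical stand-ins written for
illustration, and the bounded (non -1) branch is simplified.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class RetryForeverSketch {

    /** Hypothetical stand-in for a retry decision: whether to retry, and after what delay. */
    static final class RetryDecision {
        final boolean retry;
        final long delayMillis;

        RetryDecision(boolean retry, long delayMillis) {
            this.retry = retry;
            this.delayMillis = delayMillis;
        }
    }

    /** Hypothetical, simplified policy interface (not Hadoop's RetryPolicy). */
    interface SimpleRetryPolicy {
        RetryDecision shouldRetry(Exception e, int retries);
    }

    /** Fix 1: a retry-forever policy that takes the sleep interval as a parameter. */
    static SimpleRetryPolicy retryForeverWithFixedSleep(long sleepTime, TimeUnit unit) {
        if (sleepTime < 0) {
            throw new IllegalArgumentException("sleepTime must be >= 0");
        }
        final long delayMillis = unit.toMillis(sleepTime);
        // Always retry, but wait delayMillis between attempts instead of 0.
        return (e, retries) -> new RetryDecision(true, delayMillis);
    }

    /** Fix 2: honor the configured retry interval when the wait time is -1 (forever). */
    static SimpleRetryPolicy createRetryPolicy(long rmConnectWaitMs, long retryIntervalMs) {
        if (retryIntervalMs <= 0) {
            throw new IllegalArgumentException("retryIntervalMs must be > 0");
        }
        if (rmConnectWaitMs == -1) {
            return retryForeverWithFixedSleep(retryIntervalMs, TimeUnit.MILLISECONDS);
        }
        // Simplified bounded case: retry until the total wait budget is used up.
        final long maxRetries = rmConnectWaitMs / retryIntervalMs;
        return (e, retries) -> new RetryDecision(retries < maxRetries, retryIntervalMs);
    }

    public static void main(String[] args) {
        // 30000 ms is an arbitrary value chosen for this example.
        SimpleRetryPolicy policy = createRetryPolicy(-1, 30000);
        RetryDecision d = policy.shouldRetry(new IOException("connection refused"), 1000);
        System.out.println("retry=" + d.retry + " delayMillis=" + d.delayMillis);
    }
}
```

With an interval in place, a caller that loops on shouldRetry sleeps between
attempts, so a persistent failure produces a bounded rate of log messages
rather than a tight loop.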
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)