[jira] [Created] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

Junping Du (JIRA) Mon, 10 Oct 2016 08:08:51 -0700

Junping Du created YARN-5718:
--------------------------------

             Summary: TimelineClient (and other places in YARN) shouldn't 
over-write HDFS client retry settings which could cause unexpected behavior
                 Key: YARN-5718
                 URL: https://issues.apache.org/jira/browse/YARN-5718
             Project: Hadoop YARN
          Issue Type: Bug
          Components: timelineclient, resourcemanager
            Reporter: Junping Du
            Assignee: Junping Du



In an HA cluster, we noticed that jobs failed after a NameNode (NN) failover 
because TimelineClient could not retry its connection against the proper NN. 
The cause is that YARN overwrites the HDFS client settings and hard-codes the 
retry policy to enabled, which conflicts with the NN failover case: there, the 
HDFS client should fail fast so that it can retry on the other NN.
YARN should not assume any particular retry policy for the HDFS client 
anywhere. It should stay consistent with the HDFS settings, which use 
different retry policies for different deployment cases. We should therefore 
clean up these hard-coded settings in YARN, including those in 
FileSystemTimelineWriter, FileSystemRMStateStore and FileSystemNodeLabelsStore.
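
To illustrate the problem described above, here is a minimal sketch. It is not 
the actual YARN code: plain dicts stand in for Hadoop Configuration objects, 
the helper names are hypothetical, and the "2000,500" spec value is only an 
illustrative assumption. The two property names are the HDFS client retry keys.

```python
# Plain dicts stand in for Hadoop Configuration objects (assumption for illustration).
RETRY_ENABLED = "dfs.client.retry.policy.enabled"
RETRY_SPEC = "dfs.client.retry.policy.spec"


def with_hardcoded_retry(cluster_conf, spec="2000,500"):
    """Hypothetical helper: the pattern the report objects to.

    Copies the cluster settings, then forces the retry policy on,
    regardless of what the deployment configured.
    """
    conf = dict(cluster_conf)
    conf[RETRY_ENABLED] = "true"
    conf[RETRY_SPEC] = spec  # illustrative value, not taken from the report
    return conf


def with_inherited_retry(cluster_conf):
    """Hypothetical helper: the proposed behavior.

    Copies the HDFS client settings without touching the retry keys.
    """
    return dict(cluster_conf)


# An HA deployment that wants the client to fail fast and fail over.
ha_conf = {RETRY_ENABLED: "false"}

forced = with_hardcoded_retry(ha_conf)
inherited = with_inherited_retry(ha_conf)

print(forced[RETRY_ENABLED])     # the deployment's choice was overwritten
print(inherited[RETRY_ENABLED])  # the deployment's choice is preserved
```

In the first case the client keeps retrying against the failed NN; in the 
second it follows whatever the HDFS deployment configured.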



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

