[ 
https://issues.apache.org/jira/browse/YARN-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-2673:
------------------------
    Attachment: YARN-2673-101714.patch

Hi [~zjshen], I've updated my patch according to your comments. I've also fixed 
a bug in the previous version: in the previous patch I confused "maxRetries" 
with "maxTries", so the retry filter issued one fewer attempt than intended. 

According to your comments:

1. Made retried, maxRetries and retryInterval @VisibleForTesting. 
bq. After retried is set to true first time. It is always true, which means 
it's not useful for asserting the second request.
This is a bug. retried should indicate whether a retry happened during the most 
recent Jersey request. I've fixed this in this patch by resetting retried every 
time a request is launched (that is, every time the client filter is called). 

2. Fixed. 

3. maxRetries can be -1 to indicate that there is no limit on the number of 
retries (described in TimelineJerseyRetryFilter). I've added a comment here to 
make this clearer (and also one in the original configuration). 

4. Fixed.

5. I think you raised a very valid point. I've removed this new API. 
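
To make points 1 and 3 concrete, here is a minimal sketch of the retry 
behavior described above. It is not the code in the patch: the class name 
RetrySketch, the handle method, the use of Supplier and the choice to retry on 
any RuntimeException are all assumptions made for illustration. Only the names 
retried, maxRetries and retryInterval and the meaning of -1 come from the 
discussion. With maxRetries = N the request is attempted at most N + 1 times, 
which is the maxRetries/maxTries distinction mentioned earlier.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the retry filter; not the patch's actual class.
class RetrySketch {
    final int maxRetries;        // -1 means no limit on the number of retries
    final long retryIntervalMs;  // wait between attempts, in milliseconds
    boolean retried;             // true only if the most recent request was retried

    RetrySketch(int maxRetries, long retryIntervalMs) {
        this.maxRetries = maxRetries;
        this.retryIntervalMs = retryIntervalMs;
    }

    // Assumption: any RuntimeException counts as a retriable failure.
    <T> T handle(Supplier<T> request) {
        retried = false;               // reset at the start of every request
        int retriesLeft = maxRetries;  // stays at -1 forever when unlimited
        while (true) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                if (retriesLeft == 0) {
                    throw e;           // initial attempt plus maxRetries retries used up
                }
                if (retriesLeft > 0) {
                    retriesLeft--;
                }
                retried = true;
                try {
                    Thread.sleep(retryIntervalMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
    }
}
```

Because retried is cleared on entry to handle, a test can assert on it after 
each request independently, which is the behavior the first review comment 
asked for. 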

> Add retry for timeline client put APIs
> --------------------------------------
>
>                 Key: YARN-2673
>                 URL: https://issues.apache.org/jira/browse/YARN-2673
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-2673-101414-1.patch, YARN-2673-101414-2.patch, 
> YARN-2673-101414.patch, YARN-2673-101714.patch
>
>
> Timeline client now does not handle the case gracefully when the server is 
> down. Jobs from distributed shell may fail due to ATS restart. We may need to 
> add some retry mechanisms to the client. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
