[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726097#comment-14726097
 ] 

Jian He commented on YARN-4087:
-------------------------------

bq. as there are no retries or explicit app-failures
The retries already happen internally before the final exception is thrown.
Right, the app will be stuck in its current state, since no notification is
sent back. But explicitly failing the app may be too harsh, since the app
itself can actually proceed without any impact. I think we can still notify
back that the store operation is done and let the app continue. We could also
print a warning message on the application page, something like "Application
is not persisted in state-store due to state-store error. Application will be
lost if RM restarted."
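
To make the proposal concrete, here is a minimal sketch of the behaviour I have in mind. It is not the actual RM code: StateStore, App, saveApp and the state names are hypothetical stand-ins used only for illustration, and the store is assumed to have exhausted its own internal retries before it throws.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these types are taken from the YARN code base. */
final class StoreFailureSketch {

    /** Stand-in for a state store that throws once its internal retries are exhausted. */
    interface StateStore {
        void storeApp(String appId) throws Exception;
    }

    /** Minimal application record: a state plus diagnostics shown on the application page. */
    static final class App {
        final String id;
        String state = "NEW_SAVING";
        final List<String> diagnostics = new ArrayList<>();

        App(String id) {
            this.id = id;
        }
    }

    static final String NOT_PERSISTED_WARNING =
        "Application is not persisted in state-store due to state-store error. "
            + "Application will be lost if RM restarted.";

    /**
     * Tries to persist the app. With failFast the store error is propagated;
     * otherwise a warning is recorded and the app is notified as if the
     * store operation had completed, so it does not get stuck.
     */
    static void saveApp(StateStore store, App app, boolean failFast) {
        try {
            store.storeApp(app.id);
        } catch (Exception e) {
            if (failFast) {
                throw new IllegalStateException("State store failed for " + app.id, e);
            }
            app.diagnostics.add(NOT_PERSISTED_WARNING);
        }
        // "Store operation is done" notification: the app moves on either way.
        app.state = "SUBMITTED";
    }

    public static void main(String[] args) {
        App app = new App("application_0001");
        // A store that always fails, to exercise the non-fail-fast path.
        saveApp(id -> { throw new java.io.IOException("store unavailable"); }, app, false);
        System.out.println(app.state + " " + app.diagnostics);
    }
}
```

With failFast set to true the same store failure surfaces as an exception instead, which is the behaviour this issue proposes to stop using by default.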

> Set YARN_FAIL_FAST to be false by default
> -----------------------------------------
>
>                 Key: YARN-4087
>                 URL: https://issues.apache.org/jira/browse/YARN-4087
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-4087.1.patch, YARN-4087.2.patch
>
>
> Increasingly, I feel setting this property to be false makes more sense 
> especially in production environment, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
