[ 
https://issues.apache.org/jira/browse/YARN-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320872#comment-14320872
 ] 

Hitesh Shah commented on YARN-3131:
-----------------------------------

[~jlowe] Referring to my earlier comment, does it make more sense to do the 
simple checks inline instead of doing them as part of the app state machine? 
The issue mainly stems from the fact that in Tez we start an AM and then 
submit work to it directly. When the AM is never launched, the underlying 
reason for the launch failure sometimes gets hidden. 

bq. YarnRunner "works" because it bothers to do one extra appreport after the 
app submission completes to verify the app is still in a non-failed/killed 
state.

We added a unit test for this and have seen it fail intermittently on a 
minicluster, because catching the failure on the first getAppReport() call is 
not reliable. Ref: TEZ-2058
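
To illustrate the kind of inline check being discussed, here is a minimal, 
self-contained sketch. It does not use the real YARN classes: AppState, 
awaitSubmission and scripted are hypothetical stand-ins, and the choice of 
which states count as "still waiting" is an assumption, not the actual patch. 
The point is only that polling until the app leaves its initial states, and 
failing fast on FAILED/KILLED, is more dependable than a single report call.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Supplier;

public class SubmitCheckSketch {
    // Local stand-in for YARN's application states; not the real enum.
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Assumption: states in which the submission outcome is not yet known.
    static final Set<AppState> WAITING =
        EnumSet.of(AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED);

    // States that mean the submission did not succeed.
    static final Set<AppState> REJECTED = EnumSet.of(AppState.FAILED, AppState.KILLED);

    // Polls reportState until the app leaves the WAITING states.
    // Throws if the app is FAILED/KILLED or if maxPolls is exhausted.
    static AppState awaitSubmission(Supplier<AppState> reportState, int maxPolls, long pollMillis) {
        for (int i = 0; i < maxPolls; i++) {
            AppState state = reportState.get();
            if (REJECTED.contains(state)) {
                throw new IllegalStateException(
                    "application entered " + state + " during submission");
            }
            if (!WAITING.contains(state)) {
                return state;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for submission", e);
            }
        }
        throw new IllegalStateException("timed out waiting for submission to settle");
    }

    // Demo helper: returns the given states in order, then repeats the last one.
    static Supplier<AppState> scripted(AppState... states) {
        Deque<AppState> queue = new ArrayDeque<>(Arrays.asList(states));
        return () -> queue.size() > 1 ? queue.poll() : queue.peek();
    }

    public static void main(String[] args) {
        // Normal case: the app is eventually accepted.
        System.out.println(awaitSubmission(
            scripted(AppState.NEW, AppState.SUBMITTED, AppState.ACCEPTED), 10, 1));

        // Failure case (e.g. a non-existent queue): the caller sees an exception.
        try {
            awaitSubmission(scripted(AppState.NEW_SAVING, AppState.FAILED), 10, 1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a real client the Supplier would wrap a getAppReport()-style call. The 
sketch deliberately leaves out diagnostics and timeouts tuned to the cluster.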





> YarnClientImpl should check FAILED and KILLED state in submitApplication
> ------------------------------------------------------------------------
>
>                 Key: YARN-3131
>                 URL: https://issues.apache.org/jira/browse/YARN-3131
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Chang Li
>            Assignee: Chang Li
>
> I just ran into an issue when submitting a job to a non-existent queue: 
> YarnClient raises no exception. Although the job was indeed submitted 
> successfully and simply failed immediately afterwards, it would be better if 
> YarnClient could handle this immediate-failure case the way YarnRunner does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
