[ https://issues.apache.org/jira/browse/YARN-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320712#comment-14320712 ]
Jason Lowe commented on YARN-3131:
----------------------------------
The problem with blocking until submission is complete is that it holds an IPC
server thread hostage while waiting for the scheduler to get around to
processing the submission request. If the scheduler is running behind or takes
a long time to process the submit request, we can starve the RM of IPC handler
threads and end up blocking other clients. That is why YARN application
submission is currently a two-phase process: the client first submits, then
runs a polling loop to check whether the app ended up in the right state. If we
want to make these potentially long-running calls synchronous, then we should
prioritize server-side asynchronous processing, such as that proposed by
HADOOP-11552.
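To make the two-phase shape concrete, here is a minimal sketch of "submit, then poll" from the client's side. It is a toy model, not Hadoop code: FakeRm, AppState, and submitAndPoll are hypothetical names invented for illustration, and the set of states treated as "still waiting" (NEW, NEW_SAVING) is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class TwoPhaseSubmitSketch {
    // Simplified state set for the sketch (assumption, not the full YARN enum).
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, FAILED, KILLED }

    /** Hypothetical stand-in for the RM: submit() returns at once, state advances later. */
    static final class FakeRm {
        final Deque<AppState> timeline;
        AppState current = AppState.NEW;
        int reportCalls = 0;

        FakeRm(AppState... futureStates) {
            this.timeline = new ArrayDeque<>(Arrays.asList(futureStates));
        }

        // Phase 1: a cheap call that only records the request; it never waits on the scheduler.
        void submit() {
            current = AppState.NEW;
        }

        // Each report call observes the next scripted state, simulating scheduler progress.
        AppState getState() {
            reportCalls++;
            if (!timeline.isEmpty()) {
                current = timeline.poll();
            }
            return current;
        }
    }

    // Phase 2: the client, not a server handler thread, does the waiting.
    static AppState submitAndPoll(FakeRm rm, int maxPolls, long sleepMillis) {
        rm.submit();
        AppState state = rm.getState();
        for (int i = 1; i < maxPolls
                && (state == AppState.NEW || state == AppState.NEW_SAVING); i++) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
            state = rm.getState();
        }
        return state;
    }

    public static void main(String[] args) {
        FakeRm rm = new FakeRm(AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED);
        System.out.println(submitAndPoll(rm, 10, 10L) + " after " + rm.reportCalls + " reports");
    }
}
```

Each server call in this model is short, so a slow scheduler only lengthens the client's loop and never pins a handler thread.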
YarnRunner "works" because it requests one extra application report after the
app submission completes, to verify that the app is still in a non-failed,
non-killed state. YarnClient doesn't do this check, which is why its callers
don't see a failure but the MapReduce client does.
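The difference between the two behaviors can be sketched as follows. This is a simplified model under stated assumptions, not the actual YarnRunner or YarnClient code: ReportSource, scripted, waitUntilSubmitted, and waitAndVerify are hypothetical names, and the scripted state sequence stands in for an app that is submitted to a bad queue and fails right afterwards.

```java
import java.util.Arrays;
import java.util.Iterator;

class PostSubmitCheckSketch {
    // Simplified state set for the sketch (assumption, not the full YARN enum).
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, FAILED, KILLED }

    /** Hypothetical report source; each call returns the next observed state. */
    interface ReportSource {
        AppState nextReport();
    }

    // Builds a source that replays the given states, then repeats the last one.
    static ReportSource scripted(AppState... states) {
        Iterator<AppState> it = Arrays.asList(states).iterator();
        AppState[] last = { AppState.NEW };
        return () -> {
            if (it.hasNext()) {
                last[0] = it.next();
            }
            return last[0];
        };
    }

    // Without the extra check: return as soon as the app leaves NEW/NEW_SAVING.
    static AppState waitUntilSubmitted(ReportSource rm, int maxPolls) {
        AppState s = rm.nextReport();
        for (int i = 1; i < maxPolls
                && (s == AppState.NEW || s == AppState.NEW_SAVING); i++) {
            s = rm.nextReport();
        }
        return s;
    }

    // With the extra check: fetch one more report and fail loudly on FAILED/KILLED.
    static AppState waitAndVerify(ReportSource rm, int maxPolls) {
        waitUntilSubmitted(rm, maxPolls);
        AppState s = rm.nextReport();
        if (s == AppState.FAILED || s == AppState.KILLED) {
            throw new IllegalStateException("Application reached " + s + " right after submission");
        }
        return s;
    }

    public static void main(String[] args) {
        AppState[] badQueue = {
            AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED, AppState.FAILED
        };
        System.out.println("no check: " + waitUntilSubmitted(scripted(badQueue), 10));
        try {
            waitAndVerify(scripted(badQueue), 10);
        } catch (IllegalStateException e) {
            System.out.println("with check: " + e.getMessage());
        }
    }
}
```

Note that a single extra report only catches failures that have already happened by the time it is fetched, so this narrows the window rather than closing it.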
> YarnClientImpl should check FAILED and KILLED state in submitApplication
> ------------------------------------------------------------------------
>
> Key: YARN-3131
> URL: https://issues.apache.org/jira/browse/YARN-3131
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
>
> I just ran into an issue: when a job is submitted to a non-existent queue,
> YarnClient raises no exception. Although the job is indeed submitted
> successfully and only fails immediately afterwards, it would be better if
> YarnClient handled this immediate-failure case the way YarnRunner does.