[ https://issues.apache.org/jira/browse/YARN-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320712#comment-14320712 ]
Jason Lowe commented on YARN-3131:
----------------------------------

The issue with blocking until submission is complete is that it holds an IPC server thread hostage while waiting for the scheduler to get around to processing the submission request. The concern is that if the scheduler is running behind, or takes a long time to process the submit request, we can starve the RM of IPC handler threads and end up blocking other clients. That's why YARN application submission is currently a two-phase process: first the submit call, then a polling loop in which the client checks whether the app ended up in the right state. If we want to make these potentially long-running calls synchronous, then we should prioritize server-side asynchronous processing such as that proposed by HADOOP-11552.

YarnRunner "works" because it bothers to fetch one extra application report after the app submission completes, to verify the app is still in a non-failed/killed state. YarnClient doesn't do this check, which is why its callers don't see a failure but the MapReduce client does.

> YarnClientImpl should check FAILED and KILLED state in submitApplication
> ------------------------------------------------------------------------
>
>                 Key: YARN-3131
>                 URL: https://issues.apache.org/jira/browse/YARN-3131
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Chang Li
>            Assignee: Chang Li
>
> Just ran into an issue when submitting a job to a non-existent queue:
> YarnClient raises no exception. Though the job is indeed submitted
> successfully and simply fails immediately afterward, it would be better if
> YarnClient handled the immediate-failure case like YarnRunner does.
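
For illustration, the two-phase flow discussed in the comment (submit, then poll, then treat FAILED or KILLED as a submission error) might be sketched as below. This is a minimal, self-contained sketch and not the actual YarnClientImpl or YarnRunner code: `SubmitPollSketch`, `AppState`, `StateSource`, and `waitForSubmission` are hypothetical names, and `AppState` is only a simplified stand-in for YARN's application states. A real client would obtain the state from an application report fetched from the RM.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.Set;

class SubmitPollSketch {
    /** Simplified stand-in for YARN's application states (hypothetical, for illustration). */
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    /** Hypothetical source of the current state; a real client would query the RM. */
    interface StateSource { AppState currentState(); }

    // States in which the submission has not been processed yet: keep polling.
    static final Set<AppState> WAITING =
            EnumSet.of(AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED);
    // States that mean the app died right after submission: report an error.
    static final Set<AppState> FAILED_TO_SUBMIT =
            EnumSet.of(AppState.FAILED, AppState.KILLED);

    /**
     * Second phase of submission: poll on the client side until the app leaves
     * the waiting states, so no server-side handler thread is held while waiting.
     */
    static AppState waitForSubmission(StateSource source, int maxPolls, long sleepMillis) {
        for (int i = 0; i < maxPolls; i++) {
            AppState state = source.currentState();
            if (FAILED_TO_SUBMIT.contains(state)) {
                throw new IllegalStateException(
                        "Application ended in state " + state + " right after submission");
            }
            if (!WAITING.contains(state)) {
                return state; // e.g. ACCEPTED or RUNNING
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
        throw new IllegalStateException("Gave up waiting for the submission to be processed");
    }

    public static void main(String[] args) {
        // Simulated state sequence for an app submitted to a valid queue.
        Iterator<AppState> good =
                Arrays.asList(AppState.NEW, AppState.SUBMITTED, AppState.ACCEPTED).iterator();
        System.out.println("valid queue: " + waitForSubmission(good::next, 10, 0));

        // Simulated state sequence for an app that fails immediately,
        // as with the non-existent queue described in the issue.
        Iterator<AppState> bad = Arrays.asList(AppState.NEW, AppState.FAILED).iterator();
        try {
            waitForSubmission(bad::next, 10, 0);
        } catch (IllegalStateException e) {
            System.out.println("bad queue: " + e.getMessage());
        }
    }
}
```

The design point is that the wait happens in the client's own loop, so a slow scheduler costs the caller time rather than tying up an RM IPC handler thread, while the extra FAILED/KILLED check gives the caller the error that YarnRunner's additional report check provides.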