[
https://issues.apache.org/jira/browse/YARN-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhijie Shen resolved YARN-477.
------------------------------
Resolution: Cannot Reproduce
> MiniYARNCluster: When container executor script fails to launch App Master,
> NM logs error, but Client doesn't get signaled to kill the job
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-477
> URL: https://issues.apache.org/jira/browse/YARN-477
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Eli Reisman
> Assignee: Zhijie Shen
>
> I have been porting Giraph to YARN (GIRAPH-13 is the issue). When I launch
> my App Master and the container command line runs it successfully, any
> failure in the App Master or my launched Giraph tasks is promptly reported to
> the client and ends my job run. However, if the command line sent to the app
> master container fails to launch it at all, the error exit code is not
> propagated. My client hangs with the job at containersUsed == 1 and state ==
> ACCEPTED for as long as you want to sit and wait before CTRL-C'ing your way
> out.
> Disclaimer: this could be my fault, but I wanted to throw it out there in
> case it's not. When this happens I also get no error logs, since the app
> master never launched, so I really have no visibility into why it failed to
> launch. I am sure it's not launching, but the client IS sending the app
> request and getting a container for my AM, and I see the command line run on
> the container in my logs. That's all.
> Thanks! If this is a dup or "won't fix" for some reason, let me know and
> sorry for wasting your time!
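
The hang described above (the client polling forever while the application
stays in ACCEPTED) can be bounded on the client side with a launch timeout.
The following is only a minimal sketch, not the behavior of YARN itself:
AmLaunchWatchdog, AppState, and awaitLaunch are hypothetical names, and the
Supplier stands in for whatever the client uses to read the application state
(in a real client this would presumably wrap something like
YarnClient.getApplicationReport(appId).getYarnApplicationState()).

```java
import java.util.function.Supplier;

public class AmLaunchWatchdog {
    // Hypothetical stand-in for the YARN application states relevant here.
    enum AppState { ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    /**
     * Polls stateSource until the application leaves ACCEPTED or the timeout
     * elapses. Returns the last observed state; a result of ACCEPTED means
     * the App Master never started within the allowed time.
     */
    static AppState awaitLaunch(Supplier<AppState> stateSource,
                                long timeoutMillis,
                                long pollMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        AppState state = stateSource.get();
        // Compare via subtraction so the check is safe against nanoTime overflow.
        while (state == AppState.ACCEPTED && deadline - System.nanoTime() > 0) {
            Thread.sleep(pollMillis);
            state = stateSource.get();
        }
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated state source that never leaves ACCEPTED, as in the report.
        AppState last = awaitLaunch(() -> AppState.ACCEPTED, 200, 50);
        if (last == AppState.ACCEPTED) {
            System.out.println("App Master did not start in time; client should kill the application.");
        }
    }
}
```

With a check like this, a client could call its kill routine (for example
YarnClient.killApplication(appId)) instead of waiting indefinitely, although
it still would not learn why the launch failed.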