Eli Reisman created YARN-477:
--------------------------------

             Summary: When default container executor fails right away, at the 
CLI launching our App Master, Client doesn't always get the signal to kill the 
job
                 Key: YARN-477
                 URL: https://issues.apache.org/jira/browse/YARN-477
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Eli Reisman


I have been porting Giraph to YARN (see GIRAPH-13). When I launch my App 
Master and the container command line starts it successfully, any failure in 
the App Master or in my launched Giraph tasks is promptly reported to the 
Client and ends my job run. However, if the command line sent to the App Master 
container fails to launch it at all, the error exit code does not propagate. My 
client hangs with the job at containersUsed == 1 and state == ACCEPTED for as 
long as I am willing to wait before hitting CTRL-C.
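
As a client-side stopgap for the hang described above, a watchdog around the 
polling loop could look roughly like the sketch below. This is not YARN API 
code: get_state and kill_app are hypothetical callables standing in for 
whatever the client uses to fetch the application state and to kill the 
application, and the state strings and the 60-second default are assumptions 
chosen for illustration.

```python
import time
from typing import Callable, Optional

# Assumed state names; only ACCEPTED is taken from the report above.
ACCEPTED = "ACCEPTED"
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def wait_for_app(
    get_state: Callable[[], str],      # hypothetical: returns the current app state
    kill_app: Callable[[], None],      # hypothetical: asks the cluster to kill the app
    accepted_timeout_s: float = 60.0,  # assumed limit for sitting in ACCEPTED
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal state; kill the app if it stays in ACCEPTED too long."""
    accepted_since: Optional[float] = None
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if state == ACCEPTED:
            now = clock()
            if accepted_since is None:
                # First time we see ACCEPTED: start the watchdog timer.
                accepted_since = now
            elif now - accepted_since >= accepted_timeout_s:
                # The AM never started: give up instead of hanging forever.
                kill_app()
                return "KILLED_BY_CLIENT_TIMEOUT"
        else:
            # Any other state (e.g. running) resets the timer.
            accepted_since = None
        sleep(poll_interval_s)
```

This only works around the symptom on the client. It does not recover the 
container's exit code, which is what this issue is actually about.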

Disclaimer: this could be my fault, but I wanted to raise it in case it's not. 
When this happens I also get no error logs, since the App Master never 
launched, so I have no visibility into why the launch failed. I am sure it's 
not launching, but the client IS sending the app request and getting a 
container for my AM, and I see the command line run on the container in my 
logs. That's all.

Thanks! If this is a dup or "won't fix" for some reason, let me know and sorry 
for wasting your time!


