Aditya Acharya created YARN-1630:
------------------------------------
Summary: Unbounded waiting for response in YarnClientImpl.java
causes thread to hang forever
Key: YARN-1630
URL: https://issues.apache.org/jira/browse/YARN-1630
Project: Hadoop YARN
Issue Type: Bug
Components: client
Affects Versions: 2.2.0
Reporter: Aditya Acharya
Assignee: Aditya Acharya
Attachments: diff.txt
I ran a long-running MR2 application and killed it programmatically using a
YarnClient. The app was killed, but the client hung forever, spamming the logs
with the message "Watiting for application application_1389036507624_0018 to
be killed."
The RM log indicated that the app had indeed transitioned from RUNNING to
KILLED, but for some reason subsequent responses to the kill RPC did not
indicate that the app had been terminated.
I tracked this down to YarnClientImpl.java, and though I was unable to
reproduce the bug, I wrote a patch to introduce a bound on the number of times
that YarnClientImpl retries the RPC before giving up.
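The idea behind the bound can be sketched roughly as follows. This is not the
attached diff.txt and does not use the real YarnClientImpl code; the class name
BoundedKillWait, the method waitUntilKilled, and its parameters are
hypothetical, and the isKilled callback stands in for whatever check the client
performs on each RPC response.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper illustrating a bounded wait; not part of Hadoop. */
public class BoundedKillWait {

    /**
     * Polls isKilled at most maxAttempts times, sleeping sleepMillis between
     * attempts. Returns true as soon as the check succeeds, and false if the
     * attempts are exhausted or the waiting thread is interrupted.
     */
    public static boolean waitUntilKilled(BooleanSupplier isKilled,
                                          int maxAttempts,
                                          long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isKilled.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        // Give up instead of looping forever.
        return false;
    }
}
```

A caller would treat a false return as "the kill could not be confirmed" and
report an error or time out, rather than keep logging and retrying
indefinitely.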