[ https://issues.apache.org/jira/browse/YARN-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037055#comment-14037055 ]
Vinod Kumar Vavilapalli commented on YARN-2175:
-----------------------------------------------

bq. there is no way to kill a task if it is stuck in these states.

YARN-1619/YARN-445 should let you do this manually if not automatically.

> Container localization has no timeouts and tasks can be stuck there for a long time
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-2175
>                 URL: https://issues.apache.org/jira/browse/YARN-2175
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>
> There are no timeouts that can be used to limit the time taken by various
> container startup operations. Localization, for example, could take a long time,
> and there is no way to kill a task if it is stuck in these states. These delays may
> have nothing to do with the task itself and could be an issue within the
> platform.
> Ideally there should be configurable time limits for the various states within the
> NodeManager. The RM does not care about most of these; they matter only
> between the AM and the NM. We can start by making these global
> configurable defaults, and in the future we can make it fancier by letting the AM
> override them in the start container request.
> This JIRA will be used to limit localization time, and we can open others if we
> feel we need to limit other operations.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
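
The proposal in the description (global per-state time limits, later overridable per container by the AM) could be sketched roughly as below. This is only an illustration under stated assumptions, not NodeManager code: `GLOBAL_STATE_TIMEOUTS_SEC`, `ContainerRecord`, `effective_timeout`, `is_timed_out`, and the 600-second value are all hypothetical and do not correspond to any existing YARN class or configuration key.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical NodeManager-wide defaults, in seconds, keyed by container state.
# The 600-second figure is an arbitrary placeholder, not a YARN default.
GLOBAL_STATE_TIMEOUTS_SEC: Dict[str, float] = {"LOCALIZING": 600.0}


@dataclass
class ContainerRecord:
    container_id: str
    state: str
    state_entered_at: float  # timestamp (seconds) when the current state began
    # Hypothetical per-container overrides, as an AM might pass in a start request.
    timeout_overrides_sec: Dict[str, float] = field(default_factory=dict)


def effective_timeout(c: ContainerRecord) -> Optional[float]:
    """Per-container override wins, then the global default; None means no limit."""
    if c.state in c.timeout_overrides_sec:
        return c.timeout_overrides_sec[c.state]
    return GLOBAL_STATE_TIMEOUTS_SEC.get(c.state)


def is_timed_out(c: ContainerRecord, now: float) -> bool:
    """True if the container has stayed in its current state longer than allowed."""
    limit = effective_timeout(c)
    return limit is not None and (now - c.state_entered_at) > limit
```

A periodic monitor could call `is_timed_out` for each tracked container and kill those that exceed their limit. States without a configured limit are never timed out, which keeps the current behaviour as the default.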