Rohith Sharma K S commented on YARN-2487:

Hi [~Naganarasimha Garla], it is worth keeping the application if it is 
running. The problem is that YARN currently does not identify the reason an 
application is not progressing, and there can be several such reasons. So I 
feel this could be handled if there were a mechanism to get the reason an 
application is not progressing. I believe YARN-4091 is one such issue: it is 
trying to get more debug information and plans to expose a REST interface for 
getting per-application progress information.

> Need to support timeout of AM When no containers are assigned to it for a 
> defined period
> ----------------------------------------------------------------------------------------
>                 Key: YARN-2487
>                 URL: https://issues.apache.org/jira/browse/YARN-2487
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>  There are some scenarios where an AM will not get containers and waits 
> indefinitely. We faced one such scenario, which causes the applications to 
> hang: 
> Consider a cluster setup which has 2 NMs, each with 8 GB of resource, 
> and 2 applications (MR2) launched in the default queue, where each AM 
> takes 2 GB.
> Each AM is placed on a different NM. Now each AM requests a container 
> with 7 GB of memory.
> As only 6 GB is available on each NM, both applications hang 
> forever.
> To avoid such scenarios I would like to propose a 
> generic timeout feature for all AMs in YARN, such that if no containers are 
> assigned to an application for a defined period, then YARN can time out the 
> application attempt.
> The default can be set to 0, in which case the RM will not time out the app 
> attempt, and the user can set their own timeout when submitting the application.
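
The scenario and the proposed rule in the description above can be sketched 
roughly as follows. This is only an illustration under stated assumptions, not 
YARN code: the class name, method names, and the millisecond timeout parameter 
are all hypothetical, and the only behaviour taken from the description is that 
a timeout of 0 disables the check.

```java
// Hypothetical sketch; none of these names exist in YARN.
class AmContainerTimeoutSketch {

    /** True if a node with freeMb of memory can satisfy a request of requestMb. */
    public static boolean canAllocate(long freeMb, long requestMb) {
        return requestMb <= freeMb;
    }

    /**
     * True if the app attempt should be timed out because no container has
     * been assigned for at least timeoutMs. A timeout of 0 (the proposed
     * default) disables the check, so the attempt is never timed out.
     */
    public static boolean shouldTimeOut(long lastAssignmentMs, long nowMs, long timeoutMs) {
        if (timeoutMs <= 0) {
            return false; // timeout disabled
        }
        return nowMs - lastAssignmentMs >= timeoutMs;
    }

    public static void main(String[] args) {
        long nodeCapacityMb = 8 * 1024; // each NM has 8 GB
        long amMb = 2 * 1024;           // each AM takes 2 GB
        long requestMb = 7 * 1024;      // each AM asks for a 7 GB container
        long freeMb = nodeCapacityMb - amMb; // 6 GB left on each NM

        // Neither NM can ever satisfy the request, so both apps wait forever.
        System.out.println("canAllocate = " + canAllocate(freeMb, requestMb));

        // With an assumed 5 minute timeout, 10 minutes without containers
        // would end the attempt; with the default of 0 it would not.
        long tenMinutesMs = 10 * 60 * 1000L;
        System.out.println("timeout=5min: " + shouldTimeOut(0, tenMinutesMs, 5 * 60 * 1000L));
        System.out.println("timeout=0:    " + shouldTimeOut(0, tenMinutesMs, 0));
    }
}
```

In the example cluster each NM has 6 GB free against a 7 GB request, so the 
allocation check fails on both nodes and only a non-zero timeout ends the wait.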

This message was sent by Atlassian JIRA
