[ 
https://issues.apache.org/jira/browse/YARN-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718504#comment-13718504
 ] 

Omkar Vinit Joshi commented on YARN-389:
----------------------------------------

[~zjshen] [~bikassaha] I think we should reject problematic requests at the 
allocate call, but not after they have been accepted, since rejecting them 
later would cause problems.
* Today the allocate call only rejects a request if it asks for more than the 
cluster has; we do not validate how much a single container needs in order to 
run. I think we should add that check (SchedulerUtils#validateResourceRequest()). 
Thoughts?
* We cannot reject requests once they have been accepted. How would the AM find 
out which requests were rejected later? Is there any way to inform the AM about 
requests that were accepted earlier but are now rejected? Another thing to 
consider is that a NodeManager with a large amount of resources may go down and 
come back within a short span (a node reconnect, or a node removed and added 
back after a very short time). In either case we should not reject a request 
that was already accepted; large jobs will definitely suffer if a few nodes 
restart within a very short span. Thoughts?
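
To make the first point concrete, here is a minimal sketch of a per-container 
check at allocate time. It is not the actual SchedulerUtils code: the class 
ContainerRequestCheck, its methods, and the use of plain megabyte values are 
hypothetical names chosen for illustration only. It shows a request that is 
smaller than the cluster total but still cannot fit on any single node.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these names are Hadoop APIs.
final class ContainerRequestCheck {

    /** Returns true if at least one node is large enough to host the container. */
    static boolean fitsOnSomeNode(long requestedMb, List<Long> nodeCapacitiesMb) {
        for (long capacityMb : nodeCapacitiesMb) {
            if (requestedMb <= capacityMb) {
                return true;
            }
        }
        return false;
    }

    /** Rejects the request up front, at allocate time, if no single node can host it. */
    static void validateAtAllocate(long requestedMb, List<Long> nodeCapacitiesMb) {
        if (requestedMb <= 0) {
            throw new IllegalArgumentException("Requested memory must be positive");
        }
        if (!fitsOnSomeNode(requestedMb, nodeCapacitiesMb)) {
            throw new IllegalArgumentException(
                "No single node can host a container of " + requestedMb + " MB");
        }
    }

    public static void main(String[] args) {
        // Three 1 GB nodes: 3072 MB in total, but at most 1024 MB per container.
        List<Long> nodes = Arrays.asList(1024L, 1024L, 1024L);

        long totalMb = 0;
        for (long capacityMb : nodes) {
            totalMb += capacityMb;
        }

        long requestedMb = 2048;
        System.out.println("Within cluster total: " + (requestedMb <= totalMb));
        System.out.println("Fits on some node: " + fitsOnSomeNode(requestedMb, nodes));
    }
}
```

Which capacity figure to check against is left open here. As the second point 
notes, nodes can disappear and come back quickly, so a check based on the 
currently registered nodes could be too strict.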
                
> Infinitely assigning containers when the required resource exceeds the 
> cluster's absolute capacity
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-389
>                 URL: https://issues.apache.org/jira/browse/YARN-389
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Omkar Vinit Joshi
>
> I ran the wordcount example on branch-2 and trunk, with 
> yarn.nodemanager.resource.memory-mb set to 1G and 
> yarn.app.mapreduce.am.resource.mb set to 1.5G. The ResourceManager is 
> therefore expected to assign a 2G container for the AM. However, the 
> NodeManager does not have enough memory to host the container. The problem is 
> that the assignment operation is repeated indefinitely when it cannot be 
> accomplished. Logs follow.
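
The description does not say why a 1.5G request becomes a 2G container. One 
plausible reading is that the request is rounded up to a multiple of the 
scheduler's minimum allocation; the sketch below assumes a 1024 MB minimum, 
which the description does not state. The class and method names are 
hypothetical.

```java
// Hypothetical illustration of rounding a memory request up to a multiple of
// the minimum allocation. The 1024 MB minimum is an assumption.
final class MemoryNormalization {

    /** Rounds requestedMb up to the nearest multiple of minAllocationMb. */
    static long roundUp(long requestedMb, long minAllocationMb) {
        return ((requestedMb + minAllocationMb - 1) / minAllocationMb) * minAllocationMb;
    }

    public static void main(String[] args) {
        long amRequestMb = 1536;     // the 1.5G AM request from the description
        long minAllocationMb = 1024; // assumed minimum allocation
        long nodeMemoryMb = 1024;    // the 1G NodeManager from the description

        long containerMb = roundUp(amRequestMb, minAllocationMb);
        System.out.println("Container size (MB): " + containerMb);
        System.out.println("Fits on the node: " + (containerMb <= nodeMemoryMb));
    }
}
```

Under that assumption the container is 2048 MB, which can never fit on a 
1024 MB node.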

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
