[ 
https://issues.apache.org/jira/browse/YARN-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718673#comment-13718673
 ] 

Hitesh Shah commented on YARN-389:
----------------------------------

The static limit is there for a reason. No application should ask for a 
container above a certain limit as defined by the admins. For example, if most 
nodes in a cluster have 4 GB of resources, the admins can set the cap to 4 GB 
to ensure that the cluster stays healthy even if all the large nodes (those 
with more than 4 GB) disappear. 
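
To make the static check concrete, here is a minimal sketch. It is not the 
actual ResourceManager code; ContainerRequest, exceeds_static_cap and 
MAX_ALLOCATION_MB are hypothetical names, and the 4 GB value is just the 
example above (in a real cluster the cap would come from the admin-set 
scheduler configuration).

```python
from dataclasses import dataclass

# Hypothetical admin-defined cap, taken from the 4 GB example above.
MAX_ALLOCATION_MB = 4096


@dataclass(frozen=True)
class ContainerRequest:
    """Illustrative stand-in for a container ask; not a YARN class."""
    memory_mb: int


def exceeds_static_cap(request: ContainerRequest,
                       max_allocation_mb: int = MAX_ALLOCATION_MB) -> bool:
    """Return True if the request is above the static, admin-defined cap.

    The check depends only on configuration, never on which nodes are
    currently alive, so the answer stays the same if large nodes disappear.
    """
    return request.memory_mb > max_allocation_mb


if __name__ == "__main__":
    print(exceeds_static_cap(ContainerRequest(memory_mb=6144)))
    print(exceeds_static_cap(ContainerRequest(memory_mb=4096)))
```

A request exactly at the cap is accepted; only requests strictly above it are 
rejected.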

The issue at hand is a scheduling/allocation problem:
  - can this allocation request be fulfilled? 
     - can it be fulfilled now?
     - can it be fulfilled within a short window?
     - can it ever be fulfilled? 
  - When is an allocation request deemed to be unfulfillable?
     - is this based on static configuration?
     - is this based on a single snapshot of the dynamic view of the cluster?
     - is this based on snapshots over a period of time?
  - If it is based on time, how is time defined?
     - clock time?
     - number of rounds of heartbeats from all healthy nodes in the cluster?
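
One possible answer to the questions above, sketched under stated assumptions: 
time is counted in heartbeat rounds, each round yields one snapshot of the 
cluster, and a request is declared unfulfillable only if no node was large 
enough in any of the last max_rounds snapshots. Verdict, classify and the 
snapshot layout are all hypothetical; this illustrates the policy question and 
is not a description of what any YARN scheduler does.

```python
from enum import Enum
from typing import List, Tuple


class Verdict(Enum):
    FULFILLABLE_NOW = "fulfillable_now"
    PENDING = "pending"              # might be fulfilled later
    UNFULFILLABLE = "unfulfillable"  # no node was ever big enough


# One snapshot per heartbeat round, oldest first:
# (largest free memory on any node, largest total capacity of any node), in MB.
Snapshot = Tuple[int, int]


def classify(request_mb: int, snapshots: List[Snapshot],
             max_rounds: int) -> Verdict:
    """Classify a request from snapshots taken over heartbeat rounds."""
    if not snapshots:
        return Verdict.PENDING  # no view of the cluster yet

    latest_free_mb, _ = snapshots[-1]
    if request_mb <= latest_free_mb:
        return Verdict.FULFILLABLE_NOW

    recent = snapshots[-max_rounds:]
    # Give up only after a full window in which no node could ever fit it.
    if len(recent) >= max_rounds and all(request_mb > cap for _, cap in recent):
        return Verdict.UNFULFILLABLE
    return Verdict.PENDING


if __name__ == "__main__":
    # A 2 GB ask against nodes that top out at 1 GB for three rounds.
    print(classify(2048, [(1024, 1024)] * 3, max_rounds=3))
```

Using a window of rounds rather than a single snapshot avoids rejecting a 
request just because large nodes were briefly missing.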

How is the application informed of the necessary information that it needs to 
make a decision? Information could be:
  - your request could not be fulfilled
  - part of your request could not be fulfilled 
    - the reason why it could not be fulfilled
  - the current view of the cluster such as max available container size
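
As a rough illustration of what such information could look like when handed 
back to the application, here is a sketch. AllocationFeedback and 
build_feedback are hypothetical names, not part of any existing YARN protocol 
record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AllocationFeedback:
    """Hypothetical feedback record returned to the application."""
    requested: int                    # containers asked for
    unfulfilled: int                  # containers that could not be allocated
    reason: Optional[str]             # why they could not be allocated
    max_available_container_mb: int   # current view of the cluster

    @property
    def partially_fulfilled(self) -> bool:
        return 0 < self.unfulfilled < self.requested


def build_feedback(requested: int, allocated: int, request_mb: int,
                   max_available_container_mb: int) -> AllocationFeedback:
    """Summarise the outcome of one allocation request."""
    unfulfilled = requested - allocated
    reason = None
    if unfulfilled > 0:
        if request_mb > max_available_container_mb:
            reason = "requested size exceeds max available container size"
        else:
            reason = "not enough free capacity at the moment"
    return AllocationFeedback(requested, unfulfilled, reason,
                              max_available_container_mb)


if __name__ == "__main__":
    print(build_feedback(requested=1, allocated=0, request_mb=2048,
                         max_available_container_mb=1024))
```

With the reason and the max available container size in hand, the application 
can decide for itself whether to wait, shrink the request, or fail.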
  


                
> Infinitely assigning containers when the required resource exceeds the 
> cluster's absolute capacity
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-389
>                 URL: https://issues.apache.org/jira/browse/YARN-389
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Omkar Vinit Joshi
>
> I've run wordcount example on branch-2 and trunk. I've set 
> yarn.nodemanager.resource.memory-mb to 1G and 
> yarn.app.mapreduce.am.resource.mb to 1.5G. Therefore, resourcemanager is to 
> assign a 2G AM container for AM. However, the nodemanager doesn't have enough 
> memory to assign the container. The problem is that the assignment operation 
> will be repeated infinitely, if the assignment cannot be accomplished. Logs 
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
