[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083974#comment-15083974 ]
Jason Lowe commented on YARN-1011:
----------------------------------

bq. Tasks are incorrectly over-allocated. Will never use the resources they ask for and hence we can safely run additional opportunistic containers. So this feature is used to compensate for poorly configured applications. Probably a valid scenario but is it common?

In my experience this is fairly common. Users tend to twiddle with config values until something works, and then they don't bother to revisit them until there's a problem. And it's easier to over-allocate than to spend the time to carefully tune the task size.

Even if the user is interested in tuning, they can't always tune optimally. Some examples are data skew or other task-specific issues where a few tasks need a lot of memory but the vast majority of the others do not. Many frameworks only allow the task sizes to be configured as a group, so the user has to run all the tasks in the group with the worst-case container size even though most of them don't need it.

Pig on MapReduce is another example. A script can spawn multiple jobs, but the user can only configure the memory settings once in the script, and they apply to all jobs launched by the script. Therefore the user has to set them to the worst-case size across all the script's jobs, and all but one of the jobs run with oversized map containers.

> [Umbrella] Schedule containers based on utilization of currently allocated
> containers
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-1011
>                 URL: https://issues.apache.org/jira/browse/YARN-1011
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Arun C Murthy
>         Attachments: yarn-1011-design-v0.pdf, yarn-1011-design-v1.pdf
>
>
> Currently RM allocates containers and assumes resources allocated are
> utilized.
> RM can, and should, get to a point where it measures utilization of allocated
> containers and, if appropriate, allocate more (speculative?) containers.
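
To put rough numbers on the worst-case sizing argument in the comment, here is a minimal Python sketch. It is not YARN code and does not reflect the design in the attached documents; the `Container` class, the helper functions, the 20% headroom, and all memory figures are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Container:
    # Hypothetical record, not a YARN type.
    allocated_mb: int  # memory reserved for the container
    used_mb: int       # memory the container is actually using


def unused_allocated_mb(containers):
    """Total memory that is allocated but currently idle."""
    return sum(max(c.allocated_mb - c.used_mb, 0) for c in containers)


def opportunistic_slots(containers, container_mb, headroom_percent=20):
    """How many extra containers of container_mb fit in the idle memory,
    after holding back headroom_percent of it as a safety margin
    (the margin is an assumption of this sketch)."""
    usable_mb = unused_allocated_mb(containers) * (100 - headroom_percent) // 100
    return usable_mb // container_mb


# Ten tasks configured as a group at a worst-case 4096 MB.
# Only one skewed task uses that much; the other nine use 1024 MB.
tasks = [Container(4096, 4096)] + [Container(4096, 1024) for _ in range(9)]

idle_mb = unused_allocated_mb(tasks)
slots = opportunistic_slots(tasks, container_mb=1024)

print(f"idle: {idle_mb} MB, extra 1024 MB containers: {slots}")
```

In this made-up case, nine of the ten containers each leave 3072 MB idle, which is the kind of slack a utilization-aware scheduler could in principle hand out to additional containers.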
-- This message was sent by Atlassian JIRA (v6.3.4#6332)