Karthik Kambatla commented on YARN-1011:

Dropping a quick note here. (Traveling - will answer other comments tomorrow.)

bq. How do we determine that the perf is slower? The CPU would never exceed 
100% even under over-allocation.
This is one of the reasons I was proposing a max threshold that is less than 1 
:) If utilization reaches 100%, we clearly know there is contention. Since we 
measure resource utilization in resource-seconds (if not, we should update it 
to do so), bursty spikes alone wouldn't push utilization over 100% - so in 
practice we shouldn't see a utilization greater than 100%.

bq. Can we make this per-resource? (80% memory, 120% CPU)?
I am open to per-resource configuration. That said, I am not too keen on it, 
especially if my above comment on utilization never going over 100% holds. 
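
To make the threshold idea concrete, here is a minimal sketch of the check 
being discussed - whether a node has enough measured headroom to accept 
another opportunistic container, with per-resource thresholds below 1. All 
class and method names are hypothetical, not actual YARN APIs:

```java
// Hypothetical sketch: decide whether a node may accept another
// opportunistic container, given measured utilization and per-resource
// max thresholds below 1.0.
public class OverAllocationCheck {
    private final double memoryThreshold; // e.g. 0.8 -> 80% of node memory
    private final double cpuThreshold;    // e.g. 0.8 -> 80% of node vcores

    public OverAllocationCheck(double memoryThreshold, double cpuThreshold) {
        this.memoryThreshold = memoryThreshold;
        this.cpuThreshold = cpuThreshold;
    }

    /**
     * Utilizations are fractions of node capacity, averaged over the
     * measurement window (resource-seconds), so bursty spikes alone
     * should not push them past 1.0.
     */
    public boolean canOverAllocate(double memoryUtilization,
                                   double cpuUtilization) {
        return memoryUtilization < memoryThreshold
            && cpuUtilization < cpuThreshold;
    }

    public static void main(String[] args) {
        OverAllocationCheck check = new OverAllocationCheck(0.8, 0.8);
        System.out.println(check.canOverAllocate(0.5, 0.6)); // headroom on both
        System.out.println(check.canOverAllocate(0.9, 0.6)); // memory contended
    }
}
```

With a single shared threshold the two fields would collapse into one, which 
is the trade-off raised above.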

bq. Tasks are incorrectly over-allocated. Will never use the resources they ask 
for and hence we can safely run additional opportunistic containers. So this 
feature is used to compensate for poorly configured applications. Probably a 
valid scenario but is it common?
It is quite common for folks to borrow a Hive or Pig script from a colleague. 

> [Umbrella] Schedule containers based on utilization of currently allocated 
> containers
> -------------------------------------------------------------------------------------
>                 Key: YARN-1011
>                 URL: https://issues.apache.org/jira/browse/YARN-1011
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Arun C Murthy
>         Attachments: yarn-1011-design-v0.pdf, yarn-1011-design-v1.pdf
> Currently RM allocates containers and assumes resources allocated are 
> utilized.
> RM can, and should, get to a point where it measures utilization of allocated 
> containers and, if appropriate, allocate more (speculative?) containers.
