Aditya Kishore commented on YARN-2791:

I can see how YARN-2139 and this JIRA could be seen as related, especially 
given the terse summary of this JIRA; however, they address two different 
concerns. YARN-2139 is about disk resource scheduling isolation and throttling 
at execution time, while this one concerns the capacity planning/resource 
allocation phase.

So, either 1) this JIRA could continue on its own with its own design 
discussion here since the concerns are different from those discussed on 
YARN-2139, or 2) we widen the scope of YARN-2139 and add this as a sub-task.

Since I see a clear separation of concerns, I would prefer the first option, 
but I would be okay with the second too.

> Add Disk as a resource for scheduling
> -------------------------------------
>                 Key: YARN-2791
>                 URL: https://issues.apache.org/jira/browse/YARN-2791
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: scheduler
>    Affects Versions: 2.5.1
>            Reporter: Swapnil Daingade
>            Assignee: Yuliya Feldman
> Currently, the number of disks present on a node is not considered a factor 
> while scheduling containers on that node. Having a large amount of memory on 
> a node can lead to a high number of containers being launched on that node, all 
> of which compete for I/O bandwidth. This multiplexing of I/O across 
> containers can lead to slower overall progress and sub-optimal resource 
> utilization as containers starved for I/O bandwidth hold on to other 
> resources like CPU and memory. This problem can be solved by considering disk 
> as a resource and including it in deciding how many containers can be 
> concurrently run on a node.

This message was sent by Atlassian JIRA
