Karthik Kambatla commented on YARN-2139:

bq. currently vdisks is counting the number of physical drives present on the 
We see vdisks as a multiple of the number of physical disks on the box. Again, 
this is just one way to model it, and we can add more ways to share disk 
resources in the future. 

bq. Should we consider evaluating a change in this policy that gives 1 local 
dir to a container with 1 vdisk? This way a machine with 6 disks (and 6 
vdisks) would have 6 tasks running, each with its own "dedicated" disk. 
Good point. We were thinking of giving the AM the option to choose the amount 
of disk IO parallelism at the time of launching the container, as part of the 
spindle locality work. I see AMs wanting to either (1) pick a single local 
directory for guaranteed performance or (2) stripe accesses across multiple 
disks for potentially higher throughput based on other work on the node.

Initially, we could provide a global config for all containers: whether a 
container's vdisks should span the fewest or the most disks. 
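
To make the two options concrete, here is a rough sketch of how such a global 
policy might pick local dirs for a container. This is only an illustration, 
not the prototype patch: SpanPolicy, node_vdisks, pick_local_dirs and the 
/dataN paths are all hypothetical names, and it assumes one local dir per 
physical disk and an even split of vdisks across disks.

```python
# Hypothetical sketch; none of these names exist in YARN.
from enum import Enum


class SpanPolicy(Enum):
    FEWEST = "fewest"  # pack a container's vdisks onto as few disks as possible
    MOST = "most"      # stripe a container's vdisks across as many disks as possible


def node_vdisks(physical_disks, multiplier):
    """vdisks advertised by a node: a multiple of its physical disks."""
    return physical_disks * multiplier


def pick_local_dirs(free_vdisks, requested, policy):
    """Choose local dirs for a container asking for `requested` vdisks.

    free_vdisks maps a local dir (assumed: one per physical disk) to the
    number of vdisks still free on that disk. Returns a dict mapping each
    chosen local dir to the number of vdisks taken from it.
    """
    if requested <= 0:
        raise ValueError("requested vdisks must be positive")
    if requested > sum(free_vdisks.values()):
        raise ValueError("not enough free vdisks on this node")

    allocation = {}
    remaining = requested
    if policy is SpanPolicy.FEWEST:
        # Drain the disks with the most free vdisks first, so the request
        # lands on as few physical disks as possible.
        for d in sorted(free_vdisks, key=lambda d: (-free_vdisks[d], d)):
            if remaining == 0:
                break
            take = min(free_vdisks[d], remaining)
            if take > 0:
                allocation[d] = take
                remaining -= take
    else:
        # Round-robin one vdisk at a time, so the request touches as many
        # physical disks as possible.
        left = dict(free_vdisks)
        while remaining > 0:
            for d in sorted(left):
                if remaining == 0:
                    break
                if left[d] > 0:
                    left[d] -= 1
                    allocation[d] = allocation.get(d, 0) + 1
                    remaining -= 1
    return allocation
```

With 3 disks and a multiplier of 2, a container asking for 2 vdisks would get 
a single local dir under FEWEST and two local dirs under MOST. A container 
asking for 1 vdisk gets exactly one local dir under either policy, which 
matches the "dedicated" disk case above.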

> [Umbrella] Support for Disk as a Resource in YARN 
> --------------------------------------------------
>                 Key: YARN-2139
>                 URL: https://issues.apache.org/jira/browse/YARN-2139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wei Yan
>         Attachments: Disk_IO_Isolation_Scheduling_3.pdf, 
> Disk_IO_Scheduling_Design_1.pdf, Disk_IO_Scheduling_Design_2.pdf, 
> YARN-2139-prototype-2.patch, YARN-2139-prototype.patch
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 

This message was sent by Atlassian JIRA
