[ 
https://issues.apache.org/jira/browse/YARN-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799970#comment-13799970
 ] 

Bikas Saha commented on YARN-1324:
----------------------------------

When does MR use multiple disks in the same task/container? Isn't the map output 
written to a single indexed partition file?

Requiring apps to specify the number of disks for a container is also a viable 
solution. It can be done in a backward-compatible manner by changing MR to 
request multiple disks and leaving the default at 1 for apps that don't care.
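
To make the backward-compatible default concrete, here is a minimal sketch. 
ContainerDiskRequest and its methods are hypothetical names invented for 
illustration; nothing like this exists in the YARN API as far as this thread 
shows.

```java
// Hypothetical sketch only: not part of the YARN API.
// An app that does not care about disks uses the no-arg constructor and
// gets 1 disk; an app such as MR could ask for more explicitly.
final class ContainerDiskRequest {
    static final int DEFAULT_NUM_DISKS = 1;

    private final int numDisks;

    ContainerDiskRequest() {
        this(DEFAULT_NUM_DISKS);
    }

    ContainerDiskRequest(int numDisks) {
        if (numDisks < 1) {
            throw new IllegalArgumentException("numDisks must be >= 1");
        }
        this.numDisks = numDisks;
    }

    int getNumDisks() {
        return numDisks;
    }
}
```

Existing apps would keep working unchanged because they never set the field 
and therefore get the default of 1.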

> NodeManager potentially causes unnecessary operations on all its disks
> ----------------------------------------------------------------------
>
>                 Key: YARN-1324
>                 URL: https://issues.apache.org/jira/browse/YARN-1324
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.2.0
>            Reporter: Bikas Saha
>
> Currently, for every container, the NM creates a directory on every disk and 
> expects the container-task to choose 1 of them and load balance the use of 
> the disks across all containers. 
> 1) This may have worked fine in the MR world where MR tasks would randomly 
> choose dirs but in general we cannot expect every app/task writer to 
> understand these nuances and randomly pick disks. So we could end up 
> overloading the first disk if most people decide to use the first disk.
> 2) This forces a number of NM operations to scan every disk (thus causing 
> random seeks on that disk) to locate the dir that the task actually chose 
> for its files. That makes these operations expensive for the NM and 
> disruptive for users of disks that did not hold the real task working dirs.
> I propose that the NM decide up front which disk it assigns to each task. 
> It could choose randomly, or weighted-randomly by looking at the space and 
> load on each disk, so it could do a better job of load balancing. It would 
> then associate the chosen working directory with the container context so 
> that subsequent operations on the NM can go directly to the correct 
> location instead of having to seek on every disk.
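
The weighted-random choice proposed above could look roughly like the 
following sketch. It weights only by free space (load is left out for 
brevity), and DiskChooser, Disk, assign and lookup are hypothetical names 
chosen for illustration, not NodeManager code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch only: not the NodeManager's actual implementation.
class DiskChooser {
    /** A local dir and its free space in bytes (illustrative type). */
    static final class Disk {
        public final String path;
        public final long freeBytes;

        Disk(String path, long freeBytes) {
            this.path = path;
            this.freeBytes = freeBytes;
        }
    }

    // Container id -> chosen disk, so later operations need no disk scan.
    private final Map<String, Disk> assigned = new HashMap<>();

    /** Picks one disk with probability proportional to its free space. */
    static Disk choose(List<Disk> disks, Random rng) {
        if (disks.isEmpty()) {
            throw new IllegalArgumentException("no disks");
        }
        long total = 0;
        for (Disk d : disks) {
            total += Math.max(0, d.freeBytes);
        }
        if (total == 0) {
            // No free space reported anywhere: fall back to a uniform pick.
            return disks.get(rng.nextInt(disks.size()));
        }
        long r = (long) (rng.nextDouble() * total);
        Disk last = null;
        for (Disk d : disks) {
            long w = Math.max(0, d.freeBytes);
            if (w == 0) {
                continue; // a full disk is never chosen
            }
            last = d;
            if (r < w) {
                return d;
            }
            r -= w;
        }
        return last; // guards against floating-point rounding at the edge
    }

    /** Chooses a disk for the container and records the choice. */
    Disk assign(String containerId, List<Disk> disks, Random rng) {
        Disk d = choose(disks, rng);
        assigned.put(containerId, d);
        return d;
    }

    /** Returns the recorded disk, or null if none was assigned. */
    Disk lookup(String containerId) {
        return assigned.get(containerId);
    }
}
```

With the choice recorded per container, cleanup or log handling can call 
lookup(containerId) and touch only that one disk.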



--
This message was sent by Atlassian JIRA
(v6.1#6144)
