[
https://issues.apache.org/jira/browse/YARN-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799667#comment-13799667
]
Vinod Kumar Vavilapalli commented on YARN-1324:
-----------------------------------------------
It wasn't done like that with only the MR world in mind. Even outside MR, many
apps want to write data in parallel and take advantage of multiple disks.
Because of that, we cannot have the NM pick a single disk for a container.
Apps/containers that don't care about load-balancing or multiple disks can
choose to always write to the first disk, and the NM will eventually load
balance them.
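
For a container that does want to spread its writes, picking a directory at
random is a small amount of code. The following is only a sketch: it assumes
the container's environment exposes the per-disk directories as a
comma-separated list in a LOCAL_DIRS variable, and the class and method names
(LocalDirPicker, parseDirs, pickRandom) are hypothetical, not part of YARN.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class LocalDirPicker {

    /** Splits a comma-separated directory list, dropping blank entries. */
    static List<String> parseDirs(String commaSeparated) {
        List<String> dirs = new ArrayList<>();
        if (commaSeparated == null) {
            return dirs;
        }
        for (String dir : commaSeparated.split(",")) {
            String trimmed = dir.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(trimmed);
            }
        }
        return dirs;
    }

    /** Picks one directory uniformly at random. */
    static String pickRandom(List<String> dirs, Random rng) {
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no local dirs available");
        }
        return dirs.get(rng.nextInt(dirs.size()));
    }

    public static void main(String[] args) {
        // Assumption: LOCAL_DIRS holds the container's per-disk directories.
        List<String> dirs = parseDirs(System.getenv("LOCAL_DIRS"));
        if (dirs.isEmpty()) {
            System.out.println("LOCAL_DIRS is not set");
            return;
        }
        System.out.println("writing to " + pickRandom(dirs, new Random()));
    }
}
```

A container that does not care would simply take the first entry of the list
instead of calling pickRandom.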
To have true load-balancing all the time (and not just post container finish),
YARN needs cooperative containers. And the better solution for that is to make
apps ask for the number of disks they want to write to when they launch
containers. That way YARN isn't overriding the user's intention to use/not use
multiple disks.
The title should be changed to describe the problem (and not the solution).
> NodeManager should assign 1 local directory to a container
> ----------------------------------------------------------
>
> Key: YARN-1324
> URL: https://issues.apache.org/jira/browse/YARN-1324
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.2.0
> Reporter: Bikas Saha
>
> Currently, for every container, the NM creates a directory on every disk and
> expects the container-task to choose 1 of them and load balance the use of
> the disks across all containers.
> 1) This may have worked fine in the MR world where MR tasks would randomly
> choose dirs but in general we cannot expect every app/task writer to
> understand these nuances and randomly pick disks. So we could end up
> overloading the first disk if most people decide to use the first disk.
> 2) This forces a number of NM operations to scan every disk (thus randomizing
> that disk) to locate the dir which the task has actually chosen to use for
> its files. That makes all these operations expensive for the NM as well as
> disruptive for users of disks that did not have the real task working dirs.
> I propose that NM should up-front decide the disk it is assigning to tasks.
> It could choose to do so randomly or weighted-randomly by looking at space
> and load on each disk. So it could do a better job of load balancing. Then,
> it would associate the chosen working directory with the container context so
> that subsequent operations on the NM can directly seek to the correct
> location instead of having to seek on every disk.
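
The weighted-random choice proposed in the description above can be sketched
as follows. This is an illustration only, not NM code: it weights each disk by
its free space alone (the description also mentions load), and the class and
method names (WeightedDiskChooser, choose) are hypothetical.

```java
import java.util.Random;

public class WeightedDiskChooser {

    /**
     * Returns a disk index drawn with probability proportional to
     * freeBytes[i]. Disks with zero free space are never chosen.
     */
    static int choose(long[] freeBytes, Random rng) {
        long total = 0;
        for (long free : freeBytes) {
            if (free < 0) {
                throw new IllegalArgumentException("negative free space");
            }
            total += free;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("no disk has free space");
        }
        // Draw a point in [0, total); clamp to guard against rounding.
        long point = Math.min((long) (rng.nextDouble() * total), total - 1);
        // Return the disk whose cumulative range contains the point.
        long cumulative = 0;
        for (int i = 0; i < freeBytes.length; i++) {
            cumulative += freeBytes[i];
            if (point < cumulative) {
                return i;
            }
        }
        throw new IllegalStateException("unreachable: point < total");
    }

    public static void main(String[] args) {
        // Made-up free-space figures for three disks, in bytes.
        long[] freeBytes = {100L << 30, 300L << 30, 50L << 30};
        System.out.println("chosen disk: " + choose(freeBytes, new Random()));
    }
}
```

The free-space figures could come from something like
java.io.File.getUsableSpace() on each disk's root. The chosen index would then
be recorded with the container context so later operations go straight to
that disk.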