[
https://issues.apache.org/jira/browse/YARN-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bikas Saha updated YARN-1324:
-----------------------------
Summary: NodeManager potentially causes unnecessary operations on all its
disks (was: NodeManager should assign 1 local directory directory to a
container)
> NodeManager potentially causes unnecessary operations on all its disks
> ----------------------------------------------------------------------
>
> Key: YARN-1324
> URL: https://issues.apache.org/jira/browse/YARN-1324
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.2.0
> Reporter: Bikas Saha
>
> Currently, for every container, the NM creates a directory on every disk and
> expects the container-task to choose 1 of them and load balance the use of
> the disks across all containers.
> 1) This may have worked fine in the MR world where MR tasks would randomly
> choose dirs but in general we cannot expect every app/task writer to
> understand these nuances and randomly pick disks. So we could end up
> overloading the first disk if most people decide to use the first disk.
> 2) This makes a number of NM operations to scan every disk (thus randomizing
> that disk) to locate the dir which the task has actually chosen to use for
> its files. Makes all these operations expensive for the NM as well as
> disruptive for users of disks that did not have the real task working dirs.
> I propose that NM should up-front decide the disk it is assigning to tasks.
> It could choose to do so randomly or weighted-randomly by looking at space
> and load on each disk. So it could do a better job of load balancing. Then,
> it would associate the chosen working directory with the container context so
> that subsequent operations on the NM can directly seek to the correct
> location instead of having to seek on every disk.
--
This message was sent by Atlassian JIRA
(v6.1#6144)