jaehoon ko created YARN-3856:
--------------------------------
Summary: YARN shoud allocate container that is closest to the data
Key: YARN-3856
URL: https://issues.apache.org/jira/browse/YARN-3856
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.7.0
Environment: Hadoop cluster with multi-level network hierarchy
Reporter: jaehoon ko
Currently, given a Container request for a host, ResourceManager allocates a
Container with following priorities (RMContainerAllocator.java):
- Requested host
- a host in the same rack as the requested host
- any host
This can lead to a sub-optimal allocation if Hadoop cluster is deployed on
multi-level networked hosts (which is typical). For example, let's suppose a
network architecture with one core switches, two aggregate switches, four ToR
switches, and 8 hosts. Each switch has two downlinks. Rack IDs of hosts are as
follows:
h1, h2: /c/a1/t1
h3, h4: /c/a1/t2
h5, h6: /c/a2/t3
h7, h8: /c/a2/t4
To allocate a container for data in h1, Hadoop first tries h1 itself, then h2,
then any of h3 ~ h8. Clearly, h3 or h4 are better than h5~h8 in terms of
network distance and bandwidth. However, current implementation choose one from
h3~h8 with equal probabilities.
This limitation is more obvious when considering hadoop clusters deployed on VM
or containers. In this case, only the VMs or containers running in the same
physical host are considered rack local, and actual rack-local hosts are chosen
with same probabilities as far hosts.
The root cause of this limitation is that RMContainerAllocator.java performs
exact matching on rack id to find a rack local host. Alternatively, we can
perform longest-prefix matching to find a closest host. Using the same network
architecture as above, with longest-prefix matching, hosts are selected with
the following priorities:
h1
h2
h3 or h4
h5 or h6 or h7 or h8
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)