Miklos Szegedi created YARN-5686:
------------------------------------
Summary: DefaultContainerExecutor random working dir algorithm
skews results
Key: YARN-5686
URL: https://issues.apache.org/jira/browse/YARN-5686
Project: Hadoop YARN
Issue Type: Bug
Reporter: Miklos Szegedi
Priority: Minor
{code}
long randomPosition = RandomUtils.nextLong() % totalAvailable;
...
while (randomPosition > availableOnDisk[dir]) {
randomPosition -= availableOnDisk[dir++];
}
{code}
The code above selects a disk using a random number weighted by the free
space on each disk. For example, with two disks that each have 100 bytes
free, totalAvailable is 200 and randomPosition falls in 0..199. To be fair,
0..99 should select the first disk and 100..199 should select the second.
That is not the case right now: for randomPosition 100 the check 100 > 100
is false, so the loop stops on the first disk, which is then selected for
101 of the 200 possible values.
We need to use
{code}
while (randomPosition >= availableOnDisk[dir])
{code}
instead of
{code}
while (randomPosition > availableOnDisk[dir])
{code}
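The effect of the two comparisons can be checked with a small standalone sketch. It is not the DefaultContainerExecutor code: the class DiskSelectionSketch, the helpers selectDisk and countSelections, and the inclusive flag are hypothetical names introduced for illustration, and the sketch walks every possible randomPosition instead of drawing random numbers.

```java
import java.util.Arrays;

// Illustrative sketch only; names here are hypothetical, not from Hadoop.
class DiskSelectionSketch {

    // Walks the disks, subtracting each disk's free space until the
    // position falls inside one. inclusive=false mimics the current '>'
    // check, inclusive=true mimics the proposed '>=' check.
    static int selectDisk(long randomPosition, long[] availableOnDisk,
                          boolean inclusive) {
        int dir = 0;
        while (inclusive ? randomPosition >= availableOnDisk[dir]
                         : randomPosition > availableOnDisk[dir]) {
            randomPosition -= availableOnDisk[dir++];
        }
        return dir;
    }

    // Counts how many positions in 0..totalAvailable-1 map to each disk.
    static long[] countSelections(long[] availableOnDisk, boolean inclusive) {
        long totalAvailable = 0;
        for (long available : availableOnDisk) {
            totalAvailable += available;
        }
        long[] counts = new long[availableOnDisk.length];
        for (long pos = 0; pos < totalAvailable; pos++) {
            counts[selectDisk(pos, availableOnDisk, inclusive)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] disks = {100, 100}; // the two-disk example from the description
        // Current '>' check: prints [101, 99]
        System.out.println(Arrays.toString(countSelections(disks, false)));
        // Proposed '>=' check: prints [100, 100]
        System.out.println(Arrays.toString(countSelections(disks, true)));
    }
}
```

With the two 100-byte disks from the example, the '>' variant assigns 101 positions to the first disk and 99 to the second, while the '>=' variant splits them 100 and 100.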