Huangkaixuan created YARN-6289:
----------------------------------
Summary: yarn got little data locality
Key: YARN-6289
URL: https://issues.apache.org/jira/browse/YARN-6289
Project: Hadoop YARN
Issue Type: Improvement
Components: capacity scheduler
Environment: Hardware configuration
    CPU: 2 x Intel(R) Xeon(R) E5-2620 v2 @ 2.10GHz, 15M cache, 6 cores / 12 threads
    Memory: 128GB (16 x 8GB), 1600MHz
    Disk: 2 x 600GB 3.5-inch, RAID-1
    Network bandwidth: 968Mb/s
Software configuration
    Spark-1.6.2, Hadoop-2.7.1
Reporter: Huangkaixuan
Priority: Minor
When I ran wordcount on the input file with both Spark and MapReduce on YARN,
I noticed that the jobs did not get data locality on every run. Task placement
appeared to be random, even though no other job was running on the cluster.
I expected the tasks to always be placed on the single machine that holds the
data block, but that did not happen.