[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

Jeff Buell (JIRA) Tue, 17 Dec 2013 17:24:21 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851210#comment-13851210
 ]


Jeff Buell commented on YARN-938:
---------------------------------

I did a set of runs using the terasort suite to drill down on the differences 
between MR1 and MR2.  All my tests used HDFS2.  The only distro I know of that 
allows comparing MR1 and MR2 while keeping the HDFS layer the same is CDH.  I 
used CDH 5.0.0 beta 1 for these tests.  Results are in the attached files 
(names start with CDH500beta1).  Some details of the configuration are in the 
spreadsheet, as well as the elapsed time data.  CPU utilization at the host 
level (includes 4 VMs and the hypervisor) is in the jpeg.  Much more info is in 
the paper published earlier this year (using older versions of software but the 
same hardware): http://www.vmware.com/resources/techresources/10360.

The YARN/MR2 case was also run on Apache 2.2.0.  Performance is just slightly 
worse than CDH5, the difference might be in the noise.  This indicates that 
there is no practical performance difference between these 2 distros.  All 
tests used 4 virtual machines per host running vSphere 5.5 on a 32-host 
cluster.  This is a better-performing configuration than running a single 
native OS on each host (details of why this is true are in the paper).

A 8 TB dataset was used to get longer and more repeatable run times, and to 
ensure there is no extra caching in the OS.  This certainly counts as "big 
data":  45-60 "waves" of map tasks are needed in each map slot to process all 
the data.

MR2 doesn't currently support JVM reuse, nor is it possible to simultaneously 
schedule map and shuffle tasks simultaneously in a uniform manner.  So 
slowstart=0.99 was used to force reduce tasks to run after all map tasks are 
finished.  Doing this loses the ability to pipeline map output into shuffle 
tasks (through the Linux buffer cache) and results in extra read I/O.  But it 
does allow more map tasks to run simultaneously.

Using the most optimal configuration for each, MR2 has a 45% longer elapsed 
time than MR1 for terasort.  The storage-limited workloads (teragen, 
teravalidate, and the merge part of terasort) all have similar elapsed times 
under the 2 versions of MR.  Although teragen is still 10% slower with MR2 
because the CPU became saturated, while MR1 did not.  Similarly, the CPU 
utilization is far higher for MR2 for teravalidate and terasort-merge.  Earlier 
tests with collecting hardware counters showed that CPI is similar, indicating 
that the the increase in elapsed time (or CPU utilization) is due to more 
machine instructions being executed in MR2, not to more cycles per instruction.

The slowstart and reuse parameters were changed from their optimal settings in 
MR1 to match the MR2 strategy.  Even then, MR2 is 17% slower for terasort.  The 
breakdown of the 1.45X increase in elapsed time appears to be as follows:

Lack of map-shuffle pipelining (slowstart):  1.06X
No JVM reuse:                                             1.17X
Instruction inefficiency:                                1.17X

256MB blocks were used in HDFS.  The JVM reuse penalty can be reduced with 
larger blocks, at the cost of greater memory usage.

The elapsed times for the individual terasort phases shown in the spreadsheet 
enables the performance differences to be further isolated.  One interesting 
point is the "sort" part of the reduce.  This takes only a few seconds for each 
task individually.  The elapsed time shown is a measure of how uniformly (or 
synchronized) the reduce tasks are running across the cluster.  Note that the 
faster configurations are also more synchronized.

> Hadoop 2 benchmarking 
> ----------------------
>
>                 Key: YARN-938
>                 URL: https://issues.apache.org/jira/browse/YARN-938
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>         Attachments: Hadoop-benchmarking-2.x-vs-1.x-1.xls, 
> Hadoop-benchmarking-2.x-vs-1.x.xls, cdh500beta1_cpu_util.jpg, 
> cdh500beta1_mr1_mr2.xlsx
>
>
> I am running the benchmarks on Hadoop 2 and will update the results soon.
> Thanks,
> Mayank



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

Reply via email to