I am profiling a 20 GB sort example on Hadoop YARN 2.0.2-alpha. I ran the sort on 1, 2, 4, and 8 nodes, and the runtimes were 1014, 470, 251, and 150 seconds respectively. The scaling is nonlinear.
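To put numbers on the scaling, here is a minimal sketch that turns the runtimes above into speedup and parallel efficiency relative to the single-node run. The names `runtimes` and `scaling_table` are just illustrative, and efficiency is assumed to mean speedup divided by the node ratio.

```python
# Runtimes from the measurements above: node count -> seconds for the 20 GB sort.
runtimes = {1: 1014, 2: 470, 4: 251, 8: 150}


def scaling_table(runtimes):
    """Return (nodes, speedup, efficiency) rows relative to the smallest node count.

    speedup    = baseline runtime / runtime at this node count
    efficiency = speedup / (nodes / baseline nodes); 1.0 means perfectly linear scaling
    """
    base_nodes = min(runtimes)
    base_time = runtimes[base_nodes]
    rows = []
    for nodes in sorted(runtimes):
        speedup = base_time / runtimes[nodes]
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows


for nodes, speedup, efficiency in scaling_table(runtimes):
    print(f"{nodes} node(s): speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

An efficiency above 1.0 (as at 2 nodes) means better-than-linear scaling, while a value below 1.0 (as at 8 nodes) means worse-than-linear scaling.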
1. Why does performance improve by more than a factor of two when I double the nodes from 1 to 2?
2. Beyond 2 nodes, why does performance improve less than linearly each time I double the nodes?

Does anyone know the details of how YARN behaves in these cases?

--
Sincerely,
Zhaojie
