zhengchenyu created YARN-5936:
---------------------------------
Summary: When CPU strict mode is disabled, YARN cannot ensure
scheduling fairness between containers
Key: YARN-5936
URL: https://issues.apache.org/jira/browse/YARN-5936
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.7.1
Environment: CentOS7.1
Reporter: zhengchenyu
Priority: Critical
Fix For: 2.7.1
When using LinuxContainerExecutor, setting
"yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to
true ensures scheduling fairness through the cgroup CPU bandwidth limit. In our
experience, however, the cgroup CPU bandwidth limit leads to poor performance.
Without it, the cgroup cpu.shares value is our only way to ensure scheduling
fairness, and it is not fully effective. For example, suppose two containers
have the same number of vcores (and therefore the same cpu.shares), one
single-threaded and the other multi-threaded. The multi-threaded container
gets more CPU time, which is unreasonable.
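The behavior described above can be reasoned about with a small model. cpu.shares is a relative weight that the kernel applies only when cgroups compete for CPU; idle cores can be used by whichever cgroup has runnable threads. The sketch below is a simplified work-conserving proportional-share calculation, not the kernel's scheduler code. The class name, the 16-core and 2-core hosts, and the weight of 1024 per container are assumptions chosen for illustration only.

```java
import java.util.Arrays;

// Hypothetical, simplified model of work-conserving weighted sharing.
// It is NOT the kernel's CFS implementation; it only illustrates why equal
// cpu.shares do not imply equal CPU time on a host with idle cores.
final class CpuShareModel {

    /**
     * @param totalCores cores available on the host (assumed value)
     * @param shares     cpu.shares weight of each cgroup
     * @param demand     cores each cgroup could use (roughly its busy thread count)
     * @return cores allocated to each cgroup
     */
    static double[] allocate(double totalCores, double[] shares, double[] demand) {
        int n = shares.length;
        double[] alloc = new double[n];
        boolean[] done = new boolean[n];
        double remaining = totalCores;

        while (true) {
            double weightSum = 0;
            for (int i = 0; i < n; i++) {
                if (!done[i]) {
                    weightSum += shares[i];
                }
            }
            if (weightSum == 0 || remaining <= 0) {
                break; // every cgroup is satisfied, or no CPU is left
            }

            double perWeight = remaining / weightSum;
            boolean anyCapped = false;
            // Cgroups that need no more than their weighted slice get their full demand.
            for (int i = 0; i < n; i++) {
                if (!done[i] && demand[i] <= perWeight * shares[i]) {
                    alloc[i] = demand[i];
                    done[i] = true;
                    remaining -= demand[i];
                    anyCapped = true;
                }
            }
            if (!anyCapped) {
                // Real contention: split what is left strictly by weight.
                for (int i = 0; i < n; i++) {
                    if (!done[i]) {
                        alloc[i] = perWeight * shares[i];
                    }
                }
                break;
            }
        }
        return alloc;
    }

    public static void main(String[] args) {
        double[] shares = {1024, 1024}; // same vcores, so same weight (assumed 1024 each)
        double[] demand = {10, 1};      // 10 busy threads vs. 1 busy thread

        // Assumed 16-core host: no contention, so the weights never come into play.
        System.out.println(Arrays.toString(allocate(16, shares, demand)));
        // Assumed 2-core host: contention, so the weights equalize the two cgroups.
        System.out.println(Arrays.toString(allocate(2, shares, demand)));
    }
}
```

In this model, equal weights produce equal CPU time only when the host is saturated. On a host with spare cores, the container with more runnable threads takes the idle capacity, which matches the unfairness reported here.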
Here is my test case. I submitted two distributedshell applications with the
following two commands:
(code)
hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
  -shell_script ./run.sh -shell_args 10 -num_containers 1 -container_memory 1024 \
  -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10

hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
  -shell_script ./run.sh -shell_args 1 -num_containers 1 -container_memory 1024 \
  -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
(code)
Here is the CPU usage of the two containers, as reported by top:
(code)
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
15448 yarn      20   0 9059592  28336   9180 S 998.7  0.1  24:09.30 java
15026 yarn      20   0 9050340  27480   9188 S 100.0  0.1   3:33.97 java
13767 yarn      20   0 1799816 381208  18528 S   4.6  1.2   0:30.55 java
   77 root      rt   0       0      0      0 S   0.3  0.0   0:00.74 migration/1
(code)
We find that the multi-threaded container uses about ten times as much CPU as
the single-threaded container (998.7% vs. 100.0%), even though the two
containers have the same cpu.shares.
notes:
run.sh
(code)
java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1
(code)
loop.java
(code)
package loop;

// Starts N busy-looping threads (N = first argument, default 1) to burn CPU.
public class loop {
    public static void main(String[] args) {
        int threads = 1;
        if (args.length >= 1) {
            System.out.println(args[0]);
            threads = Integer.parseInt(args[0]);
        }
        for (int i = 0; i < threads; i++) {
            System.out.println("start thread " + i);
            new Thread(new Runnable() {
                @Override
                public void run() {
                    // Busy loop: keeps one core fully occupied until the process is killed.
                    int j = 0;
                    while (true) {
                        j++;
                    }
                }
            }).start();
        }
    }
}
(code)