[
https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721341#comment-15721341
]
zhengchenyu commented on YARN-5936:
-----------------------------------
You didn't catch my meaning! In fact, I already know the reason for the unfairness.
Processes and threads are scheduled at the same level. In Linux 3.10, tasks are
kept in a red-black tree that is updated periodically or on demand (by other
calls); the left-most task is the next one to be scheduled. The red-black tree
key is updated by this formula:
curr->vruntime += delta_exec * NICE_0_LOAD / curr->load.weight
Here curr->load.weight comes from cpu.shares. So the multi-threaded container
obtains more CPU than the single-threaded one, because more of its child
threads are participating in the scheduler.
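To make the effect of that formula concrete, here is a minimal Java sketch (a toy
simulation for illustration, not kernel code; the class name VruntimeDemo and the
equal per-task weight of 1024 are assumptions). It always runs the task with the
smallest vruntime and advances it by delta_exec * NICE_0_LOAD / load.weight; with
one runnable thread in the single-threaded container and ten in the multi-threaded
one, the latter ends up with roughly ten times the CPU time, matching the top
output quoted below.
{code}
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class VruntimeDemo {
    static final long NICE_0_LOAD = 1024;

    // One runnable thread; every thread gets the same weight, mimicking equal cpu.shares.
    static class Task {
        final String container;
        final long weight = 1024;   // assumption: equal load.weight for every thread
        double vruntime = 0.0;
        Task(String container) { this.container = container; }
    }

    public static void main(String[] args) {
        // The run queue plays the role of the red-black tree: poll() returns the
        // task with the smallest vruntime (the "left-most" task).
        PriorityQueue<Task> runqueue =
                new PriorityQueue<>(Comparator.comparingDouble((Task t) -> t.vruntime));
        runqueue.add(new Task("single-thread"));            // container A: 1 thread
        for (int i = 0; i < 10; i++) {
            runqueue.add(new Task("multi-thread"));         // container B: 10 threads
        }

        Map<String, Double> cpuTime = new HashMap<>();
        double deltaExec = 1.0;                             // one arbitrary time slice
        for (int tick = 0; tick < 110000; tick++) {
            Task curr = runqueue.poll();                    // next task to be scheduled
            cpuTime.merge(curr.container, deltaExec, Double::sum);
            // curr->vruntime += delta_exec * NICE_0_LOAD / curr->load.weight
            curr.vruntime += deltaExec * NICE_0_LOAD / curr.weight;
            runqueue.add(curr);
        }
        // Prints roughly {multi-thread=100000.0, single-thread=10000.0}
        System.out.println(cpuTime);
    }
}
{code}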
> when cpu strict mode is closed, yarn couldn't assure scheduling fairness
> between containers
> -------------------------------------------------------------------------------------------
>
> Key: YARN-5936
> URL: https://issues.apache.org/jira/browse/YARN-5936
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.7.1
> Environment: CentOS7.1
> Reporter: zhengchenyu
> Priority: Critical
> Fix For: 2.7.1
>
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> When using LinuxContainer, setting
> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to
> true can assure scheduling fairness through the cpu bandwidth control of
> cgroups. But in our experience the cpu bandwidth control leads to bad
> performance. Without cpu bandwidth control, cpu.shares is the only way to
> assure scheduling fairness, but it is not completely effective. For example,
> given two containers with the same vcores (and therefore the same cpu.shares),
> one single-threaded and the other multi-threaded, the multi-threaded container
> will get more CPU time. That is unreasonable!
> Here is my test case: I submit two distributedshell applications, using the
> two commands below:
> {code}
> hadoop jar
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
> -shell_script ./run.sh -shell_args 10 -num_containers 1 -container_memory
> 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> hadoop jar
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
> -shell_script ./run.sh -shell_args 1 -num_containers 1 -container_memory
> 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> {code}
> Here is the cpu time of the two containers:
> {code}
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 15448 yarn 20 0 9059592 28336 9180 S 998.7 0.1 24:09.30 java
> 15026 yarn 20 0 9050340 27480 9188 S 100.0 0.1 3:33.97 java
> 13767 yarn 20 0 1799816 381208 18528 S 4.6 1.2 0:30.55 java
> 77 root rt 0 0 0 0 S 0.3 0.0 0:00.74 migration/1
> {code}
> We find that the cpu time of the multi-threaded container is ten times that of
> the single-threaded one, though the two containers have the same cpu.shares.
> Notes:
> run.sh
> {code}
> java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1
> {code}
> loop.java
> {code}
> package loop;
> public class loop {
>     public static void main(String[] args) {
>         int loop = 1;
>         if (args.length >= 1) {
>             System.out.println(args[0]);
>             loop = Integer.parseInt(args[0]);
>         }
>         // Start 'loop' busy-spinning threads that never exit.
>         for (int i = 0; i < loop; i++) {
>             System.out.println("start thread " + i);
>             new Thread(new Runnable() {
>                 @Override
>                 public void run() {
>                     int j = 0;
>                     while (true) { j++; }
>                 }
>             }).start();
>         }
>     }
> }
> {code}