[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115449#comment-16115449 ]

## Advertising

YunFan Zhou edited comment on YARN-6361 at 8/6/17 3:04 PM: ----------------------------------------------------------- [~yufeigu] Thank you very much, Yufei. I have to say that this JIRA's description describes a very good idea, and I have achieved the corresponding code. I used *O(N)* complexity before sorting to calculate the properties of each application, and then according to the computed properties of each application ordering all applications using O(n*log(n)) complexity. It does look a lot faster. And I'll test the bench mark tomorrow, and see how the optimization of this approach improves performance compared to the unoptimized performance. was (Author: daemon): [~yufeigu] Thank you very much, Yufei. I have to say that this JIRA's description describes a very good idea, and I have achieved the corresponding code. I used *O(N)* complexity before sorting to calculate the properties of each application, and then according to the computed properties of each application ordering all applications using O(n*log(n)) complexity. It does look a lot faster. And I'll test the bench mark tomorrow, and see how the optimization of this approach improves performance compared to the unoptimized performance. Here are some snippets of the code I've written, and many of the details have to be changed. But I think our general idea is similar, using O(N) complexity to calculate each of the Schedulable attribute values (such as the *SchedulableInnerAttr*). Then, order N of the SchedulableInnerAttr using O(n*log(n)) time complexity. Because every member variable of the *SchedulableInnerAttr* is the base type, it can be very fast in sorting. And it will be much faster than the traditional sort. I will upload the corresponding detailed code and bench mark for the test tomorrow.Agree with what I said? {code:java} private Schedulable sched; private boolean isDemandNotNone; private boolean isNeedy; private double minShareRatio; private boolean isWeightExceedZero; private double memorySizePerWeight; private long startTime; private String name; public SchedulableInnerAttr(Schedulable sched) { this.sched = sched; this.isDemandNotNone = sched.getDemand().equals(Resources.none()); Resource resourceUsage = sched.getResourceUsage(); Resource minShare = Resources.min(RESOURCE_CALCULATOR, null, sched.getMinShare(), sched.getDemand()); this.isNeedy = Resources.lessThan(RESOURCE_CALCULATOR, null, resourceUsage, minShare); this.minShareRatio = (double) resourceUsage.getMemorySize() / Resources.max(RESOURCE_CALCULATOR, null, minShare, ONE) .getMemorySize(); double weight = sched.getWeights().getWeight(ResourceType.MEMORY); this.isWeightExceedZero = weight > 0.0; this.memorySizePerWeight = resourceUsage.getMemorySize(); if (this.isWeightExceedZero) { this.memorySizePerWeight /= weight; } this.startTime = sched.getStartTime(); this.name = sched.getName(); } {code} {code:java} @Override public int compare(SchedulableInnerAttr s1, SchedulableInnerAttr s2) { int res = 0; if (s1.isDemandNotNone() && !s2.isDemandNotNone()) { res = -1; } else if (!s1.isDemandNotNone() && s2.isDemandNotNone()) { res = 1; } if (res == 0) { if (s1.isNeedy() && !s2.isNeedy()) { res = -1; } else if (!s1.isNeedy() && s2.isNeedy()) { res = 1; } } if (res == 0) { res = (int) Math.signum(s1.getMinShareRatio() - s2.getMinShareRatio()); } if (res == 0) { if (s1.isWeightExceedZero() && !s2.isWeightExceedZero()) { res = 1; } else if (!s1.isWeightExceedZero() && s2.isWeightExceedZero()) { res = -1; } } if (res == 0) { res = (int) Math.signum(s1.getMemorySizePerWeight() - s2.getMemorySizePerWeight()); } if (res == 0) { res = (int) Math.signum(s1.getStartTime() - s2.getStartTime()); } if (res == 0) { res = s1.getName().compareTo(s2.getName()); } return res; } {code} > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > -------------------------------------------------------------------------------- > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Miklos Szegedi > Assignee: YunFan Zhou > Attachments: dispatcherthread.png, threads.png, > YARN-6361.001.pre.patch > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and we > sorted by a fixed number inside instead {{O(n*log\(n\))}}. This could be an > performance issue when there are huge number of applications in a single > queue. The attachments shows the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org