[
https://issues.apache.org/jira/browse/YARN-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695205#comment-15695205
]
zhangyubiao edited comment on YARN-5188 at 11/26/16 6:33 AM:
-------------------------------------------------------------
[~chenfolin], our team applied the YARN-4090 patch to 2.7.1 and found that a
deadlock occurs. I think this patch will cause the same problem. Would you
take a look?
> FairScheduler performance bug
> -----------------------------
>
> Key: YARN-5188
> URL: https://issues.apache.org/jira/browse/YARN-5188
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.5.0
> Reporter: ChenFolin
> Attachments: YARN-5188-1.patch
>
>
> My Hadoop cluster has recently encountered a performance problem. Details
> follow.
> Two points can cause this performance issue:
> 1: Applications are sorted before each container assignment in FSLeafQueue.
> TreeSet is not the best choice. Why not keep the list ordered, using binary
> search to restore the order when an application's resource usage changes?
> 2: The queue sort and assignContainerPreCheck recompute the resource usage
> of every leaf queue. Why not store each leaf queue's usage in memory and
> update it when a container is assigned or released?
>
> The efficiency of container assignment in the ResourceManager may fall
> as the number of running and pending applications grows. In practice the
> cluster shows a large PendingMB or PendingVcore backlog while its
> current utilization rate may be below 20%.
> I checked the ResourceManager logs and found that each container
> assignment may take 5 ~ 10 ms, compared with 0 ~ 1 ms normally.
>
> I used TestFairScheduler to reproduce the scenario:
>
> Just one queue: root.default
> 10240 apps.
>
> assign container avg time: 6753.9 us ( 6.7539 ms)
> apps sort time (FSLeafQueue : Collections.sort(runnableApps,
> comparator); ): 4657.01 us ( 4.657 ms )
> compute LeafQueue Resource usage : 905.171 us ( 0.905171 ms )
>
> With just root.default, one assign container op consists of: ( one apps
> sort op ) + 2 * ( compute leafqueue usage op )
> Given the above, I think the assign container op has
> a performance problem.
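
By the figures quoted above, one sort plus two usage computations come to
4657.01 + 2 * 905.171 = 6467.352 us of the 6753.9 us average, so the two
proposals in the description target most of the measured cost. A minimal
sketch of both ideas follows, assuming a simplified model: OrderedLeafQueueSketch
and App are hypothetical names, memory in MB stands in for the full resource
vector, and none of this is taken from the FairScheduler code or from
YARN-5188-1.patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch only; not FairScheduler code. */
final class OrderedLeafQueueSketch {

    /** Stand-in for a schedulable application; tracks memory usage in MB. */
    static final class App {
        final String id;
        long usedMb;

        App(String id, long usedMb) {
            this.id = id;
            this.usedMb = usedMb;
        }
    }

    // Order by usage, then by id, so the ordering is total and
    // binary search can locate one specific app.
    static final Comparator<App> BY_USAGE = (a, b) -> {
        int c = Long.compare(a.usedMb, b.usedMb);
        return c != 0 ? c : a.id.compareTo(b.id);
    };

    private final List<App> apps = new ArrayList<>(); // kept sorted at all times
    private long cachedUsedMb = 0;                     // running total for the queue

    /** Insert a new app at its sorted position. */
    void addApp(App app) {
        int pos = Collections.binarySearch(apps, app, BY_USAGE);
        if (pos >= 0) {
            throw new IllegalArgumentException("duplicate app " + app.id);
        }
        apps.add(-(pos + 1), app);
        cachedUsedMb += app.usedMb;
    }

    /** Apply a usage delta: positive on assign, negative on release. */
    void changeUsage(App app, long deltaMb) {
        // Locate the app while its sort key is still the old value.
        int pos = Collections.binarySearch(apps, app, BY_USAGE);
        if (pos < 0) {
            throw new IllegalArgumentException("unknown app " + app.id);
        }
        apps.remove(pos);
        app.usedMb += deltaMb;
        // Reinsert at the position matching the new key.
        int ins = Collections.binarySearch(apps, app, BY_USAGE);
        apps.add(-(ins + 1), app);
        cachedUsedMb += deltaMb; // no scan over all apps needed
    }

    /** Queue usage, read from the cached total. */
    long usedMb() {
        return cachedUsedMb;
    }

    /** App ids in scheduling order (lowest usage first). */
    List<String> order() {
        List<String> ids = new ArrayList<>();
        for (App a : apps) {
            ids.add(a.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        OrderedLeafQueueSketch q = new OrderedLeafQueueSketch();
        App a = new App("a", 300);
        App b = new App("b", 100);
        q.addApp(a);
        q.addApp(b);
        q.changeUsage(b, 250); // b now uses 350 MB and moves behind a
        System.out.println(q.order() + " used=" + q.usedMb());
    }
}
```

In this sketch each usage change costs two binary searches plus an ArrayList
shift. The shift is still linear in the number of apps, but it replaces a full
re-sort on every assignment, and the queue total is read in constant time.
Whether this is safe under the scheduler's locking is a separate question that
the sketch does not address.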