[ https://issues.apache.org/jira/browse/YARN-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730808#comment-15730808 ]

zhengchenyu commented on YARN-5964:
-----------------------------------

When I found this problem, continuous scheduling was turned off. In our 
cluster we had added more jobs, which then led to lock contention. 

Note: continuous scheduling was turned on later in our cluster, because the rate 
of container assignment was too slow. But this configuration didn't lead to obvious 
lock contention. In any case, I agree with your opinion that continuous scheduling 
could lead to lock contention if it runs too frequently.
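
For reference, continuous scheduling in the FairScheduler is controlled by the 
yarn-site.xml properties below. The values shown are only an illustration, not a 
recommendation for any particular cluster; a very small sleep interval makes the 
continuous scheduling thread compete more aggressively for the scheduler lock.

    <property>
      <name>yarn.scheduler.fair.continuous-scheduling-enabled</name>
      <value>true</value>
    </property>
    <property>
      <!-- How long the continuous scheduling thread sleeps between passes. -->
      <name>yarn.scheduler.fair.continuous-scheduling-sleep-ms</name>
      <value>5</value>
    </property>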

> Lower the granularity of locks in FairScheduler
> -----------------------------------------------
>
>                 Key: YARN-5964
>                 URL: https://issues.apache.org/jira/browse/YARN-5964
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>         Environment: CentOS-7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>             Fix For: 2.7.1
>
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> When too many applications are running, we found that clients couldn't submit 
> applications, and the call queue length on port 8032 was high. I captured a 
> jstack of the ResourceManager while the call queue length was too high. I found 
> that the "IPC Server handler xxx on 8032" threads were waiting for the 
> FairScheduler object lock, while nodeUpdate held the FairScheduler lock. The 
> long time spent holding the lock is probably what prevents clients from 
> submitting applications. 
> Here I don't dwell on clients being unable to submit applications, only on the 
> performance of the FairScheduler. Too many functions need the object lock, so 
> the granularity of the lock is too coarse. For example, nodeUpdate and 
> getAppWeight have to hold the same object lock. This is unreasonable and 
> inefficient. I recommend replacing the current coarse lock with finer-grained 
> locks.
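
The description above argues for splitting the single FairScheduler monitor so 
that read-only calls do not block behind nodeUpdate. Below is a minimal sketch of 
that idea, assuming a toy scheduler class with placeholder state; it is not the 
actual YARN patch, and the method names nodeUpdate and getAppWeight are only kept 
to mirror the example in the description.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Illustrative only: shows how a single coarse monitor (synchronized
    // methods) can be replaced by a read/write lock so that read-only calls
    // such as getAppWeight() no longer block behind nodeUpdate().
    public class LockGranularitySketch {

        // Hypothetical per-application state, guarded by schedulerLock;
        // not the real FairScheduler data structures.
        private final Map<String, Double> appWeights = new HashMap<>();
        private final ReadWriteLock schedulerLock = new ReentrantReadWriteLock();

        // Before: "synchronized void nodeUpdate(...)" serialized every caller.
        // After: only writers take the exclusive lock.
        public void nodeUpdate(String nodeId) {
            schedulerLock.writeLock().lock();
            try {
                // Placeholder for the expensive per-node scheduling work that
                // the description says holds the scheduler lock for a long time.
                appWeights.merge(nodeId, 1.0, Double::sum);
            } finally {
                schedulerLock.writeLock().unlock();
            }
        }

        // Before: "synchronized double getAppWeight(...)" competed with nodeUpdate.
        // After: many readers can proceed concurrently.
        public double getAppWeight(String appId) {
            schedulerLock.readLock().lock();
            try {
                return appWeights.getOrDefault(appId, 0.0);
            } finally {
                schedulerLock.readLock().unlock();
            }
        }

        public static void main(String[] args) {
            LockGranularitySketch scheduler = new LockGranularitySketch();
            scheduler.nodeUpdate("node-1");
            System.out.println("weight = " + scheduler.getAppWeight("node-1"));
        }
    }

With the coarse synchronized methods, one slow nodeUpdate serializes every IPC 
handler thread on port 8032; with a read/write lock, read-only lookups can 
proceed concurrently and only writers exclude each other.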



