[ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252343#comment-14252343
 ] 

Anubhav Dhoot commented on YARN-2975:
-------------------------------------

Yes, I am worried about getting it wrong for maxRunningEnforcer. Before the 
change, we removed the app inside a single lock, whether or not it was 
runnable, and could be reasonably sure of the result.
Now the removal is split into 2 non-atomic steps outside, as I listed above, 
and also 2 steps inside {noformat} return removeRunnableApp(app) || 
removeNonRunnableApp(app) {noformat}. This might make things worse, since each 
call releases the lock before the other acquires it. The application could be 
missed completely if it moves from non-runnable to runnable in between.
How about making removeApp try to remove from both the runnable and 
non-runnable lists inside a single write lock? We can remove the redundancy 
with removeRunnableApp and removeNonRunnableApp by adding a fourth internal 
method that all 3 delegate to, with flags limiting where to look for the app.
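
A rough sketch of what I mean, not the actual patch. The class name 
LeafQueueSketch, the helper removeAppInternal, its two flags, and the use of 
plain strings for apps are all made up for illustration. I am also assuming 
each method returns whether the app was found, which may not match the real 
return contract in FSLeafQueue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for FSLeafQueue; apps are plain strings here. */
class LeafQueueSketch {
    private final List<String> runnableApps = new ArrayList<>();
    private final List<String> nonRunnableApps = new ArrayList<>();
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    void addApp(String app, boolean runnable) {
        rwLock.writeLock().lock();
        try {
            if (runnable) {
                runnableApps.add(app);
            } else {
                nonRunnableApps.add(app);
            }
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    /** Looks in both lists under one write lock, so a move cannot hide the app. */
    boolean removeApp(String app) {
        return removeAppInternal(app, true, true);
    }

    boolean removeRunnableApp(String app) {
        return removeAppInternal(app, true, false);
    }

    boolean removeNonRunnableApp(String app) {
        return removeAppInternal(app, false, true);
    }

    /**
     * Hypothetical shared helper: the flags limit which lists are searched.
     * The lock is taken once and held across both lookups.
     */
    private boolean removeAppInternal(String app, boolean fromRunnable,
                                      boolean fromNonRunnable) {
        rwLock.writeLock().lock();
        try {
            if (fromRunnable && runnableApps.remove(app)) {
                return true;
            }
            return fromNonRunnable && nonRunnableApps.remove(app);
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}
```

Since the lock is held across both lookups, an app moving from non-runnable to 
runnable (which would also need the write lock) cannot slip between the two 
checks.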

> FSLeafQueue app lists are accessed without required locks
> ---------------------------------------------------------
>
>                 Key: YARN-2975
>                 URL: https://issues.apache.org/jira/browse/YARN-2975
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>         Attachments: yarn-2975-1.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
