Jason Lowe updated YARN-2171:

    Attachment: YARN-2171v2.patch

The point of the unit test was to catch regressions at a high level.  If anyone 
changes the code such that calling allocate() will grab the scheduler lock then 
the test will fail, whether that's a regression in this particular method or 
some new method that's added that ApplicationMasterService or CapacityScheduler 
itself calls and grabs the lock.

I added a separate unit test to exercise the getNumClusterNodes method.

The AHS unit test failure seems unrelated, and it passes for me locally even 
with this change.

> AMs block on the CapacityScheduler lock during allocate()
> ---------------------------------------------------------
>                 Key: YARN-2171
>                 URL: https://issues.apache.org/jira/browse/YARN-2171
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-2171.patch, YARN-2171v2.patch
> When AMs heartbeat into the RM via the allocate() call they are blocking on 
> the CapacityScheduler lock when trying to get the number of nodes in the 
> cluster via getNumClusterNodes.

This message was sent by Atlassian JIRA

Reply via email to