Weiwei Yang commented on YARN-8127:

Hi [~Tao Yang]

Thanks for the patch. While reading the commit logic, I think this check can be 
moved to {{FiCaSchedulerApp#commonCheckContainerAllocation}}. When a proposal 
makes an allocation for a reserved container, we need to make sure the node still 
has that reserved container at commit time (in case another identical proposal 
was already committed). Could you please take a look?
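
To illustrate the check I have in mind, here is a minimal sketch. It is not the actual Hadoop code: {{Node}}, {{Proposal}}, {{commonCheck}} and {{tryCommit}} are hypothetical stand-ins for the real scheduler types, and the only assumption is that the commit path is serialized while proposals are produced by several async threads.

```java
import java.util.Objects;

// Simplified model of the commit path; none of these types are Hadoop classes.
class ReservedCommitSketch {

    /** Hypothetical node: free memory plus at most one reserved container. */
    static final class Node {
        long availableMb;
        String reservedContainerId; // null when nothing is reserved

        Node(long availableMb, String reservedContainerId) {
            this.availableMb = availableMb;
            this.reservedContainerId = reservedContainerId;
        }
    }

    /** Hypothetical proposal: allocate memoryMb on node, optionally from a reservation. */
    static final class Proposal {
        final Node node;
        final String fromReservedContainerId; // null for a fresh allocation
        final long memoryMb;

        Proposal(Node node, String fromReservedContainerId, long memoryMb) {
            this.node = node;
            this.fromReservedContainerId = fromReservedContainerId;
            this.memoryMb = memoryMb;
        }
    }

    /** Rejects a proposal whose reservation is no longer present on the node. */
    static boolean commonCheck(Proposal p) {
        if (p.fromReservedContainerId != null) {
            return Objects.equals(p.node.reservedContainerId, p.fromReservedContainerId);
        }
        return true;
    }

    /** Serialized commit: re-validate against current node state, then deduct once. */
    static synchronized boolean tryCommit(Proposal p) {
        if (!commonCheck(p) || p.node.availableMb < p.memoryMb) {
            return false;
        }
        p.node.availableMb -= p.memoryMb;
        if (p.fromReservedContainerId != null) {
            p.node.reservedContainerId = null; // the reservation is now fulfilled
        }
        return true;
    }

    public static void main(String[] args) {
        Node node = new Node(4096, "container_1");
        // Two async threads produced the same proposal from the same snapshot.
        Proposal first = new Proposal(node, "container_1", 1024);
        Proposal duplicate = new Proposal(node, "container_1", 1024);

        System.out.println(tryCommit(first));     // true
        System.out.println(tryCommit(duplicate)); // false
        System.out.println(node.availableMb);     // 3072
    }
}
```

Without the reservation check in {{commonCheck}}, the duplicate proposal would also pass and the same 1024 MB would be deducted twice, which matches the leak described in this issue.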

> Resource leak when async scheduling is enabled
> ----------------------------------------------
>                 Key: YARN-8127
>                 URL: https://issues.apache.org/jira/browse/YARN-8127
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Weiwei Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8127.001.patch, YARN-8127.002.patch
>
> Brief steps to reproduce:
>  # Enable async scheduling with 5 threads
>  # Submit a lot of jobs to exhaust the cluster resources
>  # After a while, observe that the NM allocated resource is more than the 
> resource requested by the allocated containers
> It looks like the commit phase does not synchronize the handling of reserved 
> containers, so some proposals are incorrectly accepted and the resource is 
> subsequently deducted multiple times for a single container.
