Varun Saxena commented on YARN-3508:

Just to clarify, in the implementation I have spawned a new preemption 
dispatcher thread instead of posting preemption events to the scheduler 
dispatcher. This is because, IMHO, container preemption events should have 
priority over scheduler events. The drawback of this approach is that it adds 
one more thread contending for the scheduler lock.
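
A minimal sketch of this first approach, to make the trade-off concrete. This is 
not the code from the patch: the class name, the string events and the plain 
{{Object}} standing in for the scheduler lock are all hypothetical. It only shows 
a dedicated thread draining its own queue and taking the shared lock per event.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a dedicated preemption dispatcher; not YARN code.
class PreemptionDispatcherSketch {
    private final BlockingQueue<String> preemptionEvents = new LinkedBlockingQueue<>();
    private final Object schedulerLock;          // assumed to be shared with the scheduler
    private final List<String> handled = new CopyOnWriteArrayList<>();
    private final Thread worker;
    private volatile boolean stopped = false;

    PreemptionDispatcherSketch(Object schedulerLock) {
        this.schedulerLock = schedulerLock;
        this.worker = new Thread(this::drain, "preemption-dispatcher");
        this.worker.setDaemon(true);
    }

    void start() {
        worker.start();
    }

    // Called by the main dispatcher: only enqueues, never touches the scheduler lock.
    void post(String event) {
        preemptionEvents.add(event);
    }

    // Runs on the dedicated thread, which is the extra contender for the lock.
    private void drain() {
        try {
            while (!stopped || !preemptionEvents.isEmpty()) {
                String event = preemptionEvents.poll(50, TimeUnit.MILLISECONDS);
                if (event == null) {
                    continue;
                }
                synchronized (schedulerLock) {
                    handled.add(event);          // placeholder for real preemption handling
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Drains whatever is already queued, then waits for the thread to exit.
    void stop() {
        stopped = true;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> handled() {
        return handled;
    }
}
```

The posting thread never blocks on the scheduler lock here; the cost is the 
additional thread competing for that lock.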

Another approach would be to post preemption events to the scheduler 
dispatcher and use a {{LinkedBlockingDeque}} for storing events instead. This 
way preemption events can be posted to the front of the queue. However, 
{{LinkedBlockingDeque}} uses a single lock for put and take operations, whereas 
{{LinkedBlockingQueue}} uses 2 different locks for these 2 operations, making 
the latter better from a performance viewpoint.
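
A rough sketch of this second approach, assuming events are plain strings and 
using hypothetical class, method and event names (none of this is from the 
patch). Ordinary scheduler events go to the tail and preemption events to the 
head of one {{LinkedBlockingDeque}}.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical single-queue variant; event names are illustrative only.
class PreemptionFirstDequeSketch {
    private final LinkedBlockingDeque<String> events = new LinkedBlockingDeque<>();

    // Ordinary scheduler events keep FIFO order at the tail.
    void postSchedulerEvent(String event) {
        events.addLast(event);
    }

    // Preemption events jump ahead of everything already queued.
    void postPreemptionEvent(String event) {
        events.addFirst(event);
    }

    // Non-blocking take from the head; returns null when the deque is empty.
    String next() {
        return events.pollFirst();
    }

    public static void main(String[] args) {
        PreemptionFirstDequeSketch q = new PreemptionFirstDequeSketch();
        q.postSchedulerEvent("NODE_UPDATE");
        q.postSchedulerEvent("APP_ADDED");
        q.postPreemptionEvent("KILL_CONTAINER");
        for (String e = q.next(); e != null; e = q.next()) {
            System.out.println(e);
        }
    }
}
```

One thing to note with this sketch: several preemption events posted to the 
front in a row would be dispatched in reverse order of posting, so ordering 
among preemption events would need separate handling if it matters.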

[~jlowe], [~jianhe], [~leftnoteasy], any thoughts on the approaches mentioned above?

> Preemption processing occurring on the main RM dispatcher
> ---------------------------------------------------------
>                 Key: YARN-3508
>                 URL: https://issues.apache.org/jira/browse/YARN-3508
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-3508.002.patch, YARN-3508.01.patch
> We recently saw the RM for a large cluster lag far behind on the 
> AsyncDispatcher event queue.  The AsyncDispatcher thread was consistently 
> blocked on the highly-contended CapacityScheduler lock trying to dispatch 
> preemption-related events for RMContainerPreemptEventDispatcher.  Preemption 
> processing should occur on the scheduler event dispatcher thread or a 
> separate thread to avoid delaying the processing of other events in the 
> primary dispatcher queue.
