Wangda Tan commented on YARN-3508:

Thanks for the explanation, [~varun_saxena].

Currently, YARN's non-preemption scheduler events are sent to the central RM 
dispatcher and then pushed onto the scheduler event queue. The RM dispatcher 
does not wait for these events to be handled, so they do not slow it down.

Preemption events are different: RMContainerPreemptEventDispatcher is a 
synchronous dispatcher, so when RMDispatcher receives these events, it waits 
until they have been processed.

My suggestion is to make {{ContainerPreemptEventType}} part of 
{{SchedulerEventType}} and have ProportionalCapacityPreemptionPolicy send the 
events to {{RMDispatcher}}. The events will then be forwarded to 
SchedulerEventQueue and will not block the central RMDispatcher.
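
To illustrate the idea, here is a minimal, self-contained sketch. It is not 
the actual YARN code: the class, enum constants, and method names below are 
hypothetical stand-ins. It only shows that once preemption events are ordinary 
scheduler events placed on a queue drained by a separate thread, the 
dispatching thread returns immediately even while the scheduler lock is held.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these types are the real YARN classes.
class PreemptionDispatchSketch {

    // Assumption: preemption types folded into the scheduler's own event enum.
    enum SchedulerEventType { NODE_UPDATE, PREEMPT_CONTAINER, KILL_CONTAINER }

    static final class SchedulerEvent {
        final SchedulerEventType type;
        final String containerId;
        SchedulerEvent(SchedulerEventType type, String containerId) {
            this.type = type;
            this.containerId = containerId;
        }
    }

    // Stand-in for the scheduler: handling an event takes the contended lock.
    static final class Scheduler {
        private final List<String> handled = new ArrayList<>();
        synchronized void handle(SchedulerEvent e) {
            handled.add(e.type + ":" + e.containerId);
        }
        synchronized List<String> snapshot() {
            return new ArrayList<>(handled);
        }
    }

    // Enqueues two preemption events while the scheduler lock is held and
    // returns what the scheduler thread handled once the lock was released.
    static List<String> demo() {
        Scheduler scheduler = new Scheduler();
        BlockingQueue<SchedulerEvent> queue = new LinkedBlockingQueue<>();
        CountDownLatch done = new CountDownLatch(2);

        // Separate thread draining the scheduler event queue.
        Thread schedulerThread = new Thread(() -> {
            try {
                while (true) {
                    scheduler.handle(queue.take());
                    done.countDown();
                }
            } catch (InterruptedException stop) {
                // Shutdown requested; exit the loop.
            }
        });
        schedulerThread.setDaemon(true);
        schedulerThread.start();

        try {
            int handledWhileLocked;
            synchronized (scheduler) { // simulate contention on the scheduler lock
                // The "central dispatcher" (this thread) only enqueues.
                queue.put(new SchedulerEvent(SchedulerEventType.PREEMPT_CONTAINER, "container_1"));
                queue.put(new SchedulerEvent(SchedulerEventType.KILL_CONTAINER, "container_2"));
                // Both calls returned although no event could be handled yet.
                handledWhileLocked = scheduler.snapshot().size();
            }
            if (handledWhileLocked != 0) {
                throw new IllegalStateException("event handled while lock was held");
            }
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("scheduler thread did not drain the queue");
            }
            return scheduler.snapshot();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            schedulerThread.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a synchronous dispatcher, the two enqueue calls would instead be direct 
calls to {{handle()}} and would block for as long as the lock is contended.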


> Preemption processing occurring on the main RM dispatcher
> ---------------------------------------------------------
>                 Key: YARN-3508
>                 URL: https://issues.apache.org/jira/browse/YARN-3508
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-3508.002.patch, YARN-3508.01.patch, 
> YARN-3508.03.patch, YARN-3508.04.patch
> We recently saw the RM for a large cluster lag far behind on the 
> AsyncDispatcher event queue.  The AsyncDispatcher thread was consistently 
> blocked on the highly-contended CapacityScheduler lock trying to dispatch 
> preemption-related events for RMContainerPreemptEventDispatcher.  Preemption 
> processing should occur on the scheduler event dispatcher thread or a 
> separate thread to avoid delaying the processing of other events in the 
> primary dispatcher queue.
