Xianyin Xin commented on YARN-3652:

Thanks for the comments, [~sunilg].
1. *Throughput* : Are you referring to the #events processed over a period of 
time? If so, how can we set the time window over which throughput is calculated? 
A clear indicator from this would be that we can predict a possible end time 
for the pending events in the dispatcher queue. Adding throughput together with 
the number of pending events may give a much better indication of RM overload.
In fact, the first thing that comes to my mind is the #containers allocated by 
the scheduler per second, because container allocation is what users care about 
and the node update event is the most important scheduler event. The rate of 
processing events is also a nice indicator, just as you commented.
2. However, there are many events coming to the scheduler; if possible, a filter 
on event type may help give accurate figures for throughput and scheduling 
delay.
+1 for the idea. Besides, the #events processed by the scheduler per second is 
large, so indexes based on it are volatile. We may consider some method to 
smooth the fluctuation, such as sampling or statistics.
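
To make the smoothing idea concrete, here is a minimal sketch, assuming an 
exponentially weighted moving average over fixed sampling intervals. The class 
{{SmoothedRate}} and its methods are hypothetical names for illustration only; 
they are not existing YARN code, and an EWMA is just one possible smoothing 
method. It also shows how a smoothed throughput combined with the number of 
pending events could give a rough drain-time estimate for the queue.

```java
/**
 * Hypothetical helper (not part of YARN): smooths a volatile
 * events-per-second figure with an exponentially weighted moving average.
 */
final class SmoothedRate {
    private final double alpha;   // weight of the newest sample, in (0, 1]
    private double rate;          // smoothed events per second
    private boolean initialized;  // false until the first sample arrives

    SmoothedRate(double alpha) {
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Records the number of events handled during one sampling interval. */
    void update(long events, double intervalSeconds) {
        if (intervalSeconds <= 0.0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        double sample = events / intervalSeconds;
        if (!initialized) {
            // Seed with the first sample so the average does not start at 0.
            rate = sample;
            initialized = true;
        } else {
            rate = alpha * sample + (1.0 - alpha) * rate;
        }
    }

    /** Returns the smoothed throughput in events per second. */
    double eventsPerSecond() {
        return rate;
    }

    /** Rough time needed to drain the given backlog at the smoothed rate. */
    double estimatedDrainSeconds(long pendingEvents) {
        if (pendingEvents <= 0) {
            return 0.0;
        }
        return rate > 0.0 ? pendingEvents / rate : Double.POSITIVE_INFINITY;
    }
}
```

A smaller alpha gives a steadier but slower-reacting figure; a separate 
instance per event type would be one way to apply the filtering suggested in 
point 2.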

> A SchedulerMetrics may be need for evaluating the scheduler's performance
> -------------------------------------------------------------------------
>                 Key: YARN-3652
>                 URL: https://issues.apache.org/jira/browse/YARN-3652
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager, scheduler
>            Reporter: Xianyin Xin
> As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for evaluating 
> the scheduler's performance. The performance indexes include the #events 
> waiting to be handled by the scheduler, the throughput, the scheduling delay 
> and/or other indicators.

