[
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545607#comment-14545607
]
Xianyin Xin commented on YARN-3652:
-----------------------------------
Thanks for the comments, [~sunilg].
{quote}
1. *Throughput*: Are you referring to #events processed over a period of
time? If so, how do we set the time window over which throughput is calculated
(configurable?)?
A clear benefit is that we could predict a possible completion time for the
pending events in the dispatcher queue. Combining throughput with the number of
pending events may give a much better indication of RM overload.
{quote}
In fact, the first thing that came to my mind is the #containers allocated by
the scheduler per second, because container allocation is what users care about
and the node update event is the most important scheduler event. The rate of
processing events is also a nice indicator, just as you commented.
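A rough sketch of how a containers-per-second figure could be sampled, just to
make the idea concrete. The class {{AllocationThroughput}} and its methods are
hypothetical names for illustration, not existing YARN code, and the caller is
assumed to pass in a monotonic timestamp such as {{System.nanoTime()}}.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper, not part of YARN: containers allocated per second. */
final class AllocationThroughput {
  private final LongAdder allocated = new LongAdder(); // total allocations so far
  private long lastCount;                              // total at the previous sample
  private long lastNanos;                              // timestamp of the previous sample

  AllocationThroughput(long startNanos) {
    this.lastNanos = startNanos;
  }

  /** Called once for every container the scheduler allocates. */
  void containerAllocated() {
    allocated.increment();
  }

  /**
   * Returns containers per second since the previous sample and starts a new
   * window. If no time has elapsed, returns 0 and keeps the current window.
   */
  synchronized double sample(long nowNanos) {
    long elapsed = nowNanos - lastNanos;
    if (elapsed <= 0) {
      return 0.0;
    }
    long count = allocated.sum();
    double rate = (count - lastCount) * 1e9 / elapsed;
    lastCount = count;
    lastNanos = nowNanos;
    return rate;
  }
}
```

The length of the window is simply the interval between two {{sample}} calls,
so it could be made configurable by whatever drives the sampling.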
{quote}
2. However, many events come to the scheduler. If possible, a filter based on
event type may help give more accurate figures for throughput and scheduling
delay.
{quote}
+1 for the idea. Besides, the #events processed by the scheduler per second is
large, so indexes based on it are volatile. We may consider some method to
smooth the fluctuation, such as sampling or statistics.
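One possible way to do the smoothing is an exponentially weighted moving
average over periodic samples. This is only a sketch under that assumption; the
class {{SmoothedRate}} and the parameter {{alpha}} are hypothetical and not
existing YARN code.

```java
/**
 * Hypothetical helper, not part of YARN: smooths a noisy per-second rate
 * with an exponentially weighted moving average (EWMA).
 */
final class SmoothedRate {
  private final double alpha;   // weight of the newest sample, in (0, 1]
  private double value;         // current smoothed rate
  private boolean initialized;  // false until the first sample arrives

  SmoothedRate(double alpha) {
    if (!(alpha > 0.0 && alpha <= 1.0)) {
      throw new IllegalArgumentException("alpha must be in (0, 1]");
    }
    this.alpha = alpha;
  }

  /** Feed the rate observed during the last sampling interval. */
  synchronized void update(double eventsPerSecond) {
    if (!initialized) {
      // The first sample seeds the average directly.
      value = eventsPerSecond;
      initialized = true;
    } else {
      value = alpha * eventsPerSecond + (1.0 - alpha) * value;
    }
  }

  /** Returns the smoothed rate, or 0 before any sample has been fed. */
  synchronized double get() {
    return value;
  }
}
```

A smaller {{alpha}} gives a steadier but slower-reacting index; a larger one
follows bursts more closely.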
> A SchedulerMetrics may be need for evaluating the scheduler's performance
> -------------------------------------------------------------------------
>
> Key: YARN-3652
> URL: https://issues.apache.org/jira/browse/YARN-3652
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager, scheduler
> Reporter: Xianyin Xin
>
> As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for evaluating
> the scheduler's performance. The performance indexes include #events waiting
> to be handled by the scheduler, the throughput, the scheduling delay and/or
> other indicators.