[ 
https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299302#comment-14299302
 ] 

Wangda Tan commented on YARN-2868:
----------------------------------

bq. Our scenario is debugging queue related issues for which we need queue 
related metrics because scheduling decisions are made based on the queue. What 
would be a good place to add metrics for all those queue related metrics?
It makes sense to me since it's use-case driven.
However, I wonder whether the "first container allocation delay" is calculated 
correctly in this patch. Consider a queue with some pending applications that 
gets no resources allocated from the RM (perhaps because of a cluster issue). 
In this case, the "first container allocation delay" will be 0. I think we 
should also count the time an app has spent waiting for the RM to allocate its 
first container. Then, even if no container is allocated in a queue, the 
"first container allocation delay" will keep increasing, which can help when 
troubleshooting cluster issues.
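
To illustrate the suggestion, here is a minimal sketch in plain Java of a 
per-queue tracker whose reported delay keeps growing while apps are still 
waiting. It is not taken from the patch: the class name 
FirstAllocationDelayTracker, its methods, and the choice of a simple average 
are all assumptions made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of the YARN-2868 patch or of any YARN API.
class FirstAllocationDelayTracker {
    // Start time (ms) of each app still waiting for its first container.
    private final Map<String, Long> waitingSince = new HashMap<>();
    // Total delay and count for apps that already received a first container.
    private long completedDelaySumMs = 0;
    private long completedCount = 0;

    // Called when an app starts waiting for its first container.
    void appSubmitted(String appId, long nowMs) {
        waitingSince.put(appId, nowMs);
    }

    // Called when the app's first container is allocated.
    void firstContainerAllocated(String appId, long nowMs) {
        Long start = waitingSince.remove(appId);
        if (start != null) {
            completedDelaySumMs += nowMs - start;
            completedCount++;
        }
    }

    // Average delay over finished AND still-waiting apps, so the value
    // keeps increasing when the queue gets no allocation at all.
    long averageDelayMs(long nowMs) {
        long sum = completedDelaySumMs;
        long count = completedCount;
        for (long start : waitingSince.values()) {
            sum += nowMs - start;
            count++;
        }
        return count == 0 ? 0 : sum / count;
    }
}
```

With this approach, a queue whose only app was submitted at t=1000 ms and has 
received nothing by t=4000 ms reports 3000 ms rather than 0.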

Does this make sense? [~jianhe].

> Add metric for initial container launch time to FairScheduler
> -------------------------------------------------------------
>
>                 Key: YARN-2868
>                 URL: https://issues.apache.org/jira/browse/YARN-2868
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ray Chiang
>            Assignee: Anubhav Dhoot
>              Labels: metrics, supportability
>         Attachments: YARN-2868-01.patch, YARN-2868.002.patch, 
> YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, 
> YARN-2868.006.patch, YARN-2868.007.patch
>
>
> Add a metric to measure the latency between "starting container allocation" 
> and "first container actually allocated".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
