[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520480#comment-14520480 ]
Ray Chiang commented on YARN-2868:
----------------------------------
I'll answer these in reverse order:
2) The first AM container is the "easy" one to measure. Subsequent
measurements can be tricky, since the "request" time would need to be recorded
somewhere until the request is actually fulfilled. Tracking all the requests
and their corresponding fulfillments would be a lot more work and may call for
more sophisticated measurements. I haven't filed a JIRA for measuring the later
containers.
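As a rough illustration of the bookkeeping described above (hold the request
time until the request is fulfilled, then report the difference), a minimal
sketch might look like the following. The class and method names are
hypothetical and are not taken from the YARN-2868 patches or from any YARN
API; the sketch is single-threaded and only covers the first allocation per
application.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of YARN. */
public class FirstAllocationLatencyTracker {
    // Request time per application, held only until the first container arrives.
    private final Map<String, Long> pendingRequestTimeMs = new HashMap<>();
    // Applications whose first allocation has already been measured.
    private final Set<String> measured = new HashSet<>();

    /** Records the first request time; repeated or later requests are ignored. */
    public void onAllocationRequested(String appId, long nowMs) {
        if (!measured.contains(appId)) {
            pendingRequestTimeMs.putIfAbsent(appId, nowMs);
        }
    }

    /**
     * Returns the request-to-allocation latency in milliseconds for the first
     * container of an application, or empty if nothing is pending for it.
     */
    public OptionalLong onContainerAllocated(String appId, long nowMs) {
        Long requestedAt = pendingRequestTimeMs.remove(appId);
        if (requestedAt == null) {
            return OptionalLong.empty();
        }
        measured.add(appId);
        return OptionalLong.of(nowMs - requestedAt);
    }
}
```

Measuring later containers would mean keeping one such entry per outstanding
request rather than one per application, which is the extra tracking work
mentioned above.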
1) Breaking this answer into several parts. I don't remember all the
iterations I went through, but I'll answer as best I can.
1A) YARN-3105 covers the enhancements to StateMachine to record state
transitions generically for metrics. [~jianhe] made the original suggestion.
1B) There were several factors for this. I think it was a combination of
wanting queue-specific metrics, wanting to separate first allocation from later
allocations, working with managed and unmanaged AMs, and a desire to get a more
exact measurement with less overhead. I've deleted all my earliest attempts at
this (i.e. those prior to the first patch on this JIRA), so I can't provide
more specific information offhand.
Let me know if that satisfactorily answers your questions.
> FairScheduler: Metric for latency to allocate first container for an
> application
> --------------------------------------------------------------------------------
>
> Key: YARN-2868
> URL: https://issues.apache.org/jira/browse/YARN-2868
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Ray Chiang
> Assignee: Ray Chiang
> Labels: metrics, supportability
> Fix For: 2.8.0
>
> Attachments: YARN-2868-01.patch, YARN-2868.002.patch,
> YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch,
> YARN-2868.006.patch, YARN-2868.007.patch, YARN-2868.008.patch,
> YARN-2868.009.patch, YARN-2868.010.patch, YARN-2868.011.patch,
> YARN-2868.012.patch
>
>
> Add a metric to measure the latency between "starting container allocation"
> and "first container actually allocated".