[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214880#comment-15214880 ]
Sangjin Lee commented on YARN-4821:
-----------------------------------
I thought about what you're suggesting too. However, I'm not sure if it would
be the most useful approach. First, the timeline publishing interval would be a
function of the resource monitoring interval, so you'd need to look up two
values to figure out how often you publish the timeline data. More importantly,
if one needs to modify the resource monitoring interval, he/she must be aware
of the implication it would have on timeline publishing; otherwise it's easy to
miss that connection and make a mistake.
The main concern here is to control the amount of data you write, as I suspect
this might be one of the more copious volumes of data we write.
How about a simple time-based publishing? Let's say the resource monitoring
interval is 3 seconds, and the timeline publishing interval is 10 seconds. Then
we could keep track of the last publishing time and use that to ensure we don't
publish more often than once every 10 seconds. The following might be one example.
|| time (s) || resource monitoring || timeline publishing ||
| 0 | yes | yes |
| 3 | yes | |
| 6 | yes | |
| 9 | yes | |
| 12 | yes | yes |
| 15 | yes | |
| 18 | yes | |
| 21 | yes | |
| 24 | yes | yes |
| 27 | yes | |
| 30 | yes | |
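To make the table concrete, here is a minimal sketch of the time-based check. The class {{TimeBasedPublishGate}} and its methods are hypothetical names used only for illustration; they are not existing YARN classes, and this is not a proposal for how {{NMTimelinePublisher}} should be structured. It assumes timestamps in milliseconds and that the first monitoring tick always publishes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not an existing YARN class. */
final class TimeBasedPublishGate {
  private final long publishIntervalMs;
  private long lastPublishMs;
  private boolean publishedOnce = false;

  TimeBasedPublishGate(long publishIntervalMs) {
    this.publishIntervalMs = publishIntervalMs;
  }

  /** Call on every resource monitoring tick; true means publish now. */
  boolean shouldPublish(long nowMs) {
    // Publish on the first tick, then only once the full interval has
    // elapsed since the last publish.
    if (!publishedOnce || nowMs - lastPublishMs >= publishIntervalMs) {
      publishedOnce = true;
      lastPublishMs = nowMs;
      return true;
    }
    return false;
  }

  /** Replays monitoring ticks from 0 to endMs; returns publish times in seconds. */
  static List<Long> simulate(long monitorIntervalMs, long publishIntervalMs,
      long endMs) {
    TimeBasedPublishGate gate = new TimeBasedPublishGate(publishIntervalMs);
    List<Long> publishTimesSec = new ArrayList<>();
    for (long now = 0; now <= endMs; now += monitorIntervalMs) {
      if (gate.shouldPublish(now)) {
        publishTimesSec.add(now / 1000);
      }
    }
    return publishTimesSec;
  }

  public static void main(String[] args) {
    // 3-second monitoring, 10-second publishing, 30 seconds of ticks.
    System.out.println(simulate(3000, 10000, 30000)); // prints [0, 12, 24]
  }
}
```

Note that the effective publishing period here is 12 seconds rather than 10, because publishing can only happen on a monitoring tick.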
This is one idea, but we could have increasingly sophisticated ideas. For
example, we could also remember the regular interval boundaries, and write one
data point even if it's less than 10 seconds from the previous write so that we
have fairly regular writes happening (in the above example, it would be at 0
seconds, 12 seconds, 21 seconds, and 30 seconds).
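A rough sketch of that interval-aligned variant follows, under the assumption that the boundaries are counted from a fixed start time and that we publish on the first monitoring tick that falls in each 10-second slot. Again, {{SlotAlignedPublishGate}} is a hypothetical name for illustration, not an existing class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not an existing YARN class. */
final class SlotAlignedPublishGate {
  private final long startMs;
  private final long publishIntervalMs;
  private long lastSlot = -1; // no slot published yet

  SlotAlignedPublishGate(long startMs, long publishIntervalMs) {
    this.startMs = startMs;
    this.publishIntervalMs = publishIntervalMs;
  }

  /** Call on every monitoring tick (nowMs >= startMs); true means publish now. */
  boolean shouldPublish(long nowMs) {
    // Index of the publishing slot this tick falls into: 0 for [0s, 10s),
    // 1 for [10s, 20s), and so on.
    long slot = (nowMs - startMs) / publishIntervalMs;
    if (slot > lastSlot) {
      lastSlot = slot; // first tick seen in a new slot
      return true;
    }
    return false;
  }

  /** Replays monitoring ticks from 0 to endMs; returns publish times in seconds. */
  static List<Long> simulate(long monitorIntervalMs, long publishIntervalMs,
      long endMs) {
    SlotAlignedPublishGate gate =
        new SlotAlignedPublishGate(0, publishIntervalMs);
    List<Long> publishTimesSec = new ArrayList<>();
    for (long now = 0; now <= endMs; now += monitorIntervalMs) {
      if (gate.shouldPublish(now)) {
        publishTimesSec.add(now / 1000);
      }
    }
    return publishTimesSec;
  }

  public static void main(String[] args) {
    System.out.println(simulate(3000, 10000, 30000)); // prints [0, 12, 21, 30]
  }
}
```

Compared with tracking only the last publishing time, this keeps the long-run rate at one write per 10-second slot, at the cost of uneven gaps between consecutive writes (12, 9, and 9 seconds here).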
We could also consider different intervals for CPU and memory, although one
could argue that the YARN resource monitoring does not do that so we probably
don't need to differentiate them. That's just my 2 cents.
> have a separate NM timeline publishing interval
> -----------------------------------------------
>
> Key: YARN-4821
> URL: https://issues.apache.org/jira/browse/YARN-4821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Naganarasimha G R
> Labels: yarn-2928-1st-milestone
>
> Currently the interval with which NM publishes container CPU and memory
> metrics is tied to {{yarn.nodemanager.resource-monitor.interval-ms}} whose
> default is 3 seconds. This is too aggressive.
> There should be a separate configuration that controls how often
> {{NMTimelinePublisher}} publishes container metrics.