[ 
https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214880#comment-15214880
 ] 

Sangjin Lee commented on YARN-4821:
-----------------------------------

I thought about what you're suggesting too. However, I'm not sure it would 
be the most useful approach. First, the timeline publishing interval would be a 
function of the resource monitoring interval, so you'd need to look up two 
values to figure out how often you publish the timeline data. More importantly, 
anyone who modifies the resource monitoring interval would need to be aware 
of the effect on timeline publishing; it's easy to miss that connection and 
make a mistake.

The main concern here is controlling the amount of data we write, as I suspect 
this might be one of the more copious volumes of data we write.

How about simple time-based publishing? Let's say the resource monitoring 
interval is 3 seconds, and the timeline publishing interval is 10 seconds. Then 
we could keep track of the last publishing time and use that to ensure we don't 
publish more often than once every 10 seconds. The following might be one 
example.

|| time (s) || resource monitoring || timeline publishing ||
| 0 | yes | yes |
| 3 | yes | |
| 6 | yes | |
| 9 | yes | |
| 12 | yes | yes |
| 15 | yes | |
| 18 | yes | |
| 21 | yes | |
| 24 | yes | yes |
| 27 | yes | |
| 30 | yes | |
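
To make the table concrete, here is a minimal sketch of the time-based check. The class and method names ({{PublishThrottle}}, {{shouldPublish}}, {{simulate}}) are hypothetical and are not part of {{NMTimelinePublisher}}; the sketch only assumes that the monitoring loop can ask "is a publish due now?" on each tick.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not existing YARN code. */
final class PublishThrottle {
  private final long publishIntervalMs;
  private long lastPublishMs = -1; // -1 means nothing has been published yet

  PublishThrottle(long publishIntervalMs) {
    this.publishIntervalMs = publishIntervalMs;
  }

  /** Returns true, and records nowMs, if at least one interval has elapsed. */
  boolean shouldPublish(long nowMs) {
    if (lastPublishMs < 0 || nowMs - lastPublishMs >= publishIntervalMs) {
      lastPublishMs = nowMs;
      return true;
    }
    return false;
  }

  /** Walks the monitoring ticks from 0 to endMs and collects the publish times. */
  static List<Long> simulate(long monitorMs, long publishMs, long endMs) {
    PublishThrottle throttle = new PublishThrottle(publishMs);
    List<Long> published = new ArrayList<>();
    for (long now = 0; now <= endMs; now += monitorMs) {
      if (throttle.shouldPublish(now)) {
        published.add(now);
      }
    }
    return published;
  }
}
```

With a 3000 ms monitoring interval and a 10000 ms publishing interval, {{PublishThrottle.simulate(3000, 10000, 30000)}} yields publishes at 0, 12000, and 24000 ms, matching the table.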

This is one idea, but we could come up with more sophisticated ones. For 
example, we could also remember the regular interval boundaries, and write one 
data point even if it's less than 10 seconds since the previous write so that 
we have fairly regular writes happening (in the above example, it would be at 
0 seconds, 12 seconds, 21 seconds, and 30 seconds).
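
A rough sketch of that regular-interval variant, again with hypothetical names ({{SlotPublishThrottle}} is not existing YARN code): divide time into fixed 10-second slots and publish on the first monitoring tick that lands in a slot we have not published in yet. This is one possible reading of "remember the regular intervals", not a settled design.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not existing YARN code. */
final class SlotPublishThrottle {
  private final long publishIntervalMs;
  private long lastSlot = -1; // index of the last slot we published in

  SlotPublishThrottle(long publishIntervalMs) {
    this.publishIntervalMs = publishIntervalMs;
  }

  /** Returns true on the first tick that falls into a new interval slot. */
  boolean shouldPublish(long nowMs) {
    long slot = nowMs / publishIntervalMs; // slot 0 = [0, interval), and so on
    if (slot > lastSlot) {
      lastSlot = slot;
      return true;
    }
    return false;
  }

  /** Walks the monitoring ticks from 0 to endMs and collects the publish times. */
  static List<Long> simulate(long monitorMs, long publishMs, long endMs) {
    SlotPublishThrottle throttle = new SlotPublishThrottle(publishMs);
    List<Long> published = new ArrayList<>();
    for (long now = 0; now <= endMs; now += monitorMs) {
      if (throttle.shouldPublish(now)) {
        published.add(now);
      }
    }
    return published;
  }
}
```

With the same 3000 ms and 10000 ms intervals, {{SlotPublishThrottle.simulate(3000, 10000, 30000)}} yields publishes at 0, 12000, 21000, and 30000 ms. Writes stay close to the 10-second boundaries instead of drifting, at the cost of an occasional gap shorter than 10 seconds.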

We could also consider different intervals for CPU and memory, although one 
could argue that the YARN resource monitoring does not do that so we probably 
don't need to differentiate them. That's just my 2 cents.

> have a separate NM timeline publishing interval
> -----------------------------------------------
>
>                 Key: YARN-4821
>                 URL: https://issues.apache.org/jira/browse/YARN-4821
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>
> Currently the interval with which NM publishes container CPU and memory 
> metrics is tied to {{yarn.nodemanager.resource-monitor.interval-ms}} whose 
> default is 3 seconds. This is too aggressive.
> There should be a separate configuration that controls how often 
> {{NMTimelinePublisher}} publishes container metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
