[
https://issues.apache.org/jira/browse/YARN-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588799#comment-15588799
]
Varun Saxena commented on YARN-5751:
------------------------------------
Thanks [~rohithsharma] for sharing your views.
I do understand that it is not very clear what each metric value entails. For
instance, I had to look back into the code to find out whether MEMORY reported
from the NM for each container is in bytes, KB or MB when I first looked at the
REST output from the timeline service.
Let us assume that we add UNIT to TimelineMetric to indicate KB/MB, etc.
The question is how do we store it then? Currently the metric name is stored as
a column qualifier and the metric value as the column value along with
timestamps, for which we utilize HBase cell timestamps. So the question is where
do we store this extra information?
This could be stored as a suffix to the metric name, but that would impact
metric filters. Or we could add another column, whose qualifier is the metric
name prefixed with a marker indicating UNIT (say, something like u!MEMORY), to
store the metric unit, and then either read it back at all times or create the
necessary column filters when the metrics to retrieve are specified. I would
choose the latter if I had to pick one option.
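To make the extra-column option concrete, here is a minimal sketch using plain
Java collections as a stand-in for one HBase row. The class and method names
are hypothetical and this is not the ATSv2 or HBase client API; only the
"u!" prefix comes from the suggestion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one row; not the real ATSv2/HBase API.
final class MetricRowSketch {
    // Prefix marking a unit column, following the "u!MEMORY" idea.
    static final String UNIT_PREFIX = "u!";

    // column qualifier (metric name) -> (cell timestamp -> metric value)
    private final Map<String, TreeMap<Long, Long>> metricCells = new TreeMap<>();
    // unit column qualifier (e.g. "u!MEMORY") -> unit string
    private final Map<String, String> unitCells = new TreeMap<>();

    static String unitQualifier(String metricName) {
        return UNIT_PREFIX + metricName;
    }

    // Writes one timestamped value and, if given, the companion unit column.
    void putMetric(String name, long timestamp, long value, String unit) {
        metricCells.computeIfAbsent(name, k -> new TreeMap<>()).put(timestamp, value);
        if (unit != null) {
            unitCells.put(unitQualifier(name), unit);
        }
    }

    // Qualifiers a reader would have to request when only some metrics are
    // wanted: each metric column plus its companion unit column.
    static List<String> qualifiersToFetch(List<String> metricNames) {
        List<String> qualifiers = new ArrayList<>();
        for (String name : metricNames) {
            qualifiers.add(name);
            qualifiers.add(unitQualifier(name));
        }
        return qualifiers;
    }

    // Returns the stored unit, or null if none was published.
    String unitOf(String name) {
        return unitCells.get(unitQualifier(name));
    }

    // Returns the value with the highest timestamp, or null if absent.
    Long latestValue(String name) {
        TreeMap<Long, Long> cells = metricCells.get(name);
        return cells == null ? null : cells.lastEntry().getValue();
    }

    public static void main(String[] args) {
        MetricRowSketch row = new MetricRowSketch();
        row.putMetric("MEMORY", 1000L, 2048L, "MB");
        row.putMetric("MEMORY", 2000L, 4096L, "MB");
        System.out.println(row.latestValue("MEMORY") + " " + row.unitOf("MEMORY"));
        System.out.println(qualifiersToFetch(Arrays.asList("MEMORY")));
    }
}
```

The metric column itself is untouched here, so existing metric filters on the
name would keep working; the cost is one more qualifier to write and to
include in every column filter.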
But the question is, can't the metric name itself indicate what the unit of the
metric is? For instance, most of the MapReduce counter names indicate the unit
too. We could publish MEMORY as MEMORY_BYTES instead.
Or is it even required? Typically the systems publishing to us would know the
unit of the metric they are writing, and hence would know what they are reading
back. Apart from admins, it is unlikely that anybody is going to use the REST
URLs directly. These endpoints will typically be used in a system which has
another front-end to serve this data. We can probably make metric names
published from YARN or MapReduce more understandable (i.e. suffixed with units)
in case somebody has to interpret the REST output directly. Thoughts?
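As a rough illustration of the name-suffix alternative, the sketch below builds
a name such as MEMORY_BYTES and recovers the unit from it on the reading side.
The helper names and the list of recognised suffixes are assumptions made up
for this example; no such convention exists in the code today.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper; the suffix list is an assumption, not an agreed convention.
final class MetricNameUnitSketch {
    private static final List<String> KNOWN_UNIT_SUFFIXES =
        Arrays.asList("BYTES", "KB", "MB", "MILLIS");

    // Publisher side: MEMORY + BYTES -> MEMORY_BYTES.
    static String withUnit(String baseName, String unitSuffix) {
        return baseName + "_" + unitSuffix;
    }

    // Reader side: returns the unit if the name ends in a recognised suffix.
    static Optional<String> unitFromName(String metricName) {
        int idx = metricName.lastIndexOf('_');
        if (idx < 0 || idx == metricName.length() - 1) {
            return Optional.empty();
        }
        String suffix = metricName.substring(idx + 1);
        return KNOWN_UNIT_SUFFIXES.contains(suffix)
            ? Optional.of(suffix)
            : Optional.empty();
    }

    public static void main(String[] args) {
        String name = withUnit("MEMORY", "BYTES");
        System.out.println(name + " -> " + unitFromName(name).orElse("(no unit)"));
        System.out.println("CPU -> " + unitFromName("CPU").orElse("(no unit)"));
    }
}
```

Nothing changes in storage with this approach, since the unit simply travels
inside the column qualifier, but it relies on every publisher following the
same naming rule.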
You may say that this argument is based on HBase storage, but that is our
primary storage implementation for now. So what to store and what not to store
may depend on a combination of necessity and feasibility.
I am not completely sure the need to store the unit is strong enough to warrant
another column qualifier in the HBase implementation. We can probably adopt the
approach mentioned above if we have to store it. Do you have any other idea
regarding how to store it?
Is the concern that one code path may change (say, the publishing side) and the
other may not (say, UI rendering) if we do not make the unit part of our model?
Let us see what others think though.
cc [~sjlee0], [~gtCarrera9]
> Support UNIT for TimelineMetric
> -------------------------------
>
> Key: YARN-5751
> URL: https://issues.apache.org/jira/browse/YARN-5751
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Priority: Critical
>
> ATSv2 allows users to write their metrics using TimelineMetric. But there is
> no field to tell what the UNIT of a published metric is. This makes metrics
> very difficult to interpret when they are read.
> I propose to add UNIT to TimelineMetric so that users can use this field
> to tell what the unit of a published metric is. Maybe this can be optional
> for a few kinds of metrics where a unit is not required, say CPU. But
> definitely there should be a way to set units while publishing the entities.