[
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025219#comment-15025219
]
Sangjin Lee commented on YARN-3862:
-----------------------------------
I agree that currently the config or metric ids are used directly as the column
names and what is being done in the patch is probably correct, and we would get
the same result if we went with *ColumnPrefix.getColumnPrefixBytes().
I think one of the reasons we still want to leverage *ColumnPrefix is that it
insulates us against future changes. If we went
with the approach that the patch proposes and the column name format for config
or metric should change later, we would need to remember to visit
TimelineFilterUtils and modify this method accordingly. That would be rather
brittle.
Another interesting reason is consistency. Currently when configs and metrics
are written, they go through ColumnHelper.getColumnQualifier() to create the
column name bytes. ColumnHelper properly encodes them if, for example, they
contain spaces. It would be consistent to treat them the same way for the read path. I
don't know that we allow spaces in config or metric names (I don't think we
discussed that possibility), but at least that way we'd be consistent.
My proposal for doing this was using the byte array returned by
{code}
EntityColumnPrefix.CONFIG.getColumnPrefixBytes(prefix_from_the_filter)
{code}
as the argument to the BinaryPrefixComparator constructor. We'd need to work
out how the column prefix can be passed into TimelineFilterUtils. Hope this
helps.
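As a rough illustration of the proposal above, the sketch below routes both the stored column qualifier and the read-side filter prefix through one method, so a later format change only has to be made in one place. Everything here is a hypothetical stand-in: SketchColumnPrefix, PrefixFilterSketch, the "c"/"m" prefixes, the "!" separator and the "%20" space encoding are assumptions for illustration, not the actual YARN or HBase classes or formats. matchesPrefix() only mimics the kind of byte-prefix comparison a BinaryPrefixComparator would perform.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a *ColumnPrefix enum; not the real YARN class.
enum SketchColumnPrefix {
  CONFIG("c"),
  METRIC("m");

  // Assumed separator between the column prefix and the qualifier.
  private static final String SEPARATOR = "!";
  private final String prefix;

  SketchColumnPrefix(String prefix) {
    this.prefix = prefix;
  }

  // The single place that defines the qualifier layout. Both the write path
  // and the read-side filter call this, so they cannot drift apart.
  byte[] getColumnPrefixBytes(String qualifierPrefix) {
    // Assumed encoding of spaces; the real encoding may differ.
    String encoded = qualifierPrefix.replace(" ", "%20");
    return (prefix + SEPARATOR + encoded).getBytes(StandardCharsets.UTF_8);
  }
}

final class PrefixFilterSketch {
  // Returns true if the qualifier bytes start with the given prefix bytes.
  static boolean matchesPrefix(byte[] qualifier, byte[] prefix) {
    if (qualifier.length < prefix.length) {
      return false;
    }
    return Arrays.equals(Arrays.copyOf(qualifier, prefix.length), prefix);
  }

  public static void main(String[] args) {
    // Write path: full config name stored as the column qualifier.
    byte[] stored = SketchColumnPrefix.CONFIG
        .getColumnPrefixBytes("yarn.nodemanager.resource.memory-mb");
    // Read path: prefix supplied by the user in the filter.
    byte[] configFilter = SketchColumnPrefix.CONFIG
        .getColumnPrefixBytes("yarn.nodemanager.");
    byte[] metricFilter = SketchColumnPrefix.METRIC
        .getColumnPrefixBytes("yarn.nodemanager.");

    System.out.println(matchesPrefix(stored, configFilter)); // prints true
    System.out.println(matchesPrefix(stored, metricFilter)); // prints false
  }
}
```

Because the filter bytes come from the same method as the stored bytes, a name containing spaces is encoded identically on both paths, which is the consistency point made earlier.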
While we're at it, can we also refactor the calls to
ColumnHelper.getColumnQualifier() in ApplicationColumnPrefix.store(),
EntityColumnPrefix.store(), etc. to use getColumnPrefixBytes()?
bq. So prefixes in createHBaseColQualPrefixFilter() can be anything and cannot
be fetched via a call to ColumnPrefix.getColumnPrefixBytes().
I'm not quite sure under what scenario
ColumnPrefix.getColumnPrefixBytes(prefix_passed_by_users) would not work for
this purpose. Could you kindly elaborate?
bq. Maybe confs and metrics can be renamed as configsToRetrieve and
metricsToRetrieve respectively. Thoughts ?
Those sound better.
{quote}
Current code is not hooked up to the REST layer, so it won't work end to end.
However, the current patch has already become quite big. So we can handle REST
related changes in another JIRA. I am fine with that.
{quote}
+1. We can put that in another JIRA.
> Decide which contents to retrieve and send back in response in TimelineReader
> -----------------------------------------------------------------------------
>
> Key: YARN-3862
> URL: https://issues.apache.org/jira/browse/YARN-3862
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-3862-YARN-2928.wip.01.patch,
> YARN-3862-YARN-2928.wip.02.patch, YARN-3862-YARN-2928.wip.03.patch,
> YARN-3862-feature-YARN-2928.wip.03.patch
>
>
> Currently, we will retrieve all the contents of the field if that field is
> specified in the query API. In case of configs and metrics, this can become a
> lot of data even though the user doesn't need it. So we need to provide a way
> to query only a set of configs or metrics.
> As a comma-separated list of configs/metrics to be returned will be quite
> cumbersome to specify, we have to support either of the following options :
> # Prefix match
> # Regex
> # Group the configs/metrics and query that group.
> We also need a facility to specify a metric time window to return metrics in
> that window. This may be useful in plotting graphs.