Sangjin Lee commented on YARN-3862:

I agree that currently the config or metric ids are used directly as the column 
names and what is being done in the patch is probably correct, and we would get 
the same result if we went with *ColumnPrefix.getColumnPrefixBytes().

I think one of the reasons that we still want to leverage *ColumnPrefix is 
because that way we're basically insulated against future changes. If we went 
with the approach that the patch proposes and the column name format for config 
or metric should change later, we would need to remember to visit 
TimelineFilterUtils and modify this method accordingly. That would be rather 

Another interesting reason is consistency. Currently when configs and metrics 
are written, they go through ColumnHelper.getColumnQualifier() to create the 
column name bytes. ColumnHelper properly encodes them if there are spaces for 
example. It would be consistent to treat them the same way for the read path. I 
don't know that we allow spaces in config or metric names (I don't think we 
discussed that possibility), but at least that way we'd be consistent.
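
To make the two points above concrete, here is a minimal, self-contained sketch. It is not the actual timeline service code: SketchColumnPrefix, the "!" separator and the "%20" space encoding are hypothetical stand-ins chosen only for illustration. The idea is that the write path and the read path both go through one method that knows the column name format, so a filter prefix built for reads still matches what was stored.

```java
import java.nio.charset.StandardCharsets;

public class ColumnPrefixSketch {

  /** Hypothetical stand-in for ApplicationColumnPrefix / EntityColumnPrefix. */
  enum SketchColumnPrefix {
    CONFIG(null),  // assumption: no prefix, the id itself is the column name
    METRIC(null),  // assumption: same as CONFIG
    INFO("i");     // assumption: a prefixed column, shown for contrast

    private static final String SEPARATOR = "!";        // assumed separator
    private static final String ENCODED_SPACE = "%20";  // assumed space encoding

    private final String prefix;

    SketchColumnPrefix(String prefix) {
      this.prefix = prefix;
    }

    /** The single place that knows how a column name is laid out and encoded. */
    byte[] getColumnPrefixBytes(String qualifierPrefix) {
      String encoded = qualifierPrefix.replace(" ", ENCODED_SPACE);
      String name = (prefix == null) ? encoded : prefix + SEPARATOR + encoded;
      return name.getBytes(StandardCharsets.UTF_8);
    }
  }

  /** Write path: full column name bytes for a config or metric id. */
  static byte[] columnNameForWrite(SketchColumnPrefix p, String id) {
    return p.getColumnPrefixBytes(id);
  }

  /** Read path: bytes that would be handed to a prefix comparator. */
  static byte[] prefixForRead(SketchColumnPrefix p, String userPrefix) {
    return p.getColumnPrefixBytes(userPrefix);
  }

  /** Plain byte-prefix check, standing in for a prefix comparator match. */
  static boolean startsWith(byte[] value, byte[] prefix) {
    if (prefix.length > value.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (value[i] != prefix[i]) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] stored = columnNameForWrite(SketchColumnPrefix.CONFIG, "yarn.app name");

    // Read prefix built through the same helper: matches the stored column.
    byte[] viaHelper = prefixForRead(SketchColumnPrefix.CONFIG, "yarn.app ");
    System.out.println(startsWith(stored, viaHelper)); // prints "true"

    // Read prefix built from the raw user string: the space is not encoded,
    // so it no longer matches what the write path stored.
    byte[] raw = "yarn.app ".getBytes(StandardCharsets.UTF_8);
    System.out.println(startsWith(stored, raw)); // prints "false"
  }
}
```

In this sketch, changing the separator or the space encoding only touches getColumnPrefixBytes(), and both paths pick up the change together.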

My proposal for doing this was using the byte array returned by


to use as argument to the BinaryPrefixComparator constructor. We'd need to work 
out how the column prefix can be passed into TimelineFilterUtils. Hope this 

While we're at it, can we also refactor the calls to 
ColumnHelper.getColumnQualifier() in ApplicationColumnPrefix.store(), 
EntityColumnPrefix.store(), etc. to use getColumnPrefixBytes()?

bq. So prefixes in createHBaseColQualPrefixFilter() can be anything and cannot 
be fetched via a call to ColumnPrefix.getColumnPrefixBytes().

I'm not quite sure under what scenario 
ColumnPrefix.getColumnPrefixBytes(prefix_passed_by_users) would not work for 
this purpose. Could you kindly elaborate?

bq. Maybe confs and metrics can be renamed as configsToRetrieve and 
metricsToRetrieve respectively. Thoughts ?

Those sound better.

Current code is not hooked up to the REST layer, so it won't work end to end. 
However, the current patch has already become quite big. So we can handle REST 
related changes in another JIRA. I am fine with that.

+1. We can put that in another JIRA.

> Decide which contents to retrieve and send back in response in TimelineReader
> -----------------------------------------------------------------------------
>                 Key: YARN-3862
>                 URL: https://issues.apache.org/jira/browse/YARN-3862
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3862-YARN-2928.wip.01.patch, 
> YARN-3862-YARN-2928.wip.02.patch, YARN-3862-YARN-2928.wip.03.patch, 
> YARN-3862-feature-YARN-2928.wip.03.patch
> Currently, we will retrieve all the contents of the field if that field is 
> specified in the query API. In case of configs and metrics, this can become a 
> lot of data even though the user doesn't need it. So we need to provide a way 
> to query only a set of configs or metrics.
> As a comma separated list of configs/metrics to be returned will be quite 
> cumbersome to specify, we have to support either of the following options :
> # Prefix match
> # Regex
> # Group the configs/metrics and query that group.
> We also need a facility to specify a metric time window to return metrics in 
> that window. This may be useful in plotting graphs.

