Joep Rottinghuis commented on YARN-4074:

In TimelineEntityReader#readMetrics it seems safe to assume that if we have 
more than one value, the metric is a TimelineMetric.Type.TIME_SERIES.
The converse doesn't have to be true though, right? I guess we'll just assume 
that for timelines we'd never have just one value? I can't quite foresee the 
impact of incorrectly assuming TimelineMetric.Type.SINGLE_VALUE if only one 
value has been written to HBase yet.
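
To make the concern concrete, here is a minimal sketch of the inference as I 
read it. The class, enum, and method names below are hypothetical stand-ins 
for illustration, not the actual TimelineEntityReader code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the type inference discussed above;
// not the real TimelineEntityReader#readMetrics implementation.
class MetricTypeSketch {

  // Mirrors the two metric types named in the comment.
  enum Type { SINGLE_VALUE, TIME_SERIES }

  // Assumed rule: more than one stored value means a time series,
  // anything else is treated as a single value.
  static Type inferType(Map<Long, Number> valuesByTimestamp) {
    return valuesByTimestamp.size() > 1 ? Type.TIME_SERIES : Type.SINGLE_VALUE;
  }

  public static void main(String[] args) {
    Map<Long, Number> series = new TreeMap<>();
    series.put(1442351767756L, 10);
    // Only one point written so far: the series is classified as SINGLE_VALUE.
    System.out.println(inferType(series));
    series.put(1442351767999L, 12);
    // A second point flips the classification to TIME_SERIES.
    System.out.println(inferType(series));
  }
}
```

Under this rule a time series whose first point has just been written is 
indistinguishable from a single-value metric until the second point arrives.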

Wrt. ApplicationRowKey: at some point (perhaps not in this JIRA) we should 
consider making the app_id a compound object that is stored with a ? separator. 
The prefix (in most cases in YARN right now, "application_") would be stored 
separately, and the RM start time and the final numeric part would each be 
stored as a numerical value with a separate Bytes.to... conversion.

Otherwise we'll end up with incorrectly ordered rowkeys when the application 
id rolls over to 10K, and at each power of ten after that. For example, 
lexically application_1442351767756_10000 < application_1442351767756_9999

If we just access the application by its specific key this doesn't matter, but 
if we do a row scan and count on ordering to set an appropriate stop row on the 
scan, we'll break things.
This affects all rowkeys that contain the app_id.
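
A minimal sketch of the ordering problem and the numeric encoding, under some 
assumptions: the class and method names are hypothetical, the prefix and 
separator are left out for brevity, and java.nio.ByteBuffer stands in for 
HBase's Bytes.toBytes(long) (both write a long as 8 big-endian bytes). The 
comparison is the unsigned byte-wise order that HBase applies to rowkeys.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of the patch.
class AppIdOrderSketch {

  // Encodes "application_<clusterTimestamp>_<sequence>" as two big-endian
  // longs (16 bytes). The "application" prefix is dropped in this sketch.
  static byte[] encode(String appId) {
    String[] parts = appId.split("_");
    if (parts.length != 3) {
      throw new IllegalArgumentException("unexpected app id: " + appId);
    }
    long clusterTs = Long.parseLong(parts[1]);
    long sequence = Long.parseLong(parts[2]);
    // ByteBuffer is big-endian by default.
    return ByteBuffer.allocate(2 * Long.BYTES)
        .putLong(clusterTs)
        .putLong(sequence)
        .array();
  }

  // Unsigned lexicographic comparison of two byte arrays.
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    String older = "application_1442351767756_9999";
    String newer = "application_1442351767756_10000";
    // As strings, the newer id sorts before the older one.
    System.out.println(newer.compareTo(older) < 0);
    // As encoded longs, the older id sorts first, as intended.
    System.out.println(compareBytes(encode(older), encode(newer)) < 0);
  }
}
```

The byte order matches the numeric order only for non-negative values, which 
holds for the cluster timestamp and the sequence number.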

> [timeline reader] implement support for querying for flows and flow runs
> ------------------------------------------------------------------------
>                 Key: YARN-4074
>                 URL: https://issues.apache.org/jira/browse/YARN-4074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.

This message was sent by Atlassian JIRA
