Sangjin Lee commented on YARN-3051:

bq. In YTS v1, we use (entity type, entity id) to globally identify an unique 
timeline entity. In YTS v2, according to the data schema discussion, if my 
understanding is correct, we want to use (cluster id, user id, flow name, flow 
version*, flow run, app id, entity type, entity id) to globally identify the 

I think this needs clarification. I believe that in Timeline Service v.2 the 
tuple (cluster id, entity type, entity id) uniquely identifies an entity. The 
remaining attributes (user id, flow name, flow run id, app id) are part of the 
primary key and are required when a new entity is inserted. For reads, however, 
the cluster id, entity type, and entity id should be sufficient to locate an 
entity.

So essentially the only addition in terms of uniqueness is the cluster id (as 
the storage in v.2 is multi-cluster).
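
To make the distinction concrete, here is a rough sketch in Java of how I read 
it. All class and method names (EntityKey, EntityContext, InMemoryEntityStore) 
are hypothetical and are not the Timeline Service v.2 API or schema; the 
in-memory map only stands in for the backing storage. It assumes a write 
must supply the full context, while a read needs only the identifying tuple.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical names throughout; an illustration, not the actual v.2 code.

/** The tuple assumed to uniquely identify an entity: cluster, type, id. */
final class EntityKey {
  final String clusterId;
  final String entityType;
  final String entityId;

  EntityKey(String clusterId, String entityType, String entityId) {
    this.clusterId = Objects.requireNonNull(clusterId);
    this.entityType = Objects.requireNonNull(entityType);
    this.entityId = Objects.requireNonNull(entityId);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof EntityKey)) {
      return false;
    }
    EntityKey other = (EntityKey) o;
    return clusterId.equals(other.clusterId)
        && entityType.equals(other.entityType)
        && entityId.equals(other.entityId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(clusterId, entityType, entityId);
  }
}

/** The remaining attributes, assumed to be required when inserting. */
final class EntityContext {
  final String userId;
  final String flowName;
  final long flowRunId;
  final String appId;

  EntityContext(String userId, String flowName, long flowRunId, String appId) {
    this.userId = Objects.requireNonNull(userId);
    this.flowName = Objects.requireNonNull(flowName);
    this.flowRunId = flowRunId;
    this.appId = Objects.requireNonNull(appId);
  }
}

/** In-memory stand-in for a backing store. */
final class InMemoryEntityStore {
  private final Map<EntityKey, EntityContext> entities = new HashMap<>();

  /** Writes must carry the full context in addition to the key. */
  void write(EntityKey key, EntityContext context) {
    Objects.requireNonNull(key, "key is required on insert");
    Objects.requireNonNull(context, "context is required on insert");
    entities.put(key, context);
  }

  /** Reads need only the key; returns null if no such entity exists. */
  EntityContext read(EntityKey key) {
    return entities.get(key);
  }
}
```

With this shape, a reader that knows only the cluster id, entity type, and 
entity id can still locate the entity and recover its user, flow, and app 
from the stored context, while the same entity type and id under a different 
cluster id resolves to a different entity.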

Let me know if you have a different understanding.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051_temp.patch
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.
