Zhijie Shen commented on YARN-3051:

bq.  Is there a strong reason to alter the uniqueness semantics?

I'm not proposing to change the semantics, but judging from the db schema design 
(PK), I suspect we're changing the semantics implicitly, so I want 
to make that clear. In fact, unless there is a strong requirement to support the 
additional use case, I'm inclined to keep the semantics stable so that 
existing timeline service users can migrate smoothly. Otherwise, it will be 
difficult to make {{getEntity(entity type, entity Id)}} compatible.

bq. One could argue that even the addition of cluster id isn't really a change 
from v.1 as v.1 didn't envision multi-cluster storage, right?

Yeah, that's another part where I'm a bit confused: why do we allow the same 
identifier across clusters but not across apps? Taking one step back, assume I 
develop my framework's YTS v2 integration. Because entity identifiers must be 
unique across apps, I need to define the entity identifier carefully. By doing 
this, in most cases the entity identifier is also unique across clusters, though 
the single-cluster assumption may break something in a multi-cluster scenario. 
For example, in cluster_1 we will have application_1, and in cluster_2 we will 
have application_1 too. Having cluster_id uniquely identifies which 
application_1 it is. I'm just wondering whether using cluster_id to distinguish 
more than one entity is a rare case if we force users to define unique entity 
identifiers across apps; it sounds more like the cross-cluster problem of this 
framework. Thoughts?
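
To make the application_1 example concrete, here is a minimal sketch of a cluster-scoped key. The class name {{ClusterScopedEntityKey}} and the entity type string are hypothetical and only for illustration; this is not the schema in the patch.

```java
import java.util.Objects;

// Hypothetical composite key (not from the patch): an entity is identified by
// (cluster id, entity type, entity id), so the same type + id pair can exist
// in two clusters without colliding.
final class ClusterScopedEntityKey {
    final String clusterId;
    final String entityType;
    final String entityId;

    ClusterScopedEntityKey(String clusterId, String entityType, String entityId) {
        this.clusterId = Objects.requireNonNull(clusterId);
        this.entityType = Objects.requireNonNull(entityType);
        this.entityId = Objects.requireNonNull(entityId);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ClusterScopedEntityKey)) {
            return false;
        }
        ClusterScopedEntityKey other = (ClusterScopedEntityKey) o;
        return clusterId.equals(other.clusterId)
            && entityType.equals(other.entityType)
            && entityId.equals(other.entityId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(clusterId, entityType, entityId);
    }
}
```

With such a key, application_1 in cluster_1 and application_1 in cluster_2 compare as different entities, while a v.1-style lookup inside one cluster still only needs (entity type, entity id).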

bq.  We can clearly state that within the same cluster (entity type + entity 
id) must be unique, and enforce it within the storage implementation.

+1, we should enforce this constraint. In addition, I think we need to have an 
index table such as <entity type, entity id, pointer to the entity in entity 
table> to support the single entity query.
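
A rough in-memory sketch of what I mean by the index table, for a single cluster. Everything here is an assumption for illustration: the class name {{EntityIndexSketch}}, the row key layout (app id + entity type + entity id), and the string payload. A real storage implementation would look different.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one cluster's storage (not the actual schema).
final class EntityIndexSketch {
    // Entity table: row key -> stored entity payload.
    // The row key layout is assumed here to include the app id.
    private final Map<String, String> entityTable = new HashMap<>();

    // Index table: (entity type, entity id) -> row key in the entity table.
    private final Map<Map.Entry<String, String>, String> index = new HashMap<>();

    // Writes an entity and rejects a second app that reuses the same
    // (entity type, entity id) within this cluster.
    void put(String appId, String entityType, String entityId, String payload) {
        Map.Entry<String, String> key = new SimpleImmutableEntry<>(entityType, entityId);
        String rowKey = appId + "!" + entityType + "!" + entityId;
        String existing = index.putIfAbsent(key, rowKey);
        if (existing != null && !existing.equals(rowKey)) {
            throw new IllegalStateException(
                "(" + entityType + ", " + entityId + ") already exists at " + existing);
        }
        entityTable.put(rowKey, payload);
    }

    // Single entity query: resolves the pointer via the index, then reads the row.
    // Returns null when no such entity exists.
    String getEntity(String entityType, String entityId) {
        String rowKey = index.get(new SimpleImmutableEntry<>(entityType, entityId));
        return rowKey == null ? null : entityTable.get(rowKey);
    }
}
```

The point is that {{getEntity(entity type, entity Id)}} does not need to know the app: the index resolves the pointer, and the write path is where the uniqueness constraint gets enforced.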

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051_temp.patch
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.
