[ 
https://issues.apache.org/jira/browse/YARN-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585744#comment-15585744
 ] 

Rohith Sharma K S commented on YARN-5715:
-----------------------------------------

bq. What do others think?
As a user, I would expect entities written without any idPrefix to be read 
latest first. Take Tez, where a long-running AM executes multiple DAGs over a 
week or a month: when reading, the user would expect the most recently 
executed DAG first, not the details of a DAG executed last week. But since we 
are supporting an option to sort entities, I think users can decide for 
themselves how they want to read them. Again, we should document this 
correctly. I am fine with any approach.

I think we should also discuss another point: should idPrefix be part of the 
UID?

> introduce entity prefix for return and sort order
> -------------------------------------------------
>
>                 Key: YARN-5715
>                 URL: https://issues.apache.org/jira/browse/YARN-5715
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-5715-YARN-5355.01.patch, 
> YARN-5715-YARN-5355.02.patch, YARN-5715-YARN-5355.03.patch
>
>
> While looking into YARN-5585, we have come across the need to provide a sort 
> order different than the current entity id order. The current entity id order 
> returns entities strictly in the lexicographical order, and as such it 
> returns the earliest entities first. This may not be the most natural return 
> order. A more natural return/sort order would be from the most recent 
> entities.
> To solve this, we would like to add what we call the "entity prefix" in the 
> row key for the entity table. It is a number (long) that can be easily 
> provided by the client on write. In the row key, it would be added before the 
> entity id itself.
> The entity prefix would be considered mandatory. On all writes (including 
> updates) the correct entity prefix should be set by the client so that the 
> correct row key is used. The entity prefix needs to be unique only within the 
> scope of the application and the entity type.
> For queries that return a list of entities, the prefix values will be 
> returned along with the entity ids. Queries that specify both the prefix and 
> the id can be answered quickly using the row key. If the query omits the 
> prefix but specifies the id (query by id), the query may be less efficient.
> This JIRA should add the entity prefix to the entity API and add its handling 
> to the schema and the write path. The read path will be addressed in 
> YARN-5585.
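
To illustrate the ordering discussed above, here is a minimal sketch of how a 
long prefix placed before the entity id changes the scan order. It assumes a 
simplified row key (an 8-byte big-endian prefix followed by the entity id 
bytes) compared as unsigned bytes, and it assumes the client derives the 
prefix by inverting the creation time. The class, the method names, and the 
inversion scheme are hypothetical and are not taken from the patch; the real 
row key also carries other components that are omitted here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch, not the YARN-5715 implementation.
public class EntityPrefixSketch {

  // Assumed client-side scheme: newer entities get numerically smaller prefixes.
  static long invertedPrefix(long createdTimeMillis) {
    return Long.MAX_VALUE - createdTimeMillis;
  }

  // Simplified row key: 8-byte big-endian prefix, then the entity id bytes.
  static byte[] rowKey(long idPrefix, String entityId) {
    byte[] id = entityId.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(Long.BYTES + id.length)
        .putLong(idPrefix)
        .put(id)
        .array();
  }

  // Unsigned lexicographic byte comparison, the way a sorted key-value store
  // typically orders row keys.
  static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  };

  // Returns the entity ids in the order a full scan over the row keys would yield.
  static List<String> scanOrder(String[] entityIds, long[] prefixes) {
    TreeMap<byte[], String> table = new TreeMap<>(UNSIGNED_BYTES);
    for (int i = 0; i < entityIds.length; i++) {
      table.put(rowKey(prefixes[i], entityIds[i]), entityIds[i]);
    }
    return new ArrayList<>(table.values());
  }

  public static void main(String[] args) {
    String[] ids = {"dag_1", "dag_2", "dag_3"};
    long[] created = {1000L, 2000L, 3000L};

    // Same prefix for every entity: order falls back to the entity id.
    System.out.println(scanOrder(ids, new long[] {0L, 0L, 0L}));

    // Inverted creation time as prefix: most recent entity comes first.
    long[] inverted = new long[created.length];
    for (int i = 0; i < created.length; i++) {
      inverted[i] = invertedPrefix(created[i]);
    }
    System.out.println(scanOrder(ids, inverted));
  }
}
```

With a constant prefix the scan returns dag_1, dag_2, dag_3; with the inverted 
creation time it returns dag_3, dag_2, dag_1. The inversion only keeps the 
unsigned byte order consistent with the numeric order while the prefix stays 
non-negative, which holds here for non-negative creation times.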



