[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513781#comment-15513781
 ] 

Sangjin Lee commented on YARN-5585:
-----------------------------------

Thanks for your comments [~varun_saxena]. Yes, we should discuss this during 
the call and report back here.

Before we go into how to implement this, I think we need consensus on the 
requirements first. Querying for entities is a fairly generic operation, and 
IMO there should be a clear expectation of the order in which entities are 
returned. That order affects *which* entities get selected as well as how they 
are sorted. As I mentioned, I don't think it would be desirable to leave this 
order completely arbitrary, or things could get confusing really quickly.

My preference for this sorting order is either the entity id (descending) order 
or the chronological order. I think the entity id order is the simplest and 
easiest to understand, and for the most part identical to the chronological 
order. YARN entities are mostly compliant (so are MR entities), and it would 
not be unreasonable to ask frameworks to maintain entity ids that way. Even if 
that is not feasible, there would be a very consistent understanding of how 
entities are returned to the reader. That's the default sorting order in 
the current YARN RM web UI too. Can Tez adopt a stricter entity id scheme? If 
not, would it at least be acceptable if entities are consistently returned in 
that order?

If we go with the chronological order (created time), then I would want it to 
be consistent. We should then do it not only for framework entities but also 
for YARN entities, and change the row key schema for all of them. And I think 
that may require a secondary lookup table (yes, I understand this would be 
only for lookups and not for data).

Another point is about sorting within the timeline reader code. If the query is 
specified with a limit, the limit is passed to the HBase client, and as such it 
will only return that number of entities (or fewer), right? I don't think HBase 
will return more than the specified limit, no? Then I don't understand how you 
would get a *different* set of Tez entities than what you expected. For 
example, if there are entities 1 through 10, and your limit was 5, I would 
expect HBase to return 6 through 10 still. The reader code may rearrange them 
so that 6 is at the top, but I don't expect HBase to return anything other than 
6 through 10. [~rohithsharma], could you confirm? Did I understand this right?
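
To make the limit example concrete, here is a minimal sketch of the behavior I 
am assuming. It does not use the HBase client at all: a {{TreeSet}} iterated in 
descending order stands in for the row order in the store, and the class and 
method names ({{LimitedScanSketch}}, {{scanDescending}}, {{resortAscending}}) 
are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a limited scan; not the actual reader or HBase code.
final class LimitedScanSketch {

  // Walks the ids in descending order and stops once `limit` ids are collected,
  // mimicking a store that applies the limit while scanning.
  static List<Integer> scanDescending(TreeSet<Integer> storedIds, int limit) {
    List<Integer> result = new ArrayList<>();
    for (Integer id : storedIds.descendingSet()) {
      if (result.size() >= limit) {
        break;
      }
      result.add(id);
    }
    return result;
  }

  // Re-sorting on the reader side changes the order of the scanned ids,
  // but cannot change which ids were selected by the scan.
  static List<Integer> resortAscending(List<Integer> scanned) {
    List<Integer> copy = new ArrayList<>(scanned);
    Collections.sort(copy);
    return copy;
  }
}
```

With ids 1 through 10 and a limit of 5, {{scanDescending}} yields 10, 9, 8, 7, 6 
and {{resortAscending}} turns that into 6, 7, 8, 9, 10: a different order, but 
the same five entities.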

Also, apart from fixing the sorting in {{TimelineEntity.compareTo()}}, I am not 
sure we need to re-sort the entities returned by HBase in the timeline reader 
code. The result set from HBase should already return them in the right 
order, right? Then I think we should simply return them in the same order 
without applying any further sorting. In other words, instead of using a sorted 
set, we should use an insertion-order set. Thoughts? [~varun_saxena]
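
As a rough illustration of the difference, the sketch below uses plain strings 
rather than {{TimelineEntity}} objects, and the class and method names are 
hypothetical. It only shows how {{java.util.TreeSet}} reorders elements by 
their natural ordering while {{java.util.LinkedHashSet}} keeps the order in 
which the store handed them back.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; the real reader works on TimelineEntity objects.
final class ResultOrderSketch {

  // A sorted set re-orders the results by natural (here: lexicographic) order,
  // discarding whatever order the store returned them in.
  static List<String> viaSortedSet(List<String> storeOrder) {
    Set<String> sorted = new TreeSet<>(storeOrder);
    return new ArrayList<>(sorted);
  }

  // An insertion-order set still de-duplicates, but preserves the store order.
  static List<String> viaInsertionOrderSet(List<String> storeOrder) {
    Set<String> ordered = new LinkedHashSet<>(storeOrder);
    return new ArrayList<>(ordered);
  }
}
```

For a store order of entity_10, entity_9, entity_8, the sorted set produces 
entity_10, entity_8, entity_9, whereas the insertion-order set returns the 
three ids exactly as they arrived.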



> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-5585.v0.patch
>
>
> The TimelineReader REST APIs provide a lot of filters to retrieve 
> applications. Along with those, it would be good to add a new filter, i.e. 
> fromId, so that the entities after fromId can be retrieved. 
> Current behavior: the default limit is 100. If there are 1000 entities, 
> the REST call returns the first/last 100 entities. How can the next set of 
> 100 entities be retrieved, i.e. 101 to 200 or 900 to 801?
> Example: suppose applications app-1, app-2 ... app-10 are stored in the 
> database. *getApps?limit=5* gives app-1 to app-5, but there is no way to 
> retrieve the next 5 apps. 
> So the proposal is to have fromId in the filter, like 
> *getApps?limit=5&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10. 
> Since ATS targets storage of a large number of entities, getting the next 
> set of entities using fromId rather than querying all the entities is a very 
> common use case. This is very useful for pagination in the web UI.
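
The paging behavior described in the issue above could be sketched roughly as 
follows. This is an in-memory illustration only, not the proposed patch: the 
class and method names are hypothetical, integer sequence numbers stand in for 
the app ids so that ordering is unambiguous, and fromId is treated as 
exclusive, matching the app-6 to app-10 example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical in-memory model of limit + fromId paging; not the REST endpoint.
final class FromIdPagingSketch {

  // Returns up to `limit` apps strictly after `fromSeq`, in ascending order.
  // A null `fromSeq` means "start from the beginning" (no fromId given).
  static List<String> getApps(NavigableSet<Integer> appSeqs, Integer fromSeq, int limit) {
    NavigableSet<Integer> candidates =
        (fromSeq == null) ? appSeqs : appSeqs.tailSet(fromSeq, false);
    List<String> page = new ArrayList<>();
    for (Integer seq : candidates) {
      if (page.size() >= limit) {
        break;
      }
      page.add("app-" + seq);
    }
    return page;
  }

  // Builds the example data set app-1 ... app-10 as sequence numbers.
  static NavigableSet<Integer> exampleApps() {
    NavigableSet<Integer> seqs = new TreeSet<>();
    for (int i = 1; i <= 10; i++) {
      seqs.add(i);
    }
    return seqs;
  }
}
```

Calling {{getApps(exampleApps(), null, 5)}} returns app-1 to app-5, and passing 
5 as {{fromSeq}} on the next call returns app-6 to app-10, so a client can page 
by feeding the last id of one page into the next request.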



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
