[
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513781#comment-15513781
]
Sangjin Lee commented on YARN-5585:
-----------------------------------
Thanks for your comments [~varun_saxena]. Yes, we should discuss this during
the call and report back here.
Before we go into how to implement this, I think we need consensus on the
requirements first. Querying for entities is a fairly generic operation, and IMO
there should be a clear expectation of the order in which entities are returned.
That order affects *which* entities get selected as well as how they are sorted.
As I mentioned, I don't think it would be desirable to leave this order
completely arbitrary, or things could get confusing very quickly.
My preference for this sorting order is either the entity id (descending) order
or the chronological order. I think the entity id order is the simplest and
easiest to understand, and for the most part identical to the chronological
order. YARN entities are mostly compliant (so are MR entities), and it would
not be unreasonable to ask frameworks to maintain entity ids that way. Even if
that is not feasible, there would be a very consistent understanding of how
entities would be returned to the reader. That's the default sorting order in
the current YARN RM web UI too. Can Tez adopt a stricter entity id scheme? If
not, would it at least be acceptable if entities are consistently returned in
that order?
If we go with the chronological order (created time), then I would want it to
be consistent: we should apply it not only to framework entities but also to
YARN entities, and change the row key schema for all of them. And I think that
may require the secondary lookup table (yes, I understand this would be only
for lookups and not for data).
Another point concerns sorting within the timeline reader code. If the query is
specified with a limit, the limit is passed to the HBase client, and as such it
will only return that number of entities (or fewer), right? I don't think HBase
will return more than the specified limit, no? Then I don't understand how you
would get a *different* set of Tez entities than what you expected. For
example, if there are entities 1 through 10, and your limit was 5, I would
still expect HBase to return 6 through 10. The reader code may rearrange them
so that 6 is at the top, but I don't expect HBase to return anything other than
6 through 10. [~rohithsharma], could you confirm? Did I understand this right?
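To make the limit argument concrete, here is a minimal sketch. It does not use the HBase client at all: a {{TreeSet}} with a reversed comparator stands in for a table whose rows are stored in descending entity id order, and {{scanWithLimit}} is a hypothetical helper that plays the role of a scan with a limit. Under those assumptions, re-sorting in the reader changes the order of the result but not its membership.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class LimitedScanSketch {
    // Hypothetical stand-in for a scan with a limit: walk the rows in their
    // stored order and stop once 'limit' rows have been collected.
    static List<Integer> scanWithLimit(TreeSet<Integer> rowsInStoredOrder, int limit) {
        List<Integer> result = new ArrayList<>();
        for (Integer id : rowsInStoredOrder) {
            if (result.size() >= limit) {
                break;
            }
            result.add(id);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: rows are stored in descending entity id order.
        TreeSet<Integer> rows = new TreeSet<>(Collections.reverseOrder());
        for (int id = 1; id <= 10; id++) {
            rows.add(id);
        }

        List<Integer> scanned = scanWithLimit(rows, 5);
        System.out.println("scan order:    " + scanned);   // [10, 9, 8, 7, 6]

        // A re-sort in the reader only rearranges the same five entities.
        List<Integer> resorted = new ArrayList<>(scanned);
        Collections.sort(resorted);
        System.out.println("after re-sort: " + resorted);  // [6, 7, 8, 9, 10]
    }
}
```

If the reader ever returned a different set, the difference would have to come from somewhere other than the re-sort, for example from the limit being applied after sorting rather than during the scan.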
Also, apart from fixing the sorting in {{TimelineEntity.compareTo()}}, I am not
sure we need to re-sort the entities returned by HBase in the timeline reader
code. The result set from HBase should already return them in the right order,
right? Then I think we should simply return them in the same order without
applying any further sorting. In other words, instead of using a sorted set, we
should use an insertion-order set. Thoughts? [~varun_saxena]
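A small sketch of the difference between the two kinds of set. The entity ids and the class {{ResultOrderSketch}} are made up for illustration, and plain strings with their natural ordering stand in for entities and their {{compareTo()}}; this is not the actual reader code. The point is only that a sorted set re-orders whatever the store returned, while an insertion-order set preserves it.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ResultOrderSketch {
    // Collects results into a sorted set: the element ordering decides the order.
    static Set<String> collectSorted(List<String> storeOrder) {
        return new TreeSet<>(storeOrder);
    }

    // Collects results into an insertion-order set: the store's order is kept.
    static Set<String> collectInsertionOrder(List<String> storeOrder) {
        return new LinkedHashSet<>(storeOrder);
    }

    public static void main(String[] args) {
        // Assumption: the store already returned the rows in the desired order.
        List<String> storeOrder = Arrays.asList("e_10", "e_9", "e_8");

        // String ordering is lexicographic, so "e_10" sorts before "e_8".
        System.out.println(collectSorted(storeOrder));          // [e_10, e_8, e_9]
        System.out.println(collectInsertionOrder(storeOrder));  // [e_10, e_9, e_8]
    }
}
```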
> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelinereader
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Priority: Critical
> Attachments: YARN-5585.v0.patch
>
>
> TimelineReader REST APIs provide a lot of filters to retrieve
> applications. Along with those, it would be good to add a new filter, i.e.
> fromId, so that entities can be retrieved after the fromId.
> Current Behavior : Default limit is set to 100. If there are 1000 entities
> then the REST call gives the first/last 100 entities. How to retrieve the next
> set of 100 entities, i.e. 101 to 200 OR 900 to 801?
> Example : If applications are stored in the database as app-1 app-2 ... app-10,
> *getApps?limit=5* gives app-1 to app-5. But there is no way to retrieve the
> next 5 apps.
> So the proposal is to have fromId in the filter, like
> *getApps?limit=5&fromId=app-5*, which gives the list of apps from app-6 to
> app-10.
> Since ATS is targeting storage of a large number of entities, it is a very
> common use case to get the next set of entities using fromId rather than
> querying all the entities. This is very useful for pagination in the web UI.
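The paging behavior in the quoted description can be sketched roughly as follows. Everything here is hypothetical: the class {{FromIdPagingSketch}}, the {{page}} helper, and the numeric-suffix ordering of ids such as app-7 are assumptions for illustration, and the in-memory set is only a stand-in for the backing store. The sketch follows the example in the description, where fromId=app-5 yields app-6 through app-10, so fromId is treated as exclusive.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class FromIdPagingSketch {
    // Hypothetical ordering: compare the numeric suffix of ids such as "app-7".
    static final Comparator<String> BY_SUFFIX =
        Comparator.comparingInt(id -> Integer.parseInt(id.substring(id.indexOf('-') + 1)));

    // Returns up to 'limit' ids that come strictly after fromId.
    // A null fromId means "start from the beginning".
    static List<String> page(NavigableSet<String> store, String fromId, int limit) {
        NavigableSet<String> candidates =
            (fromId == null) ? store : store.tailSet(fromId, false);
        List<String> result = new ArrayList<>();
        for (String id : candidates) {
            if (result.size() >= limit) {
                break;
            }
            result.add(id);
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableSet<String> store = new TreeSet<>(BY_SUFFIX);
        for (int i = 1; i <= 10; i++) {
            store.add("app-" + i);
        }
        System.out.println(page(store, null, 5));     // [app-1, app-2, app-3, app-4, app-5]
        System.out.println(page(store, "app-5", 5));  // [app-6, app-7, app-8, app-9, app-10]
    }
}
```

A client would page by passing the last id of the previous response as the next fromId, which only works if the store returns entities in a stable, well-defined order.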