[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472083#comment-15472083 ]
Li Lu commented on YARN-5585:
-----------------------------

I think we're overcomplicating the problem here... I believe the general use case of this JIRA is mostly pagination: given a uniquely defined type of entities in one application, if the total number of entities is greater than the given limit, can we provide an API that allows fetching the data in multiple batches?

So right now we have <entity_001>, <entity_002>, ..., <entity_100>, and limit = 10. What we want is to initially fetch <entity_001> to <entity_010>, then, given fromId = entity_010, fetch <entity_011> to <entity_020>, and so on.

According to Rohith's use case, I think it's totally fine to say that all entities are ordered lexicographically by their ids (especially for entities with properly padded numbers, like container ids). Actually, any consistent order will work for pagination; the only problem is how to make it make sense to users.

The real problem here is that we need to return everything sorted by creation time, which seems to be quite hard in our current data model. This was pretty easy in ATS v1, where the creation time is baked into the row key of each entity. I remember there were some discussions about this a while ago, but the general conclusion was that we mainly rely on the use cases themselves to guarantee consistency between creation time and entity id. To me, the potential problem with sorting entities by creation time to implement pagination is that we first have to fetch _all_ of them from HBase to form the order, which defeats most of the advantage of pagination.

An ID encoder/decoder would be very helpful for this use case. However, having the application write the encode/decode logic puts more load on application programmers. It also introduces extra work for deployments, since cluster operators need to handle third-party plugins.

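To make the batching concrete, here is a rough sketch in Python (for brevity only; this is not the TimelineReader code). The function `fetch_page` and the in-memory sorted list are assumptions standing in for a scan over ids stored in lexicographic order, and fromId is treated as exclusive, as in the example above.

```python
from bisect import bisect_right
from typing import List, Optional


def fetch_page(sorted_ids: List[str], limit: int,
               from_id: Optional[str] = None) -> List[str]:
    """Return up to `limit` ids that sort strictly after `from_id`.

    `sorted_ids` is a hypothetical stand-in for ids stored in
    lexicographic order; a real reader would scan the store instead.
    """
    # bisect_right finds the first position after from_id, so the
    # lookup still works if from_id itself is no longer present.
    start = 0 if from_id is None else bisect_right(sorted_ids, from_id)
    return sorted_ids[start:start + limit]


# Zero-padded ids: lexicographic order matches numeric order.
entity_ids = sorted(f"entity_{i:03d}" for i in range(1, 101))

first = fetch_page(entity_ids, limit=10)
# first runs from entity_001 to entity_010
second = fetch_page(entity_ids, limit=10, from_id=first[-1])
# second runs from entity_011 to entity_020

# Without padding, lexicographic order diverges from numeric order.
unpadded = sorted(["app-1", "app-2", "app-10"])
# unpadded == ['app-1', 'app-10', 'app-2']
```

Each call only needs the last id of the previous batch, so no batch requires reading all entities first. The unpadded case shows why the order may not make sense to users unless ids are padded or encoded.
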
Can we provide several "SORT BY" options for timeline entity types, so that we store their ids accordingly?

> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: YARN-5585.v0.patch
>
>
> The TimelineReader REST APIs provide many filters for retrieving
> applications. Along with those, it would be good to add a new filter,
> fromId, so that entities after fromId can be retrieved.
> Example: if the applications stored in the database are app-1, app-2, ...,
> app-10, *getApps?limit=5* gives app-1 to app-5. But retrieving the next 5
> apps is difficult.
> So the proposal is to have fromId in the filter, like
> *getApps?limit=5&fromId=app-5*, which gives the list of apps from app-6 to
> app-10.
> This is very useful for pagination in the web UI.