[ https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15795862#comment-15795862 ]
Sangjin Lee commented on YARN-6027:
-----------------------------------

Catching up on this discussion after the break. I am +1 on adding pagination support for flow runs and apps.

Regarding flows (flow activity), I think [~varun_saxena] explained well how that table is supposed to be used and served. The flow activity table is strongly organized by *dates*; i.e. it is stored as daily activities. Therefore, a natural way to segment and paginate it would be to use date range filters.

Please note that "flows" and "flow runs" are different. Flows are closer to the application that drives runs. Flow runs are *actualized instances* of those flows. For example, if you run an MR sleep job every hour, there would be *one* flow named "Sleep Job", and 24 flow runs on a given day that belong to that flow.

The flow activity table surfaces all the flows (not the flow runs) that had activity on a given day. That should be a good landing point for users, not unlike the current RM page, which shows the latest active YARN applications. In the sleep job case above, it should show only *one entry* that says "Sleep Job". You can drill down from that entry (i.e. obtain its list of flow runs) to get at the 24 flow runs.

bq. Landing page should be list of all flows.

I think we had this discussion in the past, but I forget the JIRA id. Given the current structure of the data, it would not be very feasible. Also, for any sufficiently old cluster, you can easily imagine how big (and slow) this result could be. Even if we introduced pagination, you'd be looking at flows that start with "A" on this landing page, and you'd need to move through many pages to get to your flows.

Again, like the RM landing page, IMO *recency* is the key to the usefulness of this UI, and that's why we organized the flow activity that way. With that organization, most users would find their flows within the first few pages (at most) of the data.

Hope that helps.
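To make the flow vs. flow run distinction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the FlowRun record, the flow_activity helper, and the in-memory list are all hypothetical and are not the actual timeline service schema or reader API. It groups runs into one entry per flow per day, restricted to an inclusive date range, with the most recent day first.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlowRun:
    """One actualized instance of a flow (hypothetical record, not the real schema)."""
    flow_name: str
    run_id: int
    day: date


def flow_activity(runs, date_start, date_end):
    """Group runs into one entry per (day, flow) within an inclusive date range.

    Days come back newest first, so recent activity is seen first.
    """
    by_day = defaultdict(lambda: defaultdict(list))
    for run in runs:
        if date_start <= run.day <= date_end:
            by_day[run.day][run.flow_name].append(run.run_id)
    return {
        day: {flow: sorted(ids) for flow, ids in sorted(flows.items())}
        for day, flows in sorted(by_day.items(), reverse=True)
    }


# An hourly "Sleep Job" on two consecutive days: 24 runs per day, one flow.
runs = [
    FlowRun("Sleep Job", run_id=d * 100 + hour, day=date(2017, 1, d))
    for d in (2, 3)
    for hour in range(24)
]

# Landing view for a single day: one entry, not 24.
activity = flow_activity(runs, date(2017, 1, 3), date(2017, 1, 3))
landing = activity[date(2017, 1, 3)]
print(list(landing))              # ['Sleep Job']
print(len(landing["Sleep Job"]))  # 24
```

Widening the date range adds more days to the result rather than more rows per day, which is why a date range works as the paging unit for this view.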
cc [~jrottinghuis] [~vrushalic]

> Support fromId for flows/flowrun apps
> -------------------------------------
>
>                 Key: YARN-6027
>                 URL: https://issues.apache.org/jira/browse/YARN-6027
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>              Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar
> filter for flows/flowRun apps, flow runs, and flows as well.
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception for entities/entity retrieval if duplicates
> are found?
> * TimelineEntity:
> ** Should the equals method also check for idPrefix?
> ** Is idPrefix part of the identifiers?