[
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15795862#comment-15795862
]
Sangjin Lee commented on YARN-6027:
-----------------------------------
Catching up on this discussion after the break.
I am +1 on adding pagination support for flow runs and apps.
Regarding flows (flow activity), I think [~varun_saxena] explained well how
that table is supposed to be used and served. The flow activity table is
deliberately organized by *date*; i.e. it is stored as daily activities.
Therefore, a natural way to segment and paginate it would be to use date range
filters.
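To make the date range idea concrete, here is a rough sketch of paginating by
date window. It uses a hypothetical in-memory list in place of the real flow
activity table; the names ACTIVITY, flows_in_range and previous_window are
made up for illustration and are not part of the timeline service API.

```python
from datetime import date, timedelta

# Hypothetical stand-in for the flow activity table:
# one record per (day, flow) that had activity on that day.
ACTIVITY = [
    (date(2017, 1, 1), "Sleep Job"),
    (date(2017, 1, 1), "Word Count"),
    (date(2017, 1, 2), "Sleep Job"),
    (date(2017, 1, 3), "Sleep Job"),
    (date(2017, 1, 3), "Teragen"),
]


def flows_in_range(activity, start, end):
    """Return (day, flow) records with start <= day <= end, newest day first."""
    selected = [rec for rec in activity if start <= rec[0] <= end]
    return sorted(selected, key=lambda rec: (-rec[0].toordinal(), rec[1]))


def previous_window(start, days):
    """Date range for the next (older) page: the `days` days just before `start`."""
    return start - timedelta(days=days), start - timedelta(days=1)


# First page: the two most recent days in the sample data.
page1 = flows_in_range(ACTIVITY, date(2017, 1, 2), date(2017, 1, 3))

# Next page: slide the window back by two days.
older_start, older_end = previous_window(date(2017, 1, 2), days=2)
page2 = flows_in_range(ACTIVITY, older_start, older_end)
```

Each "page" here is simply an older date window, so the client never needs an
offset into one large result set.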
Please note that "flows" and "flow runs" are different. Flows are closer to the
application that drives runs. Flow runs are *actualized instances* of those
flows. For example, if you run an MR sleep job every hour, there would be
*one* flow that says "Sleep Job". And there would be 24 flow runs for a given
day that belong to that flow.
The flow activity table surfaces all the flows (not the flow runs) that had
activity on a given day, and that should be a good landing point for users, much
like the current RM page, which shows the latest active YARN applications.
In the above sleep job case, it should show only *one entry* that says "Sleep
Job". You can drill down (i.e. obtain the list of flow runs from that) to get
at the 24 flow runs.
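As an illustration of the sleep job case, the following sketch groups 24
hourly flow runs into a single daily activity entry. The record layout and the
daily_activity helper are assumptions made for this example only; they do not
reflect the actual table schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical flow run records: (flow name, run start time).
# One "Sleep Job" run for every hour of a single day.
runs = [("Sleep Job", datetime(2017, 1, 3, hour)) for hour in range(24)]


def daily_activity(flow_runs):
    """Group flow runs by (day, flow name): one activity entry per flow per day."""
    grouped = defaultdict(list)
    for flow_name, started in flow_runs:
        grouped[(started.date(), flow_name)].append(started)
    return dict(grouped)


activity = daily_activity(runs)
entries = list(activity)            # what the landing page would list
drill_down = activity[entries[0]]   # the flow runs behind that one entry
```

The landing page would list only the keys (one "Sleep Job" entry for the day),
while the drill-down returns the 24 runs stored under that key.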
bq. Landing page should be list of all flows.
I think we had this discussion in the past but I forget the JIRA id. Given the
current structure of the data, it would not be very feasible. Also, for any
sufficiently old cluster, you can easily imagine how big (and slow) this result
can be. Even if we introduced pagination, you'd be looking at flows that start
with "A" on this landing page, and you'd need to page through many results to
get to your flows. Again, like the RM landing page, IMO *recency* is the key to the
usefulness of this UI, and that's why we organized the flow activity that way.
That way most users would find their flows within the first few pages (at most)
of the data. Hope that helps.
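To put rough numbers on the recency point, here is a toy comparison of where
a recently active flow lands under name ordering versus recency ordering. The
flow names, activity values and page size are all invented for the example.

```python
# Hypothetical data: 1,000 flows named flow_0000 .. flow_0999, plus the user's
# flow "zz_my_flow". The value is a day number; higher means more recent.
flows = {f"flow_{i:04d}": i % 30 for i in range(1000)}
flows["zz_my_flow"] = 30  # active on the most recent day

PAGE_SIZE = 50


def page_of(ordered_names, name, page_size=PAGE_SIZE):
    """1-based page number on which `name` appears in the given ordering."""
    return ordered_names.index(name) // page_size + 1


by_name = sorted(flows)                                    # "all flows" listing
by_recency = sorted(flows, key=lambda n: (-flows[n], n))   # most recent first

name_page = page_of(by_name, "zz_my_flow")
recency_page = page_of(by_recency, "zz_my_flow")
```

With this made-up data the flow sits on the last page of the name-ordered list
but on the first page when ordered by recency.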
cc [~jrottinghuis] [~vrushalic]
> Support fromId for flows/flowrun apps
> -------------------------------------
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar
> filter for flows/flowRun apps, and for flow runs and flows as well.
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception for entities/entity retrieval if duplicates
> are found?
> * TimelineEntity:
> ** Should the equals method also check for idPrefix?
> ** Is idPrefix part of the identifiers?