[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717146#comment-14717146
 ] 

Sangjin Lee commented on YARN-4074:
-----------------------------------

cc [~gtCarrera9] and [~vrushalic] also for their thoughts.

There are some options for this, and there are pros and cons. I'm leaning 
towards the current proposal ((1) below) for now, but we could enhance this 
later as the UI jells more.

# do a specific entity query for each of the flow runs obtained from the flow 
activity entity
# return all flow runs (possibly with limits and time windows) for the given 
flow
# do a single query for all flow runs specified as a list of flow run id's

One interesting thing to note is that a flow activity entity (record) is an 
activity of that flow *for a given day*. In other words, there can be multiple 
flow activity entities for the same flow. The flow runs that are returned in 
the flow activity entity are only for that given day.

Then the question is, when I click that flow activity record, what flow runs do 
I expect to see? It's a bit ambiguous, but I think it might make more sense to 
return only the flow runs that are referenced in that particular day if we're 
using the flow activity to render the landing page.

If we assume that, then (2) is probably not needed for this. Then it leaves us 
with (1) or (3). The benefit of (1) is that it fits easily into the existing 
reader API (getEntity). The downside is that you may need to make multiple 
reader calls to retrieve flow runs. But normally the number of flow runs in a 
day for a given flow should be very small, so it might not be a big deal.

One hybrid approach may be that the REST API supports URLs based on the list 
but the web service code can make multiple reader getEntity() calls. We'd still 
need to define the form of the URLs to support that type of query.
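
To make the hybrid idea concrete, here is a rough sketch of what the web service side could do: take a comma-separated list of flow run ids and issue one single-entity lookup per id. Everything here is hypothetical ({{FlowRunBatchLookup}}, {{parseRunIds}}, {{fetchRuns}}, and the comma-separated form are made up for illustration), and the lookup function just stands in for a reader getEntity() call rather than using the real reader signature.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical sketch; none of these names come from the actual patch. */
final class FlowRunBatchLookup {

  /** Parses a comma-separated list of flow run ids, skipping empty tokens. */
  static List<Long> parseRunIds(String csv) {
    List<Long> ids = new ArrayList<>();
    for (String token : csv.split(",")) {
      String trimmed = token.trim();
      if (!trimmed.isEmpty()) {
        ids.add(Long.parseLong(trimmed));
      }
    }
    return ids;
  }

  /**
   * Issues one single-entity lookup per run id, preserving request order.
   * The singleLookup function stands in for a reader getEntity() call
   * (assumption); ids for which it returns null are left out of the result.
   */
  static <T> Map<Long, T> fetchRuns(List<Long> runIds,
      LongFunction<T> singleLookup) {
    Map<Long, T> found = new LinkedHashMap<>();
    for (long id : runIds) {
      T entity = singleLookup.apply(id);
      if (entity != null) {
        found.put(id, entity);
      }
    }
    return found;
  }
}
```

With something like this the reader API would not need to change for now, and the per-id loop could be swapped for a single batched query later if (3) turns out to be worth it.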

Thoughts?

> [timeline reader] implement support for querying for flows and flow runs
> ------------------------------------------------------------------------
>
>                 Key: YARN-4074
>                 URL: https://issues.apache.org/jira/browse/YARN-4074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-4074-YARN-2928.POC.001.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
