[ 
https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325492#comment-14325492
 ] 

Vrushali C commented on YARN-3041:
----------------------------------



To add to my previous comment, this is the way I see it:

A flow is uniquely identified by cluster, user, queue, flow name and run id, so 
these are metadata/attributes/class members of the Flow class. FlowRun is not 
a class; the run id is an attribute/member of the Flow class. An Application is 
a child of a Flow. There would also be an AggregatedFlow class, which has 
members such as the startTime and endTime of the aggregation. Similarly, user 
and queue are attributes of the Flow class, but AggregatedUser and 
AggregatedQueue are classes that hold aggregated information for that user (or 
queue) over a time range.
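
To make the shape of these classes concrete, here is a rough sketch, not 
the actual patch. The field types, the metrics map, the appId field and the 
constructors are assumptions made for illustration; only the class names and 
the five identifying attributes come from the description above. 
AggregatedQueue would mirror AggregatedUser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** An application is a child of a flow. appId and metrics are assumed fields. */
final class Application {
    final String appId;
    final Map<String, Long> metrics = new HashMap<>();

    Application(String appId) { this.appId = appId; }
}

/** One run of a flow. The five key fields together identify it uniquely. */
final class Flow {
    final String cluster, user, queue, flowName;
    final long runId;  // an attribute of Flow, not a separate FlowRun class
    final List<Application> applications = new ArrayList<>();

    Flow(String cluster, String user, String queue, String flowName, long runId) {
        this.cluster = cluster;
        this.user = user;
        this.queue = queue;
        this.flowName = flowName;
        this.runId = runId;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Flow)) return false;
        Flow f = (Flow) o;
        return runId == f.runId && cluster.equals(f.cluster) && user.equals(f.user)
            && queue.equals(f.queue) && flowName.equals(f.flowName);
    }

    @Override public int hashCode() {
        return Objects.hash(cluster, user, queue, flowName, runId);
    }
}

/** Summed metrics for one flow over the range [startTime, endTime). */
final class AggregatedFlow {
    final String cluster, flowName;   // assumed: which flow the aggregate covers
    final long startTime, endTime;    // bounds of the aggregation
    final Map<String, Long> metrics = new HashMap<>();

    AggregatedFlow(String cluster, String flowName, long startTime, long endTime) {
        this.cluster = cluster;
        this.flowName = flowName;
        this.startTime = startTime;
        this.endTime = endTime;
    }
}

/** Summed metrics for one user over the range [startTime, endTime). */
final class AggregatedUser {
    final String cluster, user;
    final long startTime, endTime;
    final Map<String, Long> metrics = new HashMap<>();

    AggregatedUser(String cluster, String user, long startTime, long endTime) {
        this.cluster = cluster;
        this.user = user;
        this.startTime = startTime;
        this.endTime = endTime;
    }
}
```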

Maybe I can give some examples of queries. 

For Flow: 
Example 1: we query for “Give me all the runs of this flow that happened 
yesterday”. Say the flow ran 10 times yesterday. This should return a list of 
10 flows, one Flow object for each run. Each Flow object in turn has a list of 
Applications. 

Example 2: we query for “How much did this flow take up on the cluster 
yesterday?” Say the flow ran 10 times yesterday. This query should return an 
AggregatedFlow object which has the summation of all metrics from all the runs 
of the flow yesterday. This AggregatedFlow also has the startTime and 
endTime of the aggregation as its members. (While we would allow custom time 
ranges, for efficiency we would want to aggregate daily, weekly, etc.) 
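
The two flow queries could be sketched roughly as follows. This is an 
in-memory illustration only, not a proposed reader API: the FlowQueries 
wrapper, the runsOf and aggregate method names, the startTime field on Flow 
and the per-run metrics map are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FlowQueries {
    /** Minimal stand-in for one run of a flow. */
    static final class Flow {
        final String flowName;
        final long runId;
        final long startTime;             // assumed: epoch millis when the run started
        final Map<String, Long> metrics;  // assumed: totals already summed over the run's apps

        Flow(String flowName, long runId, long startTime, Map<String, Long> metrics) {
            this.flowName = flowName;
            this.runId = runId;
            this.startTime = startTime;
            this.metrics = metrics;
        }
    }

    /** Aggregate over [startTime, endTime). */
    static final class AggregatedFlow {
        final String flowName;
        final long startTime, endTime;
        final Map<String, Long> metrics = new HashMap<>();

        AggregatedFlow(String flowName, long startTime, long endTime) {
            this.flowName = flowName;
            this.startTime = startTime;
            this.endTime = endTime;
        }
    }

    /** Example 1: one Flow object per run that started in [from, to). */
    static List<Flow> runsOf(List<Flow> all, String flowName, long from, long to) {
        List<Flow> result = new ArrayList<>();
        for (Flow f : all) {
            if (f.flowName.equals(flowName) && f.startTime >= from && f.startTime < to) {
                result.add(f);
            }
        }
        return result;
    }

    /** Example 2: a single AggregatedFlow summing each metric across those runs. */
    static AggregatedFlow aggregate(List<Flow> all, String flowName, long from, long to) {
        AggregatedFlow agg = new AggregatedFlow(flowName, from, to);
        for (Flow f : runsOf(all, flowName, from, to)) {
            for (Map.Entry<String, Long> e : f.metrics.entrySet()) {
                agg.metrics.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return agg;
    }
}
```

With 10 runs in the range, runsOf returns 10 Flow objects, while aggregate 
returns one object whose startTime and endTime are the bounds of the query.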

For User:
Example 1: 
Query: give me all flows that this user ran over this time range. Returns a 
list of such flows, one flow object for each individual run.

Example 2:
Query: give me how much this user consumed on the cluster during this time 
range. This would return an AggregatedUser object which has the startTime and 
endTime of this aggregation and summations of metrics over that time range. 
Again, for aggregations, we would probably want to aggregate daily, weekly, 
etc., while allowing for custom ranges. 
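
One way the daily/weekly aggregation could work is to precompute daily 
AggregatedUser objects and sum them for longer ranges. The sketch below 
assumes exactly that; the UserRollup wrapper, the rollUp method and the 
metrics map are hypothetical. It only handles ranges that line up with the 
daily buckets; an arbitrary custom range would also need the raw per-run 
data for the partial days at either end.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UserRollup {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    /** Aggregate for one user over [startTime, endTime). */
    static final class AggregatedUser {
        final String user;
        final long startTime, endTime;
        final Map<String, Long> metrics = new HashMap<>();

        AggregatedUser(String user, long startTime, long endTime) {
            this.user = user;
            this.startTime = startTime;
            this.endTime = endTime;
        }
    }

    /** Sum the daily aggregates for this user that lie fully inside [from, to). */
    static AggregatedUser rollUp(List<AggregatedUser> daily, String user, long from, long to) {
        AggregatedUser out = new AggregatedUser(user, from, to);
        for (AggregatedUser d : daily) {
            if (d.user.equals(user) && d.startTime >= from && d.endTime <= to) {
                for (Map.Entry<String, Long> e : d.metrics.entrySet()) {
                    out.metrics.merge(e.getKey(), e.getValue(), Long::sum);
                }
            }
        }
        return out;
    }
}
```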


> [Data Model] create overall data objects of TS next gen
> -------------------------------------------------------
>
>                 Key: YARN-3041
>                 URL: https://issues.apache.org/jira/browse/YARN-3041
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Zhijie Shen
>         Attachments: Data_model_proposal_v2.pdf, YARN-3041.2.patch, 
> YARN-3041.3.patch, YARN-3041.4.patch, YARN-3041.preliminary.001.patch
>
>
> Per design in YARN-2928, create the ATS entity and events API.
> Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, 
> flow, flow run, YARN app, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
