[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561770#comment-14561770 ]

Vrushali C commented on YARN-3051:
----------------------------------

Hi Varun,

Good points. My answers are inline.
bq. We can either return a single timeline entity for a flow ID (having 
aggregated metric values) or multiple entities indicating multiple flow runs 
for a flow ID. I have included an API for the former as of now. I think there 
can be use cases for both though. Vrushali C, did hRaven have the facility for 
both kinds of queries ? I mean, is there a known use case ?

Yes, there are use cases for both. hRaven has APIs for both types of calls, 
though they are named differently. The /flow endpoint in hRaven will return 
multiple flow runs (limited by filters). The /summary endpoint will return 
aggregated values for all the runs of that flow within that time range filter. 
Let me give an example (a hadoop sleep job for simplicity).

Say user janedoe runs a hadoop sleep job 3 times today, ran it 5 times 
yesterday, and ran it 6 times on one day about a month back. Now, we may want 
to see two different things:

#1 Summarized stats for flow “Sleep job” invoked in the last 2 days: it would 
say this flow was run 8 times, the first run was at timestamp X, the last run 
was at timestamp Y, it took up a total of N megabytemillis, had a total of M 
containers across all runs, etc. It tells us how much of the cluster 
capacity a particular flow from a particular user is taking up.

#2 List of flow runs: will show us details about each flow run. If we say 
limit = 3 in the query parameters, it would return the latest 3 runs of this 
flow. If we say limit = 100, it would return all the runs in this particular 
case (including the ones from a month back). If we pass in flowVersion=XXYYZZ, 
then it would return the list of flow runs that match this version. 

For the initial development, I think we may want to work on #2 first (return a 
list of flow runs). The summary API will need aggregation tables, which we can 
add later on; we could file a jira for that, my 2c.
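
To make the two query styles concrete, here is a rough sketch of what the 
reader calls could look like. All names and fields below are just my 
assumptions for illustration, not what is in the patch:

{code:java}
// Rough sketch only (hypothetical names), to illustrate the two query styles:
// #2 returns one entity per flow run, #1 returns a single aggregated entity.
import java.util.List;

public interface FlowReaderSketch {

  /** One invocation of a flow, e.g. a single run of the "Sleep job". */
  class FlowRun {
    String cluster;
    String user;
    String flowName;
    String flowVersion;
    long runTimestamp;        // start time of this run
    long megabyteMillis;      // resource usage of this run
    int numContainers;
  }

  /** Aggregated view over all runs of a flow within a time range (#1). */
  class FlowSummary {
    String cluster;
    String user;
    String flowName;
    int numRuns;              // e.g. 8 runs over the last 2 days
    long firstRunTimestamp;
    long lastRunTimestamp;
    long totalMegabyteMillis;
    int totalContainers;
  }

  /** #2: latest runs of a flow, bounded by limit/version/time filters. */
  List<FlowRun> getFlowRuns(String cluster, String user, String flowName,
      String flowVersion, long startTime, long endTime, int limit);

  /** #1: one aggregated record for the flow over [startTime, endTime). */
  FlowSummary getFlowSummary(String cluster, String user, String flowName,
      long startTime, long endTime);
}
{code}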

bq. Do we plan to include additional info in the user table which can be used 
for filtering user-level entities ? Could not think of any use case but just 
for flexibility I have added filters in the API getUserEntities.

I haven’t looked at the code in detail, but as such, for user-level entities, 
we would want a time range, a limit on the number of records returned, a flow 
name filter, and a cluster name filter.
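
Something like the following is probably enough to carry those filters. It is 
purely illustrative; the class and field names are my assumptions:

{code:java}
// Purely illustrative filter holder for user-level entity queries; names are
// assumptions, not from the patch.
public class UserEntityFilters {
  private final String clusterName;  // restrict to one cluster
  private final String flowName;     // restrict to one flow; may be null for "all"
  private final long windowStart;    // inclusive, epoch millis
  private final long windowEnd;      // exclusive, epoch millis
  private final int limit;           // cap on the number of records returned

  public UserEntityFilters(String clusterName, String flowName,
      long windowStart, long windowEnd, int limit) {
    this.clusterName = clusterName;
    this.flowName = flowName;
    this.windowStart = windowStart;
    this.windowEnd = windowEnd;
    this.limit = limit;
  }

  public String getClusterName() { return clusterName; }
  public String getFlowName() { return flowName; }
  public long getWindowStart() { return windowStart; }
  public long getWindowEnd() { return windowEnd; }
  public int getLimit() { return limit; }
}
{code}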

bq. I have included an API to query flow information based on the appid. As of 
now I return the flow to which the app belongs (includes multiple runs) instead 
of the flow run it belongs to. Which is a more viable scenario ? Or do we need 
to support both ?

An app id can belong to exactly one flow run. The app id is the Hadoop YARN 
application id, which should be unique on the cluster. Given an app id, we 
should be able to look up the exact flow run and return just that. The 
equivalent API in hRaven is /jobFlow.
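
So the reader call for this case could be as simple as the sketch below 
(hypothetical names, analogous in spirit to hRaven's /jobFlow; it reuses the 
FlowRun sketch class from above):

{code:java}
// Hypothetical sketch: given a YARN application id (unique on the cluster),
// return the single flow run it belongs to. Names are assumptions.
public interface AppToFlowReaderSketch {

  /**
   * @param clusterId cluster the application ran on
   * @param appId     YARN application id, e.g. application_1432752000000_0007
   * @return the one flow run this application belongs to, or null if not found
   */
  FlowReaderSketch.FlowRun getFlowRunForApp(String clusterId, String appId);
}
{code}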

bq. But if metrics are aggregated daily and weekly, we won't be able to get 
something like the value of a specific metric for a flow from, say, Thursday 4 
pm to Friday 9 am. Vrushali C, can you confirm ? If this is so, a timestamp 
doesn't make much sense. Dates can be specified instead.

The thinking is to split the querying across tables. We would query both the 
daily summary table for the complete-day details and the regular flow tables 
for the partial-day details, like those from Thursday 4 pm to Friday 9 am. But 
this does mean aggregating on the query side. So, I think, for starters, we 
could allow date boundaries and enhance the API to accept finer timestamps 
later.
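
A rough sketch of that query-side split is below. It assumes UTC day 
boundaries for illustration, which may or may not match whatever boundary the 
daily aggregation ends up using:

{code:java}
// Rough sketch of the query-side split: whole days would be served from the
// daily summary table, the partial-day edges from the regular flow tables, and
// the reader aggregates the pieces. Assumes UTC day boundaries.
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class TimeRangeSplitter {

  /** Simple [start, end) interval in epoch millis. */
  public static class Interval {
    public final long start;
    public final long end;
    public Interval(long start, long end) { this.start = start; this.end = end; }
  }

  /**
   * @return { headPartialDay, wholeDays, tailPartialDay }; an entry is null
   *         when that piece of the window is empty.
   */
  public static Interval[] split(long startMillis, long endMillis) {
    Instant firstMidnight = Instant.ofEpochMilli(startMillis)
        .truncatedTo(ChronoUnit.DAYS).plus(1, ChronoUnit.DAYS);
    Instant lastMidnight = Instant.ofEpochMilli(endMillis)
        .truncatedTo(ChronoUnit.DAYS);

    if (!firstMidnight.isBefore(lastMidnight)) {
      // No complete day inside the window (e.g. Thursday 4 pm to Friday 9 am):
      // everything comes from the flow tables.
      return new Interval[] { new Interval(startMillis, endMillis), null, null };
    }
    Interval head = new Interval(startMillis, firstMidnight.toEpochMilli());
    Interval wholeDays =
        new Interval(firstMidnight.toEpochMilli(), lastMidnight.toEpochMilli());
    Interval tail = new Interval(lastMidnight.toEpochMilli(), endMillis);
    return new Interval[] { head, wholeDays, tail };
  }
}
{code}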

bq. Will there be queue table(s) in addition to user table(s) ? If yes, how 
will queue data be aggregated ? Based on entity type ? I may need an additional 
API for queues then.

Yes, we would need a queue-based aggregation table. Right now, those details 
are still to be worked out. So perhaps we can leave aside the queue-based APIs 
(or file a different jira to handle them).

Hope this helps. I can give you more examples if you would like more details or 
have any other questions. I will also look at the patch this week. Also, we 
should ensure we use the same classes/methods for key-related construction and 
parsing (flow keys, row keys) across the reader APIs and writer APIs, else they 
will diverge.
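
For example, something along these lines would keep both sides in sync; the 
separator and field order below are just placeholders, not the actual schema:

{code:java}
// Hypothetical sketch of a single key helper shared by the writer and reader
// paths, so row-key construction and parsing cannot diverge. The separator and
// field order are placeholders.
public final class FlowRunRowKey {
  private static final String SEP = "!";

  public final String cluster;
  public final String user;
  public final String flowName;
  public final long runId;

  public FlowRunRowKey(String cluster, String user, String flowName, long runId) {
    this.cluster = cluster;
    this.user = user;
    this.flowName = flowName;
    this.runId = runId;
  }

  /** Writer side: build the row key when storing a flow run. */
  public String encode() {
    return cluster + SEP + user + SEP + flowName + SEP + runId;
  }

  /** Reader side: parse a row key obtained from a scan. */
  public static FlowRunRowKey parse(String rowKey) {
    String[] parts = rowKey.split(SEP, 4);
    return new FlowRunRowKey(parts[0], parts[1], parts[2], Long.parseLong(parts[3]));
  }
}
{code}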

thanks
Vrushali


> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.


