[
https://issues.apache.org/jira/browse/YARN-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122159#comment-16122159
]
Vrushali C edited comment on YARN-6861 at 8/10/17 7:23 PM:
-----------------------------------------------------------
[~rohithsharma], [~varun_saxena] and I discussed this in our weekly call today.
The discussion was about the naming of these APIs. The consensus was that, for
now, we will proceed with this patch.
We discussed whether we should call it something other than
/users/<user>/entities/<entityid>/ to indicate that these entities are being
queried without knowledge of the YARN application id.
At present, these APIs will return sub-application entities. For example,
consider a query that a user "userA" runs on a Tez setup. This user is
different from the user, say "userYARN", who is running the Tez AM.
Note 1:
Only entities from such queries will go to two places in the backend:
- in the entity table, within the context of an application:
{code} userYARN / cluster / flow / flowrun id / appid / entity {code}
- in the sub application table, outside the context of an application:
{code} userA / cluster / entity {code}
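As a rough illustration of the two destinations in Note 1, here is a minimal Python sketch. The class, the function names, the tuple representation and the sample values are all assumptions made for illustration; the actual row-key encoding used by the storage layer is not described in this comment.

```python
from typing import NamedTuple, Tuple


class SubAppEntityWrite(NamedTuple):
    """Hypothetical description of one entity written by a sub-application query."""
    app_user: str     # user running the AM, e.g. "userYARN"
    query_user: str   # user who ran the query, e.g. "userA"
    cluster: str
    flow: str
    flow_run_id: str
    app_id: str
    entity: str


def entity_table_key(w: SubAppEntityWrite) -> Tuple[str, ...]:
    # Within the context of an application:
    # userYARN / cluster / flow / flowrun id / appid / entity
    return (w.app_user, w.cluster, w.flow, w.flow_run_id, w.app_id, w.entity)


def sub_application_table_key(w: SubAppEntityWrite) -> Tuple[str, ...]:
    # Outside the context of an application: userA / cluster / entity
    return (w.query_user, w.cluster, w.entity)


# Sample values below are made up.
write = SubAppEntityWrite(
    app_user="userYARN",
    query_user="userA",
    cluster="cluster1",
    flow="flowA",
    flow_run_id="1",
    app_id="app_1",
    entity="query_1",
)
print(" / ".join(entity_table_key(write)))
print(" / ".join(sub_application_table_key(write)))
```

The second key carries no application id, which is why a reader can look such an entity up knowing only the user and the cluster.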
Note 2:
In this same example, the Tez AM itself writes some lifecycle events and
metrics of its containers. These will go only to the entity table for user
"userYARN".
The reader APIs in this patch are going to return data that belongs to the
context of entities stored outside of an application, that is, from the sub
application table.
The reader APIs like
{code}
GET /ws/v2/timeline/clusters/{cluster name}/apps/{app id}/entities/{entity type}
or
GET /ws/v2/timeline/apps/{app id}/entities/{entity type}
{code}
will return all entities, that is, entities written in "Note 1" as well as
those written in "Note 2".
The reader APIs in this patch will return a subset of entities, those written
in "Note 1".
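The relationship between the two kinds of reader can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the two dictionaries stand in for the entity table and the sub application table, and every function name and sample id here is hypothetical rather than taken from the patch.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for the two backend tables.
entity_table: Dict[str, List[str]] = {}    # app id -> entity ids
sub_app_table: Dict[str, List[str]] = {}   # querying user -> entity ids


def write_sub_app_entity(app_id: str, query_user: str, entity_id: str) -> None:
    """Note 1: an entity from a user's query is written to both tables."""
    entity_table.setdefault(app_id, []).append(entity_id)
    sub_app_table.setdefault(query_user, []).append(entity_id)


def write_am_entity(app_id: str, entity_id: str) -> None:
    """Note 2: an entity written by the AM goes only to the entity table."""
    entity_table.setdefault(app_id, []).append(entity_id)


def read_app_entities(app_id: str) -> List[str]:
    """Application-scoped reader: sees Note 1 and Note 2 entities."""
    return list(entity_table.get(app_id, []))


def read_user_entities(user: str) -> List[str]:
    """Reader without an application id: sees only Note 1 entities."""
    return list(sub_app_table.get(user, []))


write_sub_app_entity("app_1", "userA", "query_1")
write_am_entity("app_1", "container_1")

print(read_app_entities("app_1"))
print(read_user_entities("userA"))
```

In this model, whatever the user-scoped reader returns is always also visible through the application-scoped reader, but not the other way around.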
The point we discussed was that when we move on to having user level (and queue
level) aggregations, we would need reader APIs to return that data. For
example, an API that returns, say, megabytemillis (or all MR counters) for a
user within a time range, such as the last week. These APIs help understand the
usage of a user or queue on the cluster. This data is aggregated data, and those
APIs could likely have a similar API format, perhaps /users/<userid>/entities.
In that case, we could call that API /usersummary/<userid>/entities.
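To illustrate the kind of user level aggregation described above, here is a small Python sketch that sums one metric for one user over a time range. No such API exists in this patch; the record type, the function name, the half-open time range and the sample data are all assumptions for illustration.

```python
from typing import Iterable, NamedTuple


class MetricSample(NamedTuple):
    """Hypothetical record of one metric value reported for a user."""
    user: str
    timestamp_ms: int
    name: str
    value: int


def summarize_user_metric(
    samples: Iterable[MetricSample],
    user: str,
    metric: str,
    start_ms: int,
    end_ms: int,
) -> int:
    """Sum `metric` for `user` over the half-open range [start_ms, end_ms)."""
    return sum(
        s.value
        for s in samples
        if s.user == user
        and s.name == metric
        and start_ms <= s.timestamp_ms < end_ms
    )


# Made-up sample data.
samples = [
    MetricSample("userA", 1_000, "megabytemillis", 500),
    MetricSample("userA", 2_000, "megabytemillis", 300),
    MetricSample("userA", 9_000, "megabytemillis", 700),  # outside the range
    MetricSample("userB", 1_500, "megabytemillis", 900),  # different user
]
total = summarize_user_metric(samples, "userA", "megabytemillis", 0, 5_000)
print(total)
```

A real aggregation would presumably be computed in the backend rather than over raw samples at read time; the sketch only shows the shape of the question such an API answers.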
As of now, we will proceed with this patch.
> Reader API for sub application entities
> ---------------------------------------
>
> Key: YARN-6861
> URL: https://issues.apache.org/jira/browse/YARN-6861
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelinereader
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Attachments: YARN-6861-YARN-5355.001.patch,
> YARN-6861-YARN-5355.002.patch
>
>
> YARN-6733 and YARN-6734 write data into the sub application table. There
> should be a way to read those entities.