[jira] [Commented] (YARN-6733) Add table for storing sub-application entities

Vrushali C (JIRA) Wed, 12 Jul 2017 22:46:25 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085210#comment-16085210
 ]


Vrushali C commented on YARN-6733:
----------------------------------

bq. Unlike entity table schema, if user is the preference across clusters then 
I think row key should start with subAppUser name 

For the entity table, we put the username first because we wanted frequent 
writes by one user to go to the same regionserver. That way, a user who is 
heavy on writes does not affect another user with fewer writes. But that 
reasoning holds for entities, since we would write a lot of them.

As such, a cluster ! user prefix seems more appropriate for nesting, so for sub 
application entities I believe cluster!user would be a good prefix. 
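
To illustrate why the leading key component matters: HBase keeps rows sorted by row key, so rows that share a prefix sit next to each other. Below is a minimal sketch using plain strings and made-up cluster, user and entity names. It is not the real ATSv2 key encoding, and the helper names are hypothetical.

```python
# Hypothetical (cluster, user, entity) writes; all names are made up.
writes = [
    ("cluster1", "alice", "e1"),
    ("cluster2", "alice", "e2"),
    ("cluster1", "bob", "e3"),
    ("cluster2", "bob", "e4"),
]

def user_first(cluster, user, entity):
    # user ! cluster ! entity: all rows of one user are contiguous.
    return "!".join([user, cluster, entity])

def cluster_first(cluster, user, entity):
    # cluster ! user ! entity: rows nest by cluster, then by user.
    return "!".join([cluster, user, entity])

# Sorting stands in for the lexicographic row-key order of the table.
user_first_keys = sorted(user_first(*w) for w in writes)
cluster_first_keys = sorted(cluster_first(*w) for w in writes)

for key in user_first_keys:
    print("user-first   ", key)
for key in cluster_first_keys:
    print("cluster-first", key)
```

With the user first, both of alice's rows are adjacent regardless of cluster. With the cluster first, each cluster's rows are adjacent and users nest inside it.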

bq. IIRC, we have entity type and entity id to distinguish between the entity 
so sub app name not required right? Am I missing anything?

Hmm, so here is how I see this. Let's say a user is running a particular query 
again and again on this Tez setup. Each time the query is run, it will write 
things to ATSv2. Let's call that query "queryA". Each run of this query would 
(should) generate a different entity id.

For example, to store the status of this query, I think the key would then be
cluster ! sub app user ! queryA ! DAG ! 1234 ! queryA_1499924426 
with column name "status" and perhaps value of "SUCCESS".  

Say next time it runs, they can write
cluster ! sub app user ! queryA ! DAG ! 1234 ! queryA_1450024426 

Hence the sub app name and entity id. What do you think? I could remove the 
sub-app name and keep only the entity id, but each time they run this query the 
framework has to generate a new entity id anyway, so the row key I am 
proposing gives them a way to look at different entities within the same 
sub app. 
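
To make the proposal concrete, here is a rough sketch of how such a key could be composed and how keeping the sub app name lets a reader pull up all runs of one query. Everything in it is an assumption for illustration: the "!" separator is only the display form used in this thread, the helper names are hypothetical, the thread does not name the `1234` component (the sketch calls it `id_prefix` as a guess), and the `queryB` row is invented. The real table would use the HBase client and its own key encoding.

```python
SEP = "!"  # display separator from this thread; the stored encoding may differ

def row_key(cluster, sub_app_user, sub_app_name, entity_type, id_prefix, entity_id):
    # cluster ! sub app user ! sub app name ! entity type ! id_prefix ! entity id
    parts = [cluster, sub_app_user, sub_app_name, entity_type, str(id_prefix), entity_id]
    return SEP.join(parts)

def scan_prefix(keys, *prefix_parts):
    # Stand-in for a prefix scan: return the sorted keys under the given prefix.
    prefix = SEP.join(prefix_parts) + SEP
    return sorted(k for k in keys if k.startswith(prefix))

keys = [
    row_key("cluster", "subAppUser", "queryA", "DAG", 1234, "queryA_1499924426"),
    row_key("cluster", "subAppUser", "queryA", "DAG", 1234, "queryA_1450024426"),
    # Hypothetical second query, added only to show that the prefix filters it out.
    row_key("cluster", "subAppUser", "queryB", "DAG", 1234, "queryB_1499924500"),
]

# All runs of queryA by this sub app user, without knowing any entity id up front.
query_a_runs = scan_prefix(keys, "cluster", "subAppUser", "queryA")
for key in query_a_runs:
    print(key)
```

If the sub app name were dropped from the key, the same lookup would need the caller to know, or filter on, the individual entity ids.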

But now I am wondering if we should make it simple and keep only the entity id 
and not have any sub app name. 


> Add table for storing sub-application entities
> ----------------------------------------------
>
>                 Key: YARN-6733
>                 URL: https://issues.apache.org/jira/browse/YARN-6733
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
>         Attachments: IMG_7040.JPG, YARN-6733-YARN-5355.001.patch
>
>
> After a discussion with Tez folks, we have been thinking about introducing a 
> table to store sub-application information.
> For example, a Tez session may run for a certain period as User X and run a 
> few AMs. These AMs accept DAGs from other users. Tez will execute these DAGs 
> with a doAs user. ATSv2 should store this information in a new table, perhaps 
> called the "sub_application" table. 
> This jira tracks the code changes needed for table schema creation.
> I will file other jiras for writing to that table, updating the user name 
> fields to include the sub-application user, etc.



