[
https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335849#comment-16335849
]
Jason Lowe commented on YARN-3895:
----------------------------------
bq. Yes, doing a lookup in two tables at read time (regular entity table and
'domain' or 'ACLs' table) would be very slow in HBase.
If desired this could be changed from a read-time lookup to a write-time
lookup. In other words, the collector could be responsible for
translating/expanding the ACL identifier into the actual ACLs when writing the
row. The collector could then cache these ACL IDs so very few writes would
require a lookup. It is _very_ likely that the ACL ID doesn't change between
entity posts. This would mean that ACLs could not be easily updated once
specified, since all existing rows would need to be rewritten. That will be
true even without a domain/ACL ID for indirection on writes, though, given the
proposal to replicate the ACLs on each entity row.
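A minimal sketch of the write-time approach described above, assuming a hypothetical collector-side cache keyed by ACL ID. The class name, the method names, and the use of a plain list of reader names are illustrative assumptions, not part of the ATSv2 code base; the backing store is passed in as a function so the sketch stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical collector-side cache that expands an ACL ID into its readers. */
final class AclCache {
    // ACL ID -> expanded reader list; filled lazily on first use.
    private final Map<String, List<String>> readersByAclId = new ConcurrentHashMap<>();
    // Stand-in for the lookup against the 'domain'/'ACLs' table (assumed interface).
    private final Function<String, List<String>> aclStore;
    // Counts how many times the backing store was actually consulted.
    private final AtomicInteger storeLookups = new AtomicInteger();

    AclCache(Function<String, List<String>> aclStore) {
        this.aclStore = aclStore;
    }

    /** Returns the readers for aclId, hitting the backing store only on a cache miss. */
    List<String> resolve(String aclId) {
        return readersByAclId.computeIfAbsent(aclId, id -> {
            storeLookups.incrementAndGet();
            return aclStore.apply(id);
        });
    }

    int storeLookups() {
        return storeLookups.get();
    }
}
```

With a cache like this, the collector would call `resolve(aclId)` before writing each entity row and copy the result into the row. Because the ACL ID rarely changes between posts, nearly every call would be served from memory. The sketch does no eviction or invalidation, which matches the caveat that ACLs are hard to update once written.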
bq. How much big would be ACL's size?
ACLs aren't going to be hundreds of kilobytes, but they could get larger than
is typical when an ACL is an explicit list of many users and/or groups. That's
one of the reasons ATS v1 made this indirect via domains: ACLs are sent only
once per DAG, and a very small bit of info on each post ties the entity to its
corresponding ACL.
Also, as alluded to above, what's the plan for updating ACLs after the
application has completed? I assume this would require a full rewrite of every
ACL column on every entity posted by the application. I don't expect that to be
a common occurrence, but will it be supported, or will it only be possible via
HBase admin intervention to doctor the database?
bq. The ACLs details need to sent one time per entity-id. ACLs object will
contains only reader details which is similar to TimelineDomain#reader field.
Any update for entity-id need not to send acls details again.
Isn't this essentially sending the ACLs on most posts? If we need to avoid
HBase double lookups on reads then the ACL has to be in the entity row data,
correct? For Tez I believe a large chunk of the posting is going to be new
entities and not updates to existing ones. An application like Tez will end up
sending full ACLs on about 50% of its posts. (I think most entities have just
a start event and a stop event.)
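The 50% estimate follows from simple counting, sketched below under the assumption stated above that ACLs go out once per entity and that a typical entity is posted twice (start and stop). The class and method names are hypothetical.

```java
/** Hypothetical helper: share of posts that must carry full ACLs. */
final class AclPostShare {
    /**
     * If ACLs are sent once per entity (on its first post), the share of
     * posts carrying ACLs is entities / (entities * postsPerEntity).
     */
    static double shareOfPostsWithAcls(long entities, long postsPerEntity) {
        if (entities <= 0 || postsPerEntity <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        long totalPosts = entities * postsPerEntity;
        return (double) entities / totalPosts;
    }
}
```

For example, `AclPostShare.shareOfPostsWithAcls(1000, 2)` evaluates to 0.5, which is the "about 50%" figure. The share drops only if entities receive many updates after their first post.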
> Support ACLs in ATSv2
> ---------------------
>
> Key: YARN-3895
> URL: https://issues.apache.org/jira/browse/YARN-3895
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Priority: Major
> Labels: YARN-5355
>
> This JIRA is to keep track of authorization support design discussions for
> both readers and collectors.