[
https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353115#comment-16353115
]
Vrushali C commented on YARN-3895:
----------------------------------
Here is the design we arrived at after several rounds of discussion in the
community. Thanks to [~jlowe], [~jrottinghuis] and [~lohit] for discussing it
with us (me, [~rohithsharma] and [~varun_saxena]).
- We will go with the domain concept as in ATSv1. Entities will be written with
a TimelineDomain (like in ATSv1) and there will be putDomain calls just like
ATSv1.
- The domain information will be persisted to the backend in a domain table.
- The domain information will also be retained in the TimelineCollector. This
now makes the Timeline Collector stateful.
- If a timeline collector goes down (for whatever reason) and comes back up, it
knows which app ids it had in memory. In this specific case, the collector will
“refresh” its ACL state by reading the domain ids for those app ids back from
HBase.
- Each time an entity is received by the collector, it looks up the app id +
domain id in its memory and appends the TimelineDomain to the entity.
- The entity, when written to HBase, carries not only the domain id but also
the TimelineDomain information.
- Thus, each row in HBase will have the ACLs info which can be used for
filtering at read time.
- When a read request comes in, the user and the user’s groups will be sent to
the HBase cluster in the scan/get request, and a check will be performed on the
region server to determine whether this user is allowed to read that entity,
based on the user and group membership.
- Since we want to evaluate groups of groups (nested group memberships), this
check will be a UserGroupInformation check, just like in any other YARN ACL
evaluation. This implies that the YARN cluster and the HBase cluster must have
the same username and group LDAP mappings for the checks to work as expected.
- I believe this would be done within a coprocessor, but I will check whether
there is any other way to run Java code as part of a scan's column value filter
operation.
- If the querying user is a YARN admin, then no checks are necessary.
- If the ACLs for a domain id need to be updated, that will mean scanning
through the set of entities for that application id and updating the domain
information for each of them.
- The domain table will have the domain id as row key and the other fields in
the TimelineDomain object as columns. A single column family is probably enough.
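To make the write and read paths above concrete, here is a minimal Python
sketch. It is not ATSv2 code: Domain, Collector, stamp and can_read are
hypothetical names, readers are modelled as plain sets of users and groups,
granting the owner read access is an assumption, and group membership is passed
in directly rather than resolved through UserGroupInformation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical stand-in for the TimelineDomain fields used for ACLs."""
    domain_id: str
    owner: str
    reader_users: frozenset = frozenset()
    reader_groups: frozenset = frozenset()


class Collector:
    """Keeps (app id, domain id) -> Domain in memory, as the design describes."""

    def __init__(self):
        self._domains = {}

    def put_domain(self, app_id, domain):
        # In the design this would also be persisted to the domain table.
        self._domains[(app_id, domain.domain_id)] = domain

    def stamp(self, app_id, domain_id, entity):
        """Return a copy of the entity with the domain's ACL fields appended."""
        # Raises KeyError if the domain is unknown; a real collector would
        # presumably fall back to reading the domain table.
        domain = self._domains[(app_id, domain_id)]
        return {
            **entity,
            "domain_id": domain.domain_id,
            "owner": domain.owner,
            "reader_users": domain.reader_users,
            "reader_groups": domain.reader_groups,
        }


def can_read(row, user, groups, admins=frozenset()):
    """Read-time filter: admins bypass the check, others need owner/reader access."""
    if user in admins:
        return True
    if user == row["owner"]:  # assumption: the domain owner can always read
        return True
    if user in row["reader_users"]:
        return True
    return bool(row["reader_groups"] & set(groups))
```

Because every stored row carries its own ACL fields, can_read needs nothing but
the row and the caller's identity, which is what would let the check run next
to the data on the region server.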
Details per table in HBase:
- Domain table schema
  Rowkey: domain id
  ColumnFamily: i (stands for info)
  Columns (listing a few here; there can be others):
    - application_id
    - created time
    - description
    - modified time
    - owner
    - readers
    - writers (not used, but can be stored for completeness)
  We can consider setting a high compression level for this table, since we do
  not anticipate reading from it frequently.
- Entity, SubApplication and Application tables. Each can store the domain id
as a column and the fields of the domain object as separate columns.
- FlowRun table. We can start with doing a union of ACLs for all applications
within a flow run.
- FlowActivity table. We can start by doing a union of ACLs for all runs in a
flow in that time frame. This may turn out to be a bit more involved. Let’s
discuss it on the JIRA we file for this.
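The union of ACLs proposed for the FlowRun and FlowActivity tables could look
roughly like the Python sketch below. parse_acl and union_readers are
hypothetical helpers; the sketch assumes readers are stored in the Hadoop ACL
string form (comma-separated users, a space, then comma-separated groups) and
does not handle the "*" wildcard.

```python
def parse_acl(acl):
    """Split 'user1,user2 group1,group2' into (users, groups) sets."""
    # Split on the first space only: users come before it, groups after it.
    parts = acl.split(" ", 1)
    users = {u.strip() for u in parts[0].split(",") if u.strip()}
    groups = set()
    if len(parts) > 1:
        groups = {g.strip() for g in parts[1].split(",") if g.strip()}
    return users, groups


def union_readers(acls):
    """Union the readers of every application in a flow run."""
    users, groups = set(), set()
    for acl in acls:
        app_users, app_groups = parse_acl(acl)
        users |= app_users
        groups |= app_groups
    return users, groups
```

One consequence worth keeping in mind: with a union, anyone who can read a
single application in the flow run can read the aggregated flow run row.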
thanks
Vrushali
> Support ACLs in ATSv2
> ---------------------
>
> Key: YARN-3895
> URL: https://issues.apache.org/jira/browse/YARN-3895
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Vrushali C
> Priority: Major
> Labels: YARN-5355
>
> This JIRA is to keep track of authorization support design discussions for
> both readers and collectors.