[
https://issues.apache.org/jira/browse/YARN-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236192#comment-15236192
]
Vinod Kumar Vavilapalli commented on YARN-3452:
-----------------------------------------------
Old JIRA.
bq. However YARN really should not be using bogus users on tokens anyway in
case the RPC layer (or other non-YARN systems) try to do something with those
users like HADOOP-10650 did.
bq. if someone else tries to do something with the ugi assuming it actually was
a valid user.
They aren't really bogus. For many of these calls, we need them to identify the
incoming security context / identity when per-app or per-container tokens are
used. Server logging and audit logs can also depend on this for operations where
the identifier should be the app, the container, etc.
Even though our core layer is named UserGroupInformation, in many parts of
YARN and MapReduce (other than application submission) it is used as a way of
propagating "IdentityInformation". Arguably, the server-side code could simply
look at the incoming tokens, find the incoming ID, and ignore the user-name
altogether. On the flip side, the service-level authorization layer
(hadoop-policy.xml) etc. are obviously wired into it as system-level users
(HADOOP-10650 being a symptom), so I agree with you that it is kind of
disconnected.
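To make the "IdentityInformation" idea concrete, here is a minimal sketch of a server-side identity that records what kind of entity a name refers to, so that group lookups are attempted only for real users while audit logging can still use the app / container identifier. All names in it (IdentitySketch, CallerIdentity, IdentityKind, groupLookupName, auditName) are hypothetical and are not Hadoop or YARN APIs; this is not how UserGroupInformation is implemented.

```java
import java.util.Optional;

// Hypothetical sketch only: none of these types exist in Hadoop or YARN.
public class IdentitySketch {

    /** What kind of entity the identity string names. */
    public enum IdentityKind { KERBEROS_USER, APPLICATION_ATTEMPT, CONTAINER }

    /** Stand-in for a caller identity derived from Kerberos or from a token. */
    public static final class CallerIdentity {
        public final IdentityKind kind;
        public final String name;

        public CallerIdentity(IdentityKind kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        /** Only real users can belong to groups; token-derived IDs cannot. */
        public boolean isRealUser() {
            return kind == IdentityKind.KERBEROS_USER;
        }
    }

    /** Name to pass to a group lookup, or empty when a lookup would be pointless. */
    public static Optional<String> groupLookupName(CallerIdentity id) {
        return id.isRealUser() ? Optional.of(id.name) : Optional.empty();
    }

    /** Audit logs can still record the identity, whatever kind it is. */
    public static String auditName(CallerIdentity id) {
        return id.kind + ":" + id.name;
    }

    public static void main(String[] args) {
        CallerIdentity user =
            new CallerIdentity(IdentityKind.KERBEROS_USER, "alice");
        // Illustrative application attempt ID used as a token "username".
        CallerIdentity attempt =
            new CallerIdentity(IdentityKind.APPLICATION_ATTEMPT,
                               "appattempt_1428000000000_0001_000001");

        System.out.println(auditName(user) + " lookup=" + groupLookupName(user));
        System.out.println(auditName(attempt) + " lookup=" + groupLookupName(attempt));
    }
}
```

The point of the sketch is only that the identity kind travels with the name; doing this in practice would mean touching every API that currently assumes a plain user-name.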
Most of this code goes all the way back to when I originally implemented
security for YARN, and I borrowed this way of doing things strictly from how
JobTokens were done in Hadoop 1.x MapReduce. I doubt we can change this now:
we would have to change every API that depends on this, and their usages, to
understand both the user-name and the specific identifier (like Application ID).
Given that HADOOP-12413 took care of the invalid group lookups, we are good for
now. Changing the usage of UGI to only use real Kerberos names is likely going
to be a huge change in YARN / MR.
> Bogus token usernames cause many invalid group lookups
> ------------------------------------------------------
>
> Key: YARN-3452
> URL: https://issues.apache.org/jira/browse/YARN-3452
> Project: Hadoop YARN
> Issue Type: Bug
> Components: security
> Reporter: Jason Lowe
> Attachments: tactical_defense.patch
>
>
> YARN uses a number of bogus usernames for tokens, like application attempt
> IDs for NM tokens or even the hardcoded "testing" for the container localizer
> token. These tokens cause the RPC layer to do group lookups on these bogus
> usernames which will never succeed but can take a long time to perform.