[ 
https://issues.apache.org/jira/browse/YARN-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236192#comment-15236192
 ] 

Vinod Kumar Vavilapalli commented on YARN-3452:
-----------------------------------------------

Old JIRA.

bq. However YARN really should not be using bogus users on tokens anyway in 
case the RPC layer (or other non-YARN systems) try to do something with those 
users like HADOOP-10650 did.
bq. if someone else tries to do something with the ugi assuming it actually was 
a valid user.
They aren't really bogus. For many of these calls, we need them to identify 
the incoming security context / identity when per-app or per-container tokens 
are used. Server logging / audit logs can also depend on this for operations 
where the identifier should be the app, container, etc.

Even though our core layer is named UserGroupInformation, in many parts of 
YARN and MapReduce (other than application submission) it is used as a way of 
propagating "IdentityInformation". Arguably, the server-side code could simply 
look at the incoming tokens, find the incoming ID, and ignore the user name 
altogether. On the flip side, the service-level authorization layer 
(hadoop-policy.xml) etc. is obviously wired into it as system-level users 
(HADOOP-10650 being a symptom), so I agree with you that it is kind of 
disconnected.
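
To illustrate the "look at the incoming tokens and ignore the user name" 
alternative, here is a minimal sketch. Every type and method name in it 
(IdentitySketch, TokenId, Caller, subjectFor, auditName) is a hypothetical 
stand-in invented for the example, not a Hadoop or YARN class, and the token 
kind string is only a placeholder.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration only; none of these are Hadoop classes.
final class IdentitySketch {

    /** A token identifier: the kind of token and the ID it was issued for. */
    static final class TokenId {
        final String kind;
        final String subjectId;

        TokenId(String kind, String subjectId) {
            this.kind = kind;
            this.subjectId = subjectId;
        }
    }

    /** The caller as the server sees it: a user name plus any token identifiers. */
    static final class Caller {
        final String userName;
        final List<TokenId> tokenIds;

        Caller(String userName, List<TokenId> tokenIds) {
            this.userName = userName;
            this.tokenIds = tokenIds;
        }
    }

    /** Returns the ID carried by the first token of the given kind, if any. */
    static Optional<String> subjectFor(Caller caller, String kind) {
        for (TokenId id : caller.tokenIds) {
            if (id.kind.equals(kind)) {
                return Optional.of(id.subjectId);
            }
        }
        return Optional.empty();
    }

    /** Prefers the token's ID for logging; falls back to the user name. */
    static String auditName(Caller caller, String kind) {
        return subjectFor(caller, kind).orElse(caller.userName);
    }
}
```

With something like this, the server would take the app / container ID from 
the token identifier itself, and the user name would only matter for callers 
that arrive without such a token.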

Most of this code goes all the way back to when I originally implemented 
security for YARN, and I borrowed this way of doing things strictly from how 
JobTokens were done in Hadoop 1.x MapReduce. I doubt we can change this now: 
we would have to change each and every API that depends on this, and their 
usages, to understand both the user name and the specific identifier (like 
the Application ID).

Given that HADOOP-12413 took care of the invalid group lookups, we are good 
for now. Changing the usage of UGI to only use real Kerberos names is likely 
going to be a huge change in YARN / MR.


> Bogus token usernames cause many invalid group lookups
> ------------------------------------------------------
>
>                 Key: YARN-3452
>                 URL: https://issues.apache.org/jira/browse/YARN-3452
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: security
>            Reporter: Jason Lowe
>         Attachments: tactical_defense.patch
>
>
> YARN uses a number of bogus usernames for tokens, like application attempt 
> IDs for NM tokens or even the hardcoded "testing" for the container localizer 
> token.  These tokens cause the RPC layer to do group lookups on these bogus 
> usernames which will never succeed but can take a long time to perform.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
