[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484201#comment-14484201
 ] 

Daryn Sharp commented on YARN-3055:
-----------------------------------

This appears to go back to the really old days of renewing the token for its 
entire lifetime.  Most unfortunate.

The renewer looks like it may turn into a DoS weapon.  Renewing a token returns 
the next expiration.  The renewer uses a timer to renew once 90% of the time 
remaining before expiration has elapsed.  After the last effective renewal, 
every renewal returns the same expiration ("the wall") as before.  Each 90% 
step toward "the wall" is shorter than the previous one, so the timer 
eventually degenerates into rapid-fire renewal.  There's an army of 50 threads 
prepared to fire concurrently.
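
To make the arithmetic concrete, here is a minimal sketch, not the actual 
DelegationTokenRenewer code.  It assumes the next renewal is scheduled after 
90% of the time remaining until expiration, and that the expiration returned 
by each renewal stays fixed at "the wall".  The class name, method names, and 
the 24 hour wall are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class RenewalWallDemo {
    // Assumed scheduling rule: wait 90% of the time remaining until expiration.
    static long nextDelayMs(long nowMs, long expirationMs) {
        return (expirationMs - nowMs) * 9 / 10;
    }

    // Simulates repeated renewals that always return the same expiration
    // ("the wall") and records the delay before each renewal.
    static List<Long> simulate(long wallMs, int renewals) {
        List<Long> delays = new ArrayList<>();
        long nowMs = 0;
        for (int i = 0; i < renewals; i++) {
            long delay = nextDelayMs(nowMs, wallMs);
            delays.add(delay);
            nowMs += delay; // the timer fires and the token is "renewed"
        }
        return delays;
    }

    public static void main(String[] args) {
        long wallMs = 24L * 60 * 60 * 1000; // hypothetical wall, 24 hours away
        for (long delay : simulate(wallMs, 12)) {
            System.out.println(delay + " ms until next renewal");
        }
    }
}
```

Under these assumptions each delay is a tenth of the previous one, and after 
a handful of renewals the delay rounds down to 0 ms and stays there.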

My other concern is that it used to be the first job submitted with a given 
token that determined if the token is to be cancelled.  Now any job can 
influence the cancelling.  This patch didn't specifically break that behavior, 
but the original YARN-2704 did, which precipitated YARN-2964 to break it 
differently, and now this jira.

The ramification is we used to tell users to make sure the first job set the 
conf correctly, and essentially don't worry after that.  Now they do have to 
worry.  Any sub-job with the default of canceling tokens will kill the overall 
workflow.  Sub-jobs should not have jurisdiction over the tokens.
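
The old "first job decides" behavior can be illustrated with a small sketch.  
This is not the YARN code; the class and method names are hypothetical, and 
the default when no job has registered the token is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class FirstJobCancelPolicy {
    // Token id -> whether the token should be cancelled when its jobs finish.
    private final Map<String, Boolean> cancelAtEndByToken = new HashMap<>();

    // Only the first job to register a token sets the policy;
    // later jobs (sub-jobs) sharing the token are ignored.
    void register(String tokenId, boolean cancelAtEnd) {
        cancelAtEndByToken.putIfAbsent(tokenId, cancelAtEnd);
    }

    // Assumption: an unregistered token is not cancelled.
    boolean shouldCancel(String tokenId) {
        return cancelAtEndByToken.getOrDefault(tokenId, false);
    }
}
```

With this policy, a launcher job that registers token1 with cancelAtEnd=false 
keeps the token alive even if a later sub-job registers the same token with 
the default of true.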

> The token is not renewed properly if it's shared by jobs (oozie) in 
> DelegationTokenRenewer
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-3055
>                 URL: https://issues.apache.org/jira/browse/YARN-3055
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: security
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>            Priority: Blocker
>         Attachments: YARN-3055.001.patch, YARN-3055.002.patch
>
>
> After YARN-2964, there is only one timer to renew the token if it's shared by 
> jobs. 
> In {{removeApplicationFromRenewal}}, when removing a token that is still 
> shared by other jobs, we do not cancel the token. 
> In that case we should also not cancel the _timerTask_ or remove the token 
> from {{allTokens}}. Otherwise, the already-submitted applications that 
> share this token will no longer have it renewed, and for newly submitted 
> applications that share this token, the token will be renewed immediately.
> For example, suppose three applications, app1, app2, and app3, share 
> token1. Consider the following scenario:
> *1).* app1 is submitted first, then app2, and then app3. In this case, 
> there is only one renewal timer for token1, and it is scheduled when app1 
> is submitted.
> *2).* app1 finishes, and the renewal timer is cancelled. token1 is no longer 
> renewed, but app2 and app3 still use it, which is the problem.
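
The behavior the description asks for can be sketched as follows: keep the 
single renewal timer for a shared token until the last application using it 
is removed.  This is only an illustration, not the patch itself.  The class 
and method names are hypothetical, and a set of token ids stands in for the 
real timer tasks.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SharedTokenRegistry {
    // Token id -> applications currently using that token.
    private final Map<String, Set<String>> appsByToken = new HashMap<>();
    // Token ids that currently have a renewal timer (stand-in for timer tasks).
    private final Set<String> activeTimers = new HashSet<>();

    void addApplication(String appId, String tokenId) {
        appsByToken.computeIfAbsent(tokenId, k -> new HashSet<>()).add(appId);
        activeTimers.add(tokenId); // one timer per token, however many apps share it
    }

    void removeApplication(String appId, String tokenId) {
        Set<String> apps = appsByToken.get(tokenId);
        if (apps == null) {
            return; // unknown token, nothing to do
        }
        apps.remove(appId);
        if (apps.isEmpty()) {
            // Last user is gone: only now drop the token and its timer.
            appsByToken.remove(tokenId);
            activeTimers.remove(tokenId);
        }
    }

    boolean isTimerActive(String tokenId) {
        return activeTimers.contains(tokenId);
    }
}
```

In the scenario above, removing app1 leaves the timer for token1 active 
because app2 and app3 still reference it.  The timer goes away only after 
all three applications have been removed.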



