[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498004#comment-16498004
 ] 

Billie Rinaldi commented on YARN-7962:
--------------------------------------

Ah, I see what Wilfred is saying now. I don't think the locking is done 
correctly in patch 6. I think Wangda is right that both isServiceStarted = false 
and renewerService.shutdown() need to be performed while the lock is held in 
serviceStop, and Wilfred is right that the locking in serviceStart should not 
be changed. The ThreadPoolExecutor.shutdown documentation says that previously 
submitted tasks are still executed, but it still seems safer to perform the 
shutdown while holding the lock. I'll attach patch 7 with my proposed changes.
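
To make the intended locking concrete, here is a minimal standalone sketch, 
not the actual patch. RenewerSketch, process, stop and pendingCount are 
hypothetical names used only for illustration, and taking the write lock in 
stop is my assumption about how "while the lock is held" would be done; the 
field names are borrowed from DelegationTokenRenewer.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for DelegationTokenRenewer; only the locking pattern matters.
class RenewerSketch {
  private final ReentrantReadWriteLock serviceStateLock = new ReentrantReadWriteLock();
  private final ExecutorService renewerService = Executors.newFixedThreadPool(2);
  private final Queue<Runnable> pendingEventQueue = new ConcurrentLinkedQueue<>();
  private boolean isServiceStarted = true; // guarded by serviceStateLock

  // Same shape as processDelegationTokenRenewerEvent: check the flag and
  // submit under the read lock.
  void process(Runnable evt) {
    serviceStateLock.readLock().lock();
    try {
      if (isServiceStarted) {
        renewerService.execute(evt);
      } else {
        pendingEventQueue.add(evt);
      }
    } finally {
      serviceStateLock.readLock().unlock();
    }
  }

  // Assumption: clear the flag and shut the pool down inside one write-locked
  // section, so no reader can see isServiceStarted == true after shutdown().
  void stop() {
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = false;
      renewerService.shutdown();
    } finally {
      serviceStateLock.writeLock().unlock();
    }
  }

  int pendingCount() {
    return pendingEventQueue.size();
  }

  public static void main(String[] args) {
    RenewerSketch r = new RenewerSketch();
    r.process(() -> { });  // executed by the pool
    r.stop();
    r.process(() -> { });  // queued instead of being rejected
    System.out.println("pending after stop: " + r.pendingCount());
  }
}
```

Because the write lock excludes every reader, an event that arrives after 
stop() has begun either ran before the shutdown or falls through to the 
pending queue; it never reaches execute() on a pool that is already shut down.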

> Race Condition When Stopping DelegationTokenRenewer
> ---------------------------------------------------
>
>                 Key: YARN-7962
>                 URL: https://issues.apache.org/jira/browse/YARN-7962
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0
>            Reporter: BELUGA BEHR
>            Priority: Critical
>         Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>
>   private void processDelegationTokenRenewerEvent(
>       DelegationTokenRenewerEvent evt) {
>     serviceStateLock.readLock().lock();
>     try {
>       if (isServiceStarted) {
>         renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>       } else {
>         pendingEventQueue.add(evt);
>       }
>     } finally {
>       serviceStateLock.readLock().unlock();
>     }
>   }
>
>   @Override
>   protected void serviceStop() {
>     if (renewalTimer != null) {
>       renewalTimer.cancel();
>     }
>     appTokens.clear();
>     allTokens.clear();
>     // Within this excerpt the pool is shut down without taking
>     // serviceStateLock or clearing isServiceStarted.
>     this.renewerService.shutdown();
>     // (excerpt ends here; the rest of the method is not shown)
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>       at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>       at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>       at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>       at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>       at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>       at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>       at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>       at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>       at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>       at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.
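
For reference, the rejection in the stack trace can be reproduced in 
isolation. The sketch below is not taken from the Hadoop source; 
RejectAfterShutdown and its method are hypothetical names. It relies only on 
the fact that an executor using the default AbortPolicy throws 
RejectedExecutionException when execute() is called after shutdown().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class; it is not part of DelegationTokenRenewer.
class RejectAfterShutdown {

  // Returns true if a task submitted after shutdown() is rejected.
  static boolean submitAfterShutdownIsRejected() {
    // Executors.newFixedThreadPool builds a ThreadPoolExecutor that uses the
    // default AbortPolicy rejection handler.
    ExecutorService pool = Executors.newFixedThreadPool(1);
    pool.shutdown();
    try {
      pool.execute(() -> { });
      return false;
    } catch (RejectedExecutionException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("rejected: " + submitAfterShutdownIsRejected());
  }
}
```

This is the window the description refers to: if isServiceStarted is still 
true once the pool has been shut down, the dispatcher thread takes the 
execute() branch and hits the same exception.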



