[ https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854691#comment-17854691 ]

ASF GitHub Bot commented on YARN-11178:
---------------------------------------

pstrzelczak commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-2165145077

   As already mentioned, this issue makes one vCPU permanently occupied due to 
a busy loop, which makes adopting the 3.3 line difficult in production 
environments. Can it be fixed?




> Avoid CPU busy idling and resource wasting in 
> DelegationTokenRenewerPoolTracker thread
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-11178
>                 URL: https://issues.apache.org/jira/browse/YARN-11178
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, security
>    Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>         Environment: Hadoop 3.3.3 with Kerberos, Ranger 2.1.0, Hive 2.3.7 and 
> Spark 3.0.3
>            Reporter: Lennon Chin
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: YARN-11178.CPU idling busy 100% before optimized.png, 
> YARN-11178.CPU normal after optimized.png, YARN-11178.CPU profile for idling 
> busy 100% before optimized.html, YARN-11178.CPU profile for idling busy 100% 
> before optimized.png, YARN-11178.CPU profile for normal after optimized.html, 
> YARN-11178.CPU profile for normal after optimized.png
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The DelegationTokenRenewerPoolTracker thread wastes CPU by busy-looping over 
> an empty map when there is no delegation token renewer event task in the 
> futures map:
> {code:java}
> // org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.DelegationTokenRenewerPoolTracker#run
> @Override
> public void run() {
>   // this while (true) loop spins continuously when the `futures` map is empty
>   while (true) {
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
>             LOG.info(
>                 "Exhausted max retry attempts {} in token renewer "
>                     + "thread for {}",
>                 tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
>           }
>         }
>       } catch (Exception e) {
>         LOG.info("Problem in submitting renew tasks in token renewer "
>             + "thread.", e);
>       }
>     }
>   }
> }{code}
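> The waste is easy to reproduce outside of YARN. Below is a minimal, 
> self-contained sketch (illustration only, no YARN dependencies; the class and 
> field names are made up) of the same pattern: a thread iterating an empty map 
> inside while (true) pins one core at 100%:
> {code:java}
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.Future;
> 
> public class BusySpinRepro {
>   // empty most of the time, like the tracker's futures map on an idle cluster
>   private static final Map<String, Future<?>> futures = new ConcurrentHashMap<>();
> 
>   public static void main(String[] args) throws InterruptedException {
>     Thread tracker = new Thread(() -> {
>       while (true) {
>         // with an empty map the loop body never runs, so the thread
>         // re-checks the map as fast as it can: ~100% of one core
>         for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
>           // never reached while the map stays empty
>         }
>       }
>     }, "pool-tracker-repro");
>     tracker.setDaemon(true);
>     tracker.start();
>     Thread.sleep(60_000); // watch the process in top/htop meanwhile
>   }
> }
> {code}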
> A better way to avoid CPU idling is to wait for some time when the `futures` 
> map is empty; additionally, when a renewer task is done or cancelled, its 
> future should be removed from the `futures` map to avoid a memory leak:
> {code:java}
> @Override
> public void run() {
>   while (true) {
>     // wait for a while when the futures map is empty
>     if (futures.isEmpty()) {
>       synchronized (this) {
>         try {
>           // wait for tokenRenewerThreadTimeout milliseconds, clamped to
>           // the [500, 10000] ms range
>           long waitingTimeMs =
>               Math.min(10000, Math.max(500, tokenRenewerThreadTimeout));
>           LOG.info("Delegation token renewer pool is empty, waiting for "
>               + "{} ms.", waitingTimeMs);
>           wait(waitingTimeMs);
>         } catch (InterruptedException e) {
>           LOG.warn("Delegation token renewer pool tracker interrupted "
>               + "while waiting.");
>           Thread.currentThread().interrupt();
>         }
>       }
>       if (futures.isEmpty()) {
>         continue;
>       }
>     }
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
>             LOG.info(
>                 "Exhausted max retry attempts {} in token renewer "
>                     + "thread for {}",
>                 tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
>           }
>         }
>       } catch (Exception e) {
>         LOG.info("Problem in submitting renew tasks in token renewer "
>             + "thread.", e);
>       }
>       // remove done or cancelled tasks to avoid leaking map entries
>       if (future.isDone() || future.isCancelled()) {
>         try {
>           futures.remove(evt);
>           LOG.info("Removed done or cancelled renew tasks of {} in token "
>               + "renewer thread.", evt.getApplicationId());
>         } catch (Exception e) {
>           LOG.warn("Problem in removing done or cancelled renew tasks in "
>               + "token renewer thread.", e);
>         }
>       }
>     }
>   }
> }{code}
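> For comparison, the polling can be removed entirely by blocking on a queue 
> that submitters feed. The following is only an illustrative sketch, not the 
> proposed patch: the names (QueueBackedTracker, Tracked, track) are 
> hypothetical and the retry/re-scheduling logic is elided:
> {code:java}
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.Future;
> import java.util.concurrent.LinkedBlockingQueue;
> import java.util.concurrent.TimeUnit;
> import java.util.concurrent.TimeoutException;
> 
> public class QueueBackedTracker implements Runnable {
>   // pairs a submitted renewal with its Future so the tracker can await it
>   static final class Tracked {
>     final String appId;
>     final Future<?> future;
>     Tracked(String appId, Future<?> future) {
>       this.appId = appId;
>       this.future = future;
>     }
>   }
> 
>   private final BlockingQueue<Tracked> queue = new LinkedBlockingQueue<>();
>   private final long timeoutMs;
> 
>   QueueBackedTracker(long timeoutMs) {
>     this.timeoutMs = timeoutMs;
>   }
> 
>   // called by the submitter right after handing the task to the executor
>   void track(String appId, Future<?> future) {
>     queue.add(new Tracked(appId, future));
>   }
> 
>   @Override
>   public void run() {
>     while (!Thread.currentThread().isInterrupted()) {
>       final Tracked t;
>       try {
>         t = queue.take(); // sleeps at zero CPU cost until a task arrives
>       } catch (InterruptedException e) {
>         Thread.currentThread().interrupt(); // shut down cleanly
>         break;
>       }
>       try {
>         t.future.get(timeoutMs, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         t.future.cancel(true); // the retry scheduling would go here
>       } catch (Exception e) {
>         // renewal failed or was interrupted: log and move on
>       }
>     }
>   }
> }
> {code}
> With this shape there is no futures map to leak and nothing to poll: the 
> submitter calls track(appId, future) once per task, and the tracker thread 
> blocks in queue.take() instead of spinning.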
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
