[ 
https://issues.apache.org/jira/browse/YARN-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760897#comment-17760897
 ] 

ASF GitHub Bot commented on YARN-8980:
--------------------------------------

slfan1989 commented on code in PR #5975:
URL: https://github.com/apache/hadoop/pull/5975#discussion_r1311542321


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java:
##########
@@ -260,6 +261,16 @@ public class FederationInterceptor extends AbstractRequestInterceptor {
 
   private final MonotonicClock clock = new MonotonicClock();
 
+  /*
+   * For UAM, keepContainersAcrossApplicationAttempts is always true.
+   * When the UAM re-registers with the RM, the RM clears the node set and
+   * regenerates the NMTokens for the transferred containers. But if
+   * keepContainersAcrossApplicationAttempts of the AM is false, the AM may not
+   * call getNMTokensFromPreviousAttempts, so the NMTokens passed in
+   * RegisterApplicationMasterResponse would be lost. Here we cache these
+   * NMTokens and pass them to the AM in the allocate stage.
+   */
+  private Set<NMToken> nmTokenMapFromRegisterSecondaryCluster;

Review Comment:
   Using Set<NMToken> is feasible, but should we consider using 
Map<SubClusterId, Set<NMToken>> for better differentiation of NMToken lists for 
each subcluster?
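
A minimal sketch of what the map-based alternative could look like, under stated assumptions: the class `PerSubClusterTokenCache` and its `put`/`drain` methods are hypothetical and not part of the PR, and the type parameters `S` and `T` stand in for `SubClusterId` and `NMToken` so that the snippet compiles without Hadoop on the classpath.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not from the PR): caches tokens per subcluster.
 * S stands in for SubClusterId, T for NMToken.
 */
class PerSubClusterTokenCache<S, T> {

  // One token set per subcluster instead of a single shared set.
  private final Map<S, Set<T>> tokensBySubCluster = new ConcurrentHashMap<>();

  /** Remember the tokens returned when registering with one subcluster. */
  void put(S subClusterId, Set<T> tokens) {
    tokensBySubCluster
        .computeIfAbsent(subClusterId, k -> ConcurrentHashMap.newKeySet())
        .addAll(tokens);
  }

  /** Remove and return the cached tokens of one subcluster (empty if none). */
  Set<T> drain(S subClusterId) {
    Set<T> tokens = tokensBySubCluster.remove(subClusterId);
    return tokens == null ? Collections.<T>emptySet() : new HashSet<>(tokens);
  }
}
```

With this shape, the tokens of one subcluster can be handed over or discarded without touching those cached for the other subclusters.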





> Mapreduce application container start  fail after AM restart.
> -------------------------------------------------------------
>
>                 Key: YARN-8980
>                 URL: https://issues.apache.org/jira/browse/YARN-8980
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bibin Chundatt
>            Assignee: zhengchenyu
>            Priority: Major
>              Labels: pull-request-available
>
> UAMs to subclusters are always launched with keepContainers.
> In AM restart scenarios, the UAM registers again with the RM and receives the
> running containers along with their NMTokens. The NMTokens received by the UAM in
> getPreviousAttemptContainersNMToken are never used by the MapReduce application.
> Federation Interceptor should take care of such scenarios too: merge the NMTokens
> received at registration into the allocate response.
> A container allocation response for the same node will have an empty NMToken.
> issue credits : [~Nallasivan]
>  
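
The merge step described in the issue can be sketched roughly as follows. This is an illustration only: `NmTokenMerge.merge` is a hypothetical helper, tokens are modelled as plain strings keyed by node id rather than as Hadoop `NMToken` objects, and letting the token from the allocate response win on a conflict is an assumption, not something the issue states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not from the PR or the Hadoop code base). */
class NmTokenMerge {

  /**
   * Merge the tokens cached at registration into those of an allocate
   * response. Both maps are keyed by node id. Assumption: when both maps
   * hold a token for the same node, the one from the allocate response wins.
   */
  static Map<String, String> merge(Map<String, String> cachedFromRegister,
                                   Map<String, String> fromAllocate) {
    Map<String, String> merged = new LinkedHashMap<>(cachedFromRegister);
    merged.putAll(fromAllocate);
    return merged;
  }
}
```

A node that only appears in the registration tokens keeps its cached token, which covers the case where the allocate response for that node carries no NMToken.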


