[ 
https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115167#comment-16115167
 ] 

Subru Krishnan commented on YARN-6955:
--------------------------------------

Thanks [~botong] for surfacing this issue. The patch looks mostly good (pending 
Yetus warnings fix), except that we should save the registration request only 
if _this.amRegistrationRequest == null_.
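
To illustrate the guard, here is a minimal sketch and not the actual patch. The class RegistrationHolder and the methods saveIfAbsent and saved are hypothetical names, and the request is modelled as a String instead of the real registration request type.

```java
// Hypothetical stand-in for the interceptor state; not the actual patch.
class RegistrationHolder {
    // Modelled as a String here; the real field holds the AM's register request.
    private String amRegistrationRequest;

    // Saves the request only when none has been saved yet.
    // Returns true if this call stored the request.
    synchronized boolean saveIfAbsent(String request) {
        if (this.amRegistrationRequest == null) {
            this.amRegistrationRequest = request;
            return true;
        }
        return false;
    }

    // Returns the saved request, or null if nothing has been saved.
    synchronized String saved() {
        return amRegistrationRequest;
    }
}
```

With this shape, a resent register call from the AM leaves the originally saved request untouched.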

> Concurrent registerAM thread in Federation Interceptor
> ------------------------------------------------------
>
>                 Key: YARN-6955
>                 URL: https://issues.apache.org/jira/browse/YARN-6955
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Minor
>         Attachments: YARN-6955.v1.patch
>
>
>  The timeout between the AM and AMRMProxy is shorter than the timeout plus failover 
> between FederationInterceptor (AMRMProxy) and the RM. When the first register 
> thread in FI is blocked because of an RM failover, the AM can time out and resend 
> the register call, leading to two outstanding register calls inside FI. 
> Eventually, when the RM comes back up, one thread registers successfully and the 
> other gets an "application already registered" exception. FI should swallow the 
> exception and return success to the AM in both threads. 
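
The behaviour described above could be sketched roughly as follows. This is an assumption-laden illustration, not the FederationInterceptor code: AlreadyRegisteredException, StubResourceManager and SketchInterceptor are all hypothetical names, and the register response is modelled as a plain String.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical exception standing in for the RM's "already registered" error.
class AlreadyRegisteredException extends Exception {
    AlreadyRegisteredException(String msg) { super(msg); }
}

// Hypothetical RM stub: the first register call wins, later ones are rejected.
class StubResourceManager {
    private final AtomicBoolean registered = new AtomicBoolean(false);

    String register(String appId) throws AlreadyRegisteredException {
        if (!registered.compareAndSet(false, true)) {
            throw new AlreadyRegisteredException(appId + " already registered");
        }
        return "registered:" + appId;
    }
}

// Hypothetical interceptor: returns success to the AM from every register thread.
class SketchInterceptor {
    private final StubResourceManager rm;

    SketchInterceptor(StubResourceManager rm) { this.rm = rm; }

    String registerApplicationMaster(String appId) {
        try {
            return rm.register(appId);
        } catch (AlreadyRegisteredException e) {
            // Another register thread already succeeded; report success too.
            return "registered:" + appId;
        }
    }
}
```

Whichever thread reaches the RM second receives the same success response as the first, so the AM never sees the duplicate-registration error.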



