[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658604#comment-13658604
 ] 

Omkar Vinit Joshi commented on YARN-613:
----------------------------------------

This was discussed offline with [~vinodkv], [~bikassaha] and [~sseth].

There were 2 viable solutions to the problem of sending the AMNMToken to AM 
for authenticating with NM.

The following problems need to be addressed:
* The token will be generated by RM, but how long should the AMNMToken be kept 
alive? How long should AM be able to talk to an NM on which it launched any 
container during the application life cycle?
* If the token doesn't have an expiry time, then who will renew the token: NM 
or RM?
* If NM reboots, can the old AMNMToken be reused? (Right now, when NM goes 
down its containers are also lost, so ideally there is nothing specific to that 
application left on the NM after reboot.)
* AM might hand over the AMNMToken to some other external service (other than 
AM, maybe another container) which should be able to communicate with NM. 
(Problem: if renewal is implemented, how will it take place for that service?)
* We need to support long-running services.
* When the key rolls over, there should be no spike in communicating renewed 
tokens, if renewal is implemented.

Proposed solutions:
* No AMNMToken renewal
** Here RM generates the token and hands it to AM only when the AM gets a 
container on the underlying NM for the first time; otherwise it does not send 
one. AM can use this token to talk to NM as long as the application is alive. 
So this is bounded above by the number of applications in the cluster, which is 
<= number of nodes * number of containers per node.
*** RM will have to remember tokens given to AM per NM
*** NM will have to remember tokens per AM
*** AM will have to remember the token per NM anyway
**** Problems: if NM reboots, the token is no longer valid, in which case 
RM should reissue a new token to AM for the restarted NM
**** Advantages:
***** RM doesn't have to generate and send a token for every container.
***** No need to renew the token. No added overhead. No need to remember past 
keys (other than the current and previous master key).
***** Even if AM hands the token over to some other service, that service can 
keep using the same token.
* AMNMToken renewal
** Here RM generates and issues the token to AM during start container. RM 
also remembers which tokens each AM holds. So when the key rolls over, RM 
redistributes renewed tokens to AM for all NMs on which it ever started a 
container. When AM receives an updated token, it has to replace the older 
token with the new one.
*** RM will have to remember all the NMs for which the AM was handed a token
*** NM doesn't have to remember tokens per application. It only has to remember 
the current and previous key.
*** AM will receive an AMNMToken per container request, or all tokens during 
key renewal. It will have to refresh its internal tokens with them
**** Advantages:
***** NM doesn't need to remember the token, so there will be no problem across 
NM reboot. (Even though the token will remain valid across NM reboot, there 
will be nothing on NM for AM until a new container starts.)
**** Problems:
***** RM has to either remember or regenerate and send tokens to AM for each 
container start call. This can be avoided by sending them only when the key 
rolls over.
***** AM has to refresh tokens that may have been given to another service for 
monitoring container progress.
***** There will be a spike at key rollover.
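
To make the trade-off concrete, here is a rough sketch of the RM-side 
bookkeeping each option implies. It is only an illustration and not YARN code: 
the class names (NoRenewalRM, RenewalRM), the tuple used as a stand-in for a 
token, and the per-NM "incarnation" counter used to model an NM reboot are all 
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NoRenewalRM:
    """Option 1 (hypothetical model): token sent only for the first
    container an application gets on a given NM."""
    issued: dict = field(default_factory=dict)          # (app, nm) -> incarnation
    nm_incarnation: dict = field(default_factory=dict)  # nm -> restart counter

    def nm_registered(self, nm_id):
        # A (re)registering NM starts a new incarnation; older tokens
        # for that NM are treated as invalid.
        self.nm_incarnation[nm_id] = self.nm_incarnation.get(nm_id, 0) + 1

    def allocate(self, app_id, nm_id):
        """Return a token only if the AM does not hold a valid one."""
        inc = self.nm_incarnation[nm_id]
        if self.issued.get((app_id, nm_id)) == inc:
            return None  # AM already has a token for this NM
        self.issued[(app_id, nm_id)] = inc
        return ("AMNMToken", app_id, nm_id, inc)


@dataclass
class RenewalRM:
    """Option 2 (hypothetical model): token sent with every container,
    and all tokens re-sent when the master key rolls over."""
    key_id: int = 1
    nms_by_app: dict = field(default_factory=dict)  # app -> set of NMs

    def allocate(self, app_id, nm_id):
        self.nms_by_app.setdefault(app_id, set()).add(nm_id)
        return ("AMNMToken", app_id, nm_id, self.key_id)

    def roll_key(self):
        # Every (app, NM) pair ever used gets a fresh token at once;
        # this is the rollover spike listed under Problems.
        self.key_id += 1
        return {
            app: [("AMNMToken", app, nm, self.key_id) for nm in sorted(nms)]
            for app, nms in self.nms_by_app.items()
        }


rm1 = NoRenewalRM()
rm1.nm_registered("nm1")
first = rm1.allocate("app1", "nm1")    # token issued
second = rm1.allocate("app1", "nm1")   # nothing sent the second time
rm1.nm_registered("nm1")               # NM reboot
third = rm1.allocate("app1", "nm1")    # token reissued for restarted NM

rm2 = RenewalRM()
rm2.allocate("app1", "nm1")
rm2.allocate("app1", "nm2")
rm2.allocate("app2", "nm1")
renewed = rm2.roll_key()
spike = sum(len(tokens) for tokens in renewed.values())
```

In this model the first option sends a token only on first contact or after an 
NM restart, while the second sends one token per (application, NM) pair at 
every key rollover, which is where the spike comes from.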

                
> Create NM proxy per NM instead of per container
> -----------------------------------------------
>
>                 Key: YARN-613
>                 URL: https://issues.apache.org/jira/browse/YARN-613
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Omkar Vinit Joshi
>
> Currently a new NM proxy has to be created per container since the secure 
> authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
