I believe a cleaner way to solve this problem is to create two _separate_ 
UserGroupInformation objects and wrap each AM instance in a UGI doAs so they 
aren't trying to share the same credentials.  This is one example of a token 
bleeding over and causing problems. I suspect trying to fix these one by one as 
they pop up is going to be frustrating compared to just ensuring the 
credentials remain separate, as if the AMs really were running in separate JVMs.
Adding Daryn, who knows a lot more about the UGI stuff, so he can correct any 
misunderstandings on my part.
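
To make the idea concrete, here is a minimal, self-contained sketch. It is a 
toy model, not the Hadoop API: Identity and ToyResourceManager are hypothetical 
stand-ins for UserGroupInformation and the RM, and the "first token wins" 
selection rule is an assumption chosen only to reproduce the symptom. In real 
code the corresponding calls would be UserGroupInformation.createRemoteUser(), 
addToken() and doAs().

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Toy model (NOT the Hadoop API) of keeping per-AM credentials separate. */
class CredentialIsolationSketch {

    /** Hypothetical stand-in for UserGroupInformation: a named bag of tokens. */
    static final class Identity {
        private static final ThreadLocal<Identity> CURRENT = new ThreadLocal<>();

        final String name;
        // Insertion-ordered: token alias -> application attempt it identifies.
        final Map<String, String> tokens = new LinkedHashMap<>();

        Identity(String name) { this.name = name; }

        /** Identity of the innermost doAs on the calling thread. */
        static Identity current() {
            Identity id = CURRENT.get();
            if (id == null) throw new IllegalStateException("not inside doAs");
            return id;
        }

        /** Runs the action with this identity current, then restores the old one. */
        <T> T doAs(Supplier<T> action) {
            Identity previous = CURRENT.get();
            CURRENT.set(this);
            try {
                return action.get();
            } finally {
                if (previous == null) CURRENT.remove(); else CURRENT.set(previous);
            }
        }
    }

    /** Hypothetical RM. Assumption: the caller is identified by its first token. */
    static final class ToyResourceManager {
        private final Set<String> registered = new HashSet<>();

        String register() {
            Map<String, String> tokens = Identity.current().tokens;
            if (tokens.isEmpty()) return "rejected: no token";
            String attempt = tokens.values().iterator().next();
            return registered.add(attempt)
                    ? "registered " + attempt
                    : "rejected: " + attempt + " is already registered";
        }
    }

    /** Both AMs run under one identity, so the second presents the first's token. */
    static String[] registerShared() {
        ToyResourceManager rm = new ToyResourceManager();
        Identity shared = new Identity("shared");
        shared.tokens.put("managed-am", "attempt-1");
        shared.tokens.put("unmanaged-am", "attempt-2");
        return new String[] { shared.doAs(rm::register), shared.doAs(rm::register) };
    }

    /** Each AM runs inside its own doAs and only ever sees its own token. */
    static String[] registerIsolated() {
        ToyResourceManager rm = new ToyResourceManager();
        Identity managed = new Identity("managed");
        managed.tokens.put("managed-am", "attempt-1");
        Identity unmanaged = new Identity("unmanaged");
        unmanaged.tokens.put("unmanaged-am", "attempt-2");
        return new String[] { managed.doAs(rm::register), unmanaged.doAs(rm::register) };
    }

    public static void main(String[] args) {
        for (String s : registerShared()) System.out.println("shared:   " + s);
        for (String s : registerIsolated()) System.out.println("isolated: " + s);
    }
}
```

In the shared case the second registration is rejected because it presents the 
first AM's token; in the isolated case both succeed, which is the behaviour I 
would expect from two AMs running in separate JVMs.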
Jason
 

On Wednesday, March 15, 2017 1:11 AM, Sergiy Matusevych <[email protected]> wrote:
 

 Hi YARN developers,

I have an interesting problem that I think is related to the YARN Java
client. I am trying to launch *two* application masters in one container. To
be more specific, I am starting a Spark job on YARN and launching an Apache
REEF Unmanaged AM from the Spark Driver.

Technically, the YARN Resource Manager should not care which process each AM
runs in. However, there is a problem with the YARN Java client
implementation: there is a global UserGroupInformation object that holds
the user credentials of the current RM session. This data structure is
shared by all AMs, and when the REEF application tries to register the second
(unmanaged) AM, the client library presents all credentials to the YARN RM,
including the security token of the first (managed) AM. YARN rejects this
registration request, throwing InvalidApplicationMasterRequestException:
"Application Master is already registered".

I feel like this issue can be resolved by a relatively small update to the
YARN Java client, e.g. by introducing a new variant of
AMRMClientAsync.registerApplicationMaster() that would take the required
security token (instead of getting it implicitly from
UserGroupInformation.getCurrentUser().getCredentials() etc.), or by having
some sort of RM session class that would wrap all data that is currently
global. I need to think about an elegant API for it.

What do you guys think? I would love to work on this problem and send you a
pull request for the upcoming 2.9 release.

Cheers,
Sergiy.


   
