[ 
https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1757:
-----------------------------

    Attachment: YARN-1757-v2.patch

Thanks for the review, Karthik!

bq. Nit: YarnConfiguration: We might want to add a NM_RECOVERY_PREFIX for all 
recovery related configs?

Done.

bq. The default recovery-dir should probably be something more specific to 
nm-recovery - /tmp/yarn-nm-recovery?

Yes, I did this in yarn-default.xml but forgot to update the default in 
YarnConfiguration.  I ended up removing the default from YarnConfiguration and 
relying solely on yarn-default.xml, much like the fs uri for the RM's file 
system state store.
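
As a rough illustration of the two points above (a shared prefix for the recovery keys, and a default that lives only in the defaults resource), here is a minimal self-contained sketch. It uses java.util.Properties as a stand-in for the Hadoop Configuration class. The class name, the key strings, and the placeholder default path are assumptions for illustration and are not taken from the patch.

```java
import java.util.Properties;

/** Hypothetical sketch; not the actual YarnConfiguration code from the patch. */
final class RecoveryConfigKeys {
    static final String NM_PREFIX = "yarn.nodemanager.";
    // One prefix shared by every recovery-related key.
    static final String NM_RECOVERY_PREFIX = NM_PREFIX + "recovery.";
    static final String NM_RECOVERY_ENABLED = NM_RECOVERY_PREFIX + "enabled";
    static final String NM_RECOVERY_DIR = NM_RECOVERY_PREFIX + "dir";
    // Deliberately no DEFAULT_NM_RECOVERY_DIR constant here: the default is
    // supplied only by the defaults resource, so it cannot drift out of sync.

    /** Layers site overrides on top of values standing in for yarn-default.xml. */
    static Properties withDefaults(Properties siteOverrides) {
        Properties defaults = new Properties();
        // Placeholder value based on the reviewer's suggestion (assumption).
        defaults.setProperty(NM_RECOVERY_DIR, "/tmp/yarn-nm-recovery");
        Properties conf = new Properties(defaults);
        conf.putAll(siteOverrides);
        return conf;
    }
}
```

With this arrangement a lookup of NM_RECOVERY_DIR falls back to the defaults resource unless the site configuration overrides it.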

bq. Nit: Should we add an NMUtils class for static helper methods like 
isRecoveryEnabled()?

It's just a boolean config, very straightforward to access -- not sure an 
NMUtils method adds a lot of value here.
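
To make the point concrete, the sketch below reads the flag inline in a single expression. A plain Map stands in for the Hadoop Configuration object, and the class name, key string, and false default are assumptions for illustration only.

```java
import java.util.Map;

/** Hypothetical sketch of a component that reads the recovery flag directly. */
final class AuxServiceInitSketch {
    // Assumed key name, for illustration.
    static final String NM_RECOVERY_ENABLED = "yarn.nodemanager.recovery.enabled";

    boolean recoveryEnabled;

    void init(Map<String, String> conf) {
        // A single boolean lookup with a default; no helper class involved.
        recoveryEnabled =
            Boolean.parseBoolean(conf.getOrDefault(NM_RECOVERY_ENABLED, "false"));
    }
}
```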

bq. Nit: Rename variables to stateStore instead of stateStorage - that would go 
with the conventions used in RM better, and is shorter

Done.

bq. Nit: AuxServices#createStorageDir: Maybe add a comment to say control flow 
through FileNotFound is cheaper than explicitly checking if the file exists?
bq. NameNode should also use something similar, instead of directly creating 
the directory? Maybe another candidate to move to NMUtils?

Actually I ended up just using the straightforward mkdirs approach.  It already 
checks for existence and updates the permissions of the directory.  Since this 
is the local filesystem we're talking about, there shouldn't be any significant 
performance implications here and no need for a separate method to create 
directories.
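
The "create if absent, then fix up permissions" pattern can be sketched with java.nio as follows. This is not the code from the patch, which goes through Hadoop's own file system API. The class name, method name, and owner-only permission string are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical sketch of idempotent directory creation on a local file system. */
final class RecoveryDirs {
    static Path ensureDir(Path dir) throws IOException {
        // createDirectories succeeds without error when the directory already
        // exists, so no explicit existence check or exception-driven control
        // flow is needed.
        Files.createDirectories(dir);
        // Reset permissions whether the directory is new or pre-existing.
        // Skipped on file systems without POSIX attribute support.
        if (Files.getFileStore(dir).supportsFileAttributeView("posix")) {
            Set<PosixFilePermission> perms =
                PosixFilePermissions.fromString("rwx------"); // assumed mode
            Files.setPosixFilePermissions(dir, perms);
        }
        return dir;
    }
}
```

Calling ensureDir twice on the same path is harmless, which is the property that makes a separate helper for existence checks unnecessary.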

bq. TestAuxServices: We should check if we have two directories created too?

It's already checking in each service init whether a directory specific to that 
service is being created, but I went ahead and added a check for two aux 
service recovery directories.
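
A check of that kind might look roughly like the sketch below: create one directory per service under a recovery root, then list what actually exists. The class, method, and service names are hypothetical and are not the ones used in TestAuxServices.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of verifying per-service recovery directories. */
final class AuxRecoveryDirCheck {
    /** Creates one subdirectory per service, then returns the sorted names found. */
    static List<String> createServiceDirs(Path recoveryRoot, List<String> services)
            throws IOException {
        for (String name : services) {
            Files.createDirectories(recoveryRoot.resolve(name));
        }
        // Files.list holds an open directory handle, so close it when done.
        try (Stream<Path> children = Files.list(recoveryRoot)) {
            return children
                .filter(Files::isDirectory)
                .map(p -> p.getFileName().toString())
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

A test would then assert that the returned list has exactly two entries, one per configured service.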


> Auxiliary service support for nodemanager recovery
> --------------------------------------------------
>
>                 Key: YARN-1757
>                 URL: https://issues.apache.org/jira/browse/YARN-1757
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: YARN-1757-v2.patch, YARN-1757.patch, YARN-1757.patch
>
>
> There needs to be a mechanism for communicating to auxiliary services whether 
> nodemanager recovery is enabled and where they should store their state.


