[ 
https://issues.apache.org/jira/browse/YARN-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773019#comment-13773019
 ] 

Jason Lowe commented on YARN-819:
---------------------------------

Just to be clear, is it required to make these changes?  I was under the 
impression these configs were a safety net in case there was a truly 
incompatible change between YARN versions.  They allow the participants to 
reject connections when they detect an incompatible pairing.  If we do have 
truly incompatible versions for some reason, then rolling upgrades across 
those versions are out of the question: upgrading the resourcemanager to the 
new version first instantly makes it unable to work properly with all the 
nodes running the older version, and upgrading a nodemanager first instantly 
makes it unable to work properly with the older resourcemanager.

I would expect a normal rolling upgrade case to not touch these configs.  They 
would be set to the earliest version that can properly participate in the 
cluster and typically not change unless a new version became incompatible with 
one or more older versions that used to work.  Having separate RM vs NM configs 
allows ops to use them as an enforcement tool if they desire a homogeneous YARN 
version across the cluster, but that requires reconfigs of the RM and nodes 
before the rolling upgrade can start and reconfigs after the rolling upgrade is 
complete.
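
As a rough illustration of the safety-net behavior described above, here is a 
minimal sketch of a "minimum allowed version" check.  The class and method 
names are hypothetical (they are not taken from the attached patches or from 
YARN itself), and treating a version as dotted numbers with an ignored 
"-suffix" is an assumption made only for this example.

```java
// Hypothetical sketch: not the YARN-819 patch, just the idea of a minimum-version gate.
class MinimumVersionCheck {

    // Parse "2.0.4-alpha" into {2, 0, 4}. Ignoring the "-alpha" suffix is an assumption.
    static int[] parse(String version) {
        String numeric = version.split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Compare component by component; missing components count as 0, so "2.1" equals "2.1.0".
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // A participant accepts a peer only if the peer is at or above the configured minimum.
    static boolean accepts(String minimumVersion, String peerVersion) {
        return compare(peerVersion, minimumVersion) >= 0;
    }

    public static void main(String[] args) {
        String minimumForNodes = "2.0.4"; // illustrative value only
        for (String nm : new String[] {"2.0.3", "2.0.4-alpha", "2.1.0"}) {
            System.out.println(nm + " accepted: " + accepts(minimumForNodes, nm));
        }
    }
}
```

With the minimum left at the earliest compatible version, a rolling upgrade 
never trips the check, because every old and new node passes it.  Raising the 
minimum on either side is what turns it into an enforcement tool.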
                
> ResourceManager and NodeManager should check for a minimum allowed version
> --------------------------------------------------------------------------
>
>                 Key: YARN-819
>                 URL: https://issues.apache.org/jira/browse/YARN-819
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>    Affects Versions: 2.0.4-alpha
>            Reporter: Robert Parker
>            Assignee: Robert Parker
>         Attachments: YARN-819-1.patch, YARN-819-2.patch
>
>
> Our use case: during an upgrade on a large cluster, several NodeManagers may 
> not restart with the new version.  Once the RM comes back up, those 
> NodeManagers will re-register with the RM without issue.
> The NM should report its version to the RM.  The RM should have a 
> configuration that can disable the check (default), require a version equal 
> to the RM's (to avoid a config change for each release), require a version 
> equal to or greater than the RM's (to allow NM upgrades), or finally require 
> an explicit version or version range.
> The RM should also have a configuration for how to treat a mismatch: 
> REJECT or REBOOT the NM.
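
The policy in the description above could be sketched roughly as follows.  
All names here are hypothetical and do not come from the attached patches.  
The sketch also assumes that an "explicit version" acts as a minimum, and it 
omits version ranges entirely.

```java
// Hypothetical sketch of the RM-side policy from the issue description.
class NodeVersionPolicy {
    enum Mode { DISABLED, EQUAL_TO_RM, EQUAL_OR_GREATER_THAN_RM, EXPLICIT_MINIMUM }
    enum MismatchAction { REJECT, REBOOT }
    enum Decision { ACCEPT, REJECT, REBOOT }

    private final Mode mode;
    private final MismatchAction action;
    private final String explicitMinimum; // used only when mode is EXPLICIT_MINIMUM

    NodeVersionPolicy(Mode mode, MismatchAction action, String explicitMinimum) {
        this.mode = mode;
        this.action = action;
        this.explicitMinimum = explicitMinimum;
    }

    // Compare dotted versions numerically; a "-suffix" such as "-alpha" is ignored (assumption).
    static int compare(String a, String b) {
        String[] x = a.split("-", 2)[0].split("\\.");
        String[] y = b.split("-", 2)[0].split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Decide what the RM does with an NM that reports nmVersion at registration.
    Decision check(String rmVersion, String nmVersion) {
        boolean ok;
        switch (mode) {
            case DISABLED:
                ok = true; // default: no check at all
                break;
            case EQUAL_TO_RM:
                ok = nmVersion.equals(rmVersion); // exact string match
                break;
            case EQUAL_OR_GREATER_THAN_RM:
                ok = compare(nmVersion, rmVersion) >= 0;
                break;
            case EXPLICIT_MINIMUM:
                ok = compare(nmVersion, explicitMinimum) >= 0;
                break;
            default:
                throw new IllegalStateException("unknown mode: " + mode);
        }
        if (ok) {
            return Decision.ACCEPT;
        }
        return action == MismatchAction.REJECT ? Decision.REJECT : Decision.REBOOT;
    }

    public static void main(String[] args) {
        NodeVersionPolicy policy = new NodeVersionPolicy(
            Mode.EQUAL_OR_GREATER_THAN_RM, MismatchAction.REBOOT, null);
        System.out.println(policy.check("2.1.0", "2.0.4-alpha"));
        System.out.println(policy.check("2.1.0", "2.1.1"));
    }
}
```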
