[ 
https://issues.apache.org/jira/browse/YARN-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Anand updated YARN-3947:
-----------------------------
    Description: 
When running {{yarn decommissioning}}, {{yarn rmadmin -refreshNodes}} does not 
accept short host names for the nodes to be decommissioned in the 
{{yarn.exclude}} file. It requires the {{FQDN}} of each host in order to 
decommission a node successfully. The decommissioning behavior in {{HDFS}} is 
different: it accepts short host names. 

Below are the details of what I am seeing:

My {{yarn.exclude}} contains the short host name:
bcpc-vm1

Running:
{code}
sudo -u yarn yarn rmadmin -refreshNodes
{code}

shows the following entries in the log file:
{code}
2015-07-21 11:14:18,795 INFO org.apache.hadoop.conf.Configuration: found 
resource yarn-site.xml at file:/etc/hadoop/conf/yarn-site.xml
2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the includes file to 
2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the excludes file to /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Refreshing 
hosts (include/exclude) list
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding 
bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding 
bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
IP=10.0.100.12 OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
{code}

The refresh reports success, but the node is not decommissioned. 

When I use the {{FQDN}} for the host name instead, decommissioning works 
and I see the following in the RM logs:

{code}
2015-07-21 11:14:43,453 INFO org.apache.hadoop.conf.Configuration: found 
resource yarn-site.xml at file:/etc/hadoop/conf.LAB-A/yarn-site.xml
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the includes file to 
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the excludes file to /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Refreshing 
hosts (include/exclude) list
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Adding 
bcpc-vm1.example.com to the list of excluded hosts from 
/etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:43,456 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
IP=10.100.0.11 OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
2015-07-21 11:14:44,198 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
Disallowed NodeManager nodeId: bcpc-vm1.example.com:35197 hostname: 
bcpc-vm1.example.com:35197
2015-07-21 11:14:44,198 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating 
Node bcpc-vm1.example.com:35197 as it is now DECOMMISSIONED
2015-07-21 11:14:44,199 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
bcpc-vm1.example.com:35197 Node Transitioned from RUNNING to DECOMMISSIONED
2015-07-21 11:14:44,199 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
Removed node bcpc-vm1.example.com:35197 cluster capacity: <memory:618723, 
vCores:96>
{code}
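
For illustration, the requested behavior could look roughly like the sketch below. This is only a minimal sketch and not Hadoop's actual code: the class {{ShortHostMatcher}} and its methods are hypothetical names, and it assumes that the NodeManager registers under its FQDN (as in the logs above) and that an exclude entry without a dot is meant as a short host name. IP addresses are not handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: shows one way an exclude entry
// could match a node that registered under its FQDN.
final class ShortHostMatcher {
    private ShortHostMatcher() {}

    // Returns the first DNS label in lower case,
    // e.g. "bcpc-vm1" for "bcpc-vm1.example.com".
    static String shortName(String host) {
        String h = host.trim().toLowerCase(Locale.ROOT);
        int dot = h.indexOf('.');
        return dot < 0 ? h : h.substring(0, dot);
    }

    // True if the node's host name equals an exclude entry, or if an exclude
    // entry is a short name equal to the node's first DNS label.
    static boolean isExcluded(Set<String> excludes, String nodeHost) {
        String node = nodeHost.trim().toLowerCase(Locale.ROOT);
        String nodeShort = shortName(node);
        for (String entry : excludes) {
            String e = entry.trim().toLowerCase(Locale.ROOT);
            if (e.equals(node)) {
                return true;
            }
            // Fall back to short-name matching only when the entry has no domain part.
            if (e.indexOf('.') < 0 && e.equals(nodeShort)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> excludes = new HashSet<>(Arrays.asList("bcpc-vm1"));
        System.out.println(isExcluded(excludes, "bcpc-vm1.example.com")); // true
        System.out.println(isExcluded(excludes, "bcpc-vm2.example.com")); // false
    }
}
```

Entries that already contain a domain part are still compared exactly, so an FQDN entry such as {{bcpc-vm1.example.com}} would keep working as it does today.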



  was:
When running {yarn decommissioning} the {{yarn rmadmin -refreshNodes}} doesn't 
like short host names for the nodes to be decommissioned in {yarn.exclude} 
file. It requires {{FQDN}} for the host name to be present to be able to 
successfully decommission a node. The decommissioning behavior in {{HDFS}} is 
different as it can take short host names. 

Below are the details of what I am seeing:

My {{yarn.exlcude}} has short name for the host name:
bcpc-vm1

Running:
{code}
sudo -u yarn yarn rmadmin -refreshNodes
{code}

shows following entries in the log file:
{code}
2015-07-21 11:14:18,795 INFO org.apache.hadoop.conf.Configuration: found 
resource yarn-site.xml at file:/etc/hadoop/conf/yarn-site.xml
2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the includes file to 
2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the excludes file to /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Refreshing 
hosts (include/exclude) list
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding 
bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding 
bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
IP=10.0.100.12 OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
{code}

And the node is not decommissioned. 

When I add the {{FQDN}} for the host name the decommissioning works 
successfully and I see following in the RM logs:

{code}
2015-07-21 11:14:43,453 INFO org.apache.hadoop.conf.Configuration: found 
resource yarn-site.xml at file:/etc/hadoop/conf.LAB-A/yarn-site.xml
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the includes file to 
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting 
the excludes file to /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Refreshing 
hosts (include/exclude) list
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Adding 
bcpc-vm1.example.com to the list of excluded hosts from 
/etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:43,456 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
IP=10.100.0.11 OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
2015-07-21 11:14:44,198 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
Disallowed NodeManager nodeId: bcpc-vm1.example.com:35197 hostname: 
bcpc-vm1.example.com:35197
2015-07-21 11:14:44,198 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating 
Node bcpc-vm1.example.com:35197 as it is now DECOMMISSIONED
2015-07-21 11:14:44,199 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
bcpc-vm1.example.com:35197 Node Transitioned from RUNNING to DECOMMISSIONED
2015-07-21 11:14:44,199 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
Removed node bcpc-vm1.example.com:35197 cluster capacity: <memory:618723, 
vCores:96>
{code}




> Add support for short host names in yarn decommissioning process
> ----------------------------------------------------------------
>
>                 Key: YARN-3947
>                 URL: https://issues.apache.org/jira/browse/YARN-3947
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Amit Anand
>            Priority: Minor
>
> When running {{yarn decommissioning}}, {{yarn rmadmin -refreshNodes}} does 
> not accept short host names for the nodes to be decommissioned in the 
> {{yarn.exclude}} file. It requires the {{FQDN}} of each host in order to 
> decommission a node successfully. The decommissioning behavior in {{HDFS}} 
> is different: it accepts short host names. 
> Below are the details of what I am seeing:
> My {{yarn.exclude}} contains the short host name:
> bcpc-vm1
> Running:
> {code}
> sudo -u yarn yarn rmadmin -refreshNodes
> {code}
> shows the following entries in the log file:
> {code}
> 2015-07-21 11:14:18,795 INFO org.apache.hadoop.conf.Configuration: found 
> resource yarn-site.xml at file:/etc/hadoop/conf/yarn-site.xml
> 2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting 
> the includes file to 
> 2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting 
> the excludes file to /etc/hadoop/conf/yarn.exclude
> 2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: 
> Refreshing hosts (include/exclude) list
> 2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding 
> bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
> 2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding 
> bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
> 2015-07-21 11:14:18,803 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
> IP=10.0.100.12 OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
> {code}
> The refresh reports success, but the node is not decommissioned. 
> When I use the {{FQDN}} for the host name instead, decommissioning works 
> and I see the following in the RM logs:
> {code}
> 2015-07-21 11:14:43,453 INFO org.apache.hadoop.conf.Configuration: found 
> resource yarn-site.xml at file:/etc/hadoop/conf.LAB-A/yarn-site.xml
> 2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting 
> the includes file to 
> 2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting 
> the excludes file to /etc/hadoop/conf/yarn.exclude
> 2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: 
> Refreshing hosts (include/exclude) list
> 2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Adding 
> bcpc-vm1.example.com to the list of excluded hosts from 
> /etc/hadoop/conf/yarn.exclude
> 2015-07-21 11:14:43,456 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
> IP=10.100.0.11 OPERATION=refreshNodes  TARGET=AdminService     RESULT=SUCCESS
> 2015-07-21 11:14:44,198 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
> Disallowed NodeManager nodeId: bcpc-vm1.example.com:35197 hostname: 
> bcpc-vm1.example.com:35197
> 2015-07-21 11:14:44,198 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating 
> Node bcpc-vm1.example.com:35197 as it is now DECOMMISSIONED
> 2015-07-21 11:14:44,199 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
> bcpc-vm1.example.com:35197 Node Transitioned from RUNNING to DECOMMISSIONED
> 2015-07-21 11:14:44,199 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Removed node bcpc-vm1.example.com:35197 cluster capacity: <memory:618723, 
> vCores:96>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
