[ 
https://issues.apache.org/jira/browse/YARN-8330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542327#comment-16542327
 ] 

Wangda Tan commented on YARN-8330:
----------------------------------

Trying to remember this issue, and post it here before forgot again: 
- The issue is caused by we send container information to ATS inside 
RMContainerImpl's constructor, we should not do it: 
{code}
 // If saveNonAMContainerMetaInfo is true, store system metrics for all
 // containers. If false, and if this container is marked as the AM, metrics
 // will still be published for this container, but that calculation happens
 // later.
 if (saveNonAMContainerMetaInfo && null != container.getId()) {
 rmContext.getSystemMetricsPublisher().containerCreated(
 this, this.creationTime);
 }
{code

> An extra container got launched by RM for yarn-service
> ------------------------------------------------------
>
>                 Key: YARN-8330
>                 URL: https://issues.apache.org/jira/browse/YARN-8330
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>            Reporter: Yesha Vora
>            Assignee: Suma Shivaprasad
>            Priority: Critical
>
> Steps:
> launch Hbase tarball app
> list containers for hbase tarball app
> {code}
> /usr/hdp/current/hadoop-yarn-client/bin/yarn container -list 
> appattempt_1525463491331_0006_000001
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/05/04 22:36:11 INFO client.AHSProxy: Connecting to Application History 
> server at xxx/xxx:10200
> 18/05/04 22:36:11 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm2
> Total number of containers :5
> Container-Id            Start Time             Finish Time                   
> State                    Host       Node Http Address                         
>        LOG-URL
> container_e06_1525463491331_0006_01_000002    Fri May 04 22:34:26 +0000 2018  
>                  N/A                 RUNNING    xxx:25454  http://xxx:8042    
> http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_000002/hrt_qa
> 2018-05-04 22:36:11,216|INFO|MainThread|machine.py:167 - 
> run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_000003
>     Fri May 04 22:34:26 +0000 2018                   N/A                 
> RUNNING    xxx:25454  http://xxx:8042    
> http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_000003/hrt_qa
> 2018-05-04 22:36:11,217|INFO|MainThread|machine.py:167 - 
> run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_000001
>     Fri May 04 22:34:15 +0000 2018                   N/A                 
> RUNNING    xxx:25454  http://xxx:8042    
> http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_000001/hrt_qa
> 2018-05-04 22:36:11,217|INFO|MainThread|machine.py:167 - 
> run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_000005
>     Fri May 04 22:34:56 +0000 2018                   N/A                 
> RUNNING    xxx:25454  http://xxx:8042    
> http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_000005/hrt_qa
> 2018-05-04 22:36:11,218|INFO|MainThread|machine.py:167 - 
> run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_000004
>     Fri May 04 22:34:56 +0000 2018                   N/A                    
> null    xxx:25454  http://xxx:8042    
> http://xxx:8188/applicationhistory/logs/xxx:25454/container_e06_1525463491331_0006_01_000004/container_e06_1525463491331_0006_01_000004/hrt_qa{code}
> Total expected containers = 4 ( 3 components container + 1 am). Instead, RM 
> is listing 5 containers. 
> container_e06_1525463491331_0006_01_000004 is in null state.
> Yarn service utilized container 02, 03, 05 for component. There is no log 
> available in NM & AM related to container 04. Only one line in RM log is 
> printed
> {code}
> 2018-05-04 22:34:56,618 INFO  rmcontainer.RMContainerImpl 
> (RMContainerImpl.java:handle(489)) - 
> container_e06_1525463491331_0006_01_000004 Container Transitioned from NEW to 
> RESERVED{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to