[ 
https://issues.apache.org/jira/browse/YARN-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455683#comment-16455683
 ] 

Rohith Sharma K S commented on YARN-8215:
-----------------------------------------

This issue occurs when the NM is restarted and containers are recovered. All 
subsequent container publishes are done with port zero. The reason is that the 
NM services are added in the order containerManager, webserver. Because of 
this order, containerManager is started first and webserver second. 
containerManager recovers the application and its containers and publishes 
them into ATSv2. In NMTimelinePublisher, httpAddress is initialized only once, 
and it tries to get the port from the webserver, which returns zero because 
the webserver has not started yet. 

I updated the patch to read the httpPort directly from the config rather than 
from the web service. The same port is used to start the NM web service.
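
The ordering problem described above can be illustrated with a small, 
self-contained sketch. It is not the NodeManager code: the classes WebServer 
and Publisher, the method publishDuringRecovery, and the host "xxx" are 
hypothetical stand-ins, and 8042 is simply the port seen in the one correct 
entry of the reported output. The sketch only shows why an address that is 
cached once, before the web server has started, stays at port zero, and why 
reading the port from configuration avoids that.

```java
import java.util.function.IntSupplier;

// Illustrative model only; none of these types exist in Hadoop.
class HttpAddressOrderingSketch {

    /** Stand-in for the web server: its port is known only after start(). */
    static final class WebServer {
        private final int configuredPort;
        private int boundPort = 0; // zero until the server has started

        WebServer(int configuredPort) {
            this.configuredPort = configuredPort;
        }

        void start() {
            boundPort = configuredPort;
        }

        int getPort() {
            return boundPort;
        }
    }

    /** Stand-in for the publisher: the HTTP address is computed only once. */
    static final class Publisher {
        private final String host;
        private final IntSupplier portSource;
        private String httpAddress; // cached on first publish

        Publisher(String host, IntSupplier portSource) {
            this.host = host;
            this.portSource = portSource;
        }

        String publish() {
            if (httpAddress == null) {
                httpAddress = host + ":" + portSource.getAsInt();
            }
            return httpAddress;
        }
    }

    /**
     * Simulates a restart: the publisher is used during container recovery,
     * before the web server starts. Returns the address used by a publish
     * that happens after the web server is up.
     */
    static String publishDuringRecovery(boolean portFromConfig) {
        final int configuredPort = 8042; // assumed config value for the sketch
        WebServer webServer = new WebServer(configuredPort);
        IntSupplier portSource =
            portFromConfig ? () -> configuredPort : webServer::getPort;
        Publisher publisher = new Publisher("xxx", portSource);

        publisher.publish();        // recovery publishes first
        webServer.start();          // web server starts second
        return publisher.publish(); // later publishes reuse the cached address
    }

    public static void main(String[] args) {
        System.out.println("port from web server: " + publishDuringRecovery(false));
        System.out.println("port from config:     " + publishDuringRecovery(true));
    }
}
```

In this model, taking the port from the web server yields "xxx:0" for every 
publish after recovery, while taking it from the configured value yields 
"xxx:8042" regardless of start order.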

> Ats v2 returns invalid YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS
> -----------------------------------------------------------------
>
>                 Key: YARN-8215
>                 URL: https://issues.apache.org/jira/browse/YARN-8215
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: ATSv2
>    Affects Versions: 3.1.0
>            Reporter: Yesha Vora
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-8215.01.patch
>
>
> Steps:
> 1) Run Httpd yarn service
> 2) Stop Httpd yarn service
> 3) Validate application attempt page.
> ATS v2 call is returning invalid data for 
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS
> {code:java}
> http://xxx:8198/ws/v2/timeline/apps/application_1524698886838_0005/entities/YARN_CONTAINER?fields=ALL&_=1524705653569{code}
> {code}
> [{"metrics":[{"type":"SINGLE_VALUE","id":"CPU","aggregationOp":"NOP","values":{"1524704571187":0}},{"type":"SINGLE_VALUE","id":"MEMORY","aggregationOp":"NOP","values":{"1524704562126":30973952}}],"events":[{"id":"YARN_CONTAINER_FINISHED","timestamp":1524704571552,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_FINISHED","timestamp":1524704488410,"info":{}},{"id":"YARN_CONTAINER_CREATED","timestamp":1524704482976,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_STARTED","timestamp":1524704482976,"info":{}}],"createdtime":1524704482973,"idprefix":9223370512150292834,"id":"container_e12_1524698886838_0005_01_000003","info":{"YARN_CONTAINER_STATE":"COMPLETE","YARN_CONTAINER_ALLOCATED_HOST":"xxx","YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS":"xxx:0","YARN_CONTAINER_ALLOCATED_VCORE":1,"FROM_ID":"yarn-cluster!hbase!httpd-docker-config-3!1524704463727!application_1524698886838_0005!YARN_CONTAINER!9223370512150292834!container_e12_1524698886838_0005_01_000003","YARN_CONTAINER_ALLOCATED_PORT":25454,"UID":"yarn-cluster!application_1524698886838_0005!YARN_CONTAINER!9223370512150292834!container_e12_1524698886838_0005_01_000003","YARN_CONTAINER_ALLOCATED_MEMORY":1024,"SYSTEM_INFO_PARENT_ENTITY":{"type":"YARN_APPLICATION_ATTEMPT","id":"appattempt_1524698886838_0005_000001"},"YARN_CONTAINER_EXIT_STATUS":-105,"YARN_CONTAINER_ALLOCATED_PRIORITY":"0","YARN_CONTAINER_DIAGNOSTICS_INFO":"[2018-04-26
>  01:02:34.486]Container killed by the ApplicationMaster.\n[2018-04-26 
> 01:02:45.616]Container killed on request. Exit code is 137\n[2018-04-26 
> 01:02:49.387]Container exited with a non-zero exit code 137. 
> \n","YARN_CONTAINER_FINISHED_TIME":1524704571552},"relatesto":{},"configs":{},"isrelatedto":{},"type":"YARN_CONTAINER"},{"metrics":[{"type":"SINGLE_VALUE","id":"CPU","aggregationOp":"NOP","values":{"1524704564690":6}},{"type":"SINGLE_VALUE","id":"MEMORY","aggregationOp":"NOP","values":{"1524704564690":3710976}}],"events":[{"id":"YARN_CONTAINER_FINISHED","timestamp":1524704567244,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_FINISHED","timestamp":1524704487938,"info":{}},{"id":"YARN_CONTAINER_CREATED","timestamp":1524704483140,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_STARTED","timestamp":1524704483140,"info":{}}],"createdtime":1524704482919,"idprefix":9223370512150292888,"id":"container_e12_1524698886838_0005_01_000004","info":{"YARN_CONTAINER_STATE":"COMPLETE","YARN_CONTAINER_ALLOCATED_HOST":"xxx","YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS":"xxx:0","YARN_CONTAINER_ALLOCATED_VCORE":1,"FROM_ID":"yarn-cluster!hbase!httpd-docker-config-3!1524704463727!application_1524698886838_0005!YARN_CONTAINER!9223370512150292888!container_e12_1524698886838_0005_01_000004","YARN_CONTAINER_ALLOCATED_PORT":25454,"UID":"yarn-cluster!application_1524698886838_0005!YARN_CONTAINER!9223370512150292888!container_e12_1524698886838_0005_01_000004","YARN_CONTAINER_ALLOCATED_MEMORY":1024,"SYSTEM_INFO_PARENT_ENTITY":{"type":"YARN_APPLICATION_ATTEMPT","id":"appattempt_1524698886838_0005_000001"},"YARN_CONTAINER_EXIT_STATUS":-105,"YARN_CONTAINER_ALLOCATED_PRIORITY":"1","YARN_CONTAINER_DIAGNOSTICS_INFO":"[2018-04-26
>  01:02:34.500]Container killed by the ApplicationMaster.\n[2018-04-26 
> 01:02:45.771]Container killed on request. Exit code is 137\n[2018-04-26 
> 01:02:47.242]Container exited with a non-zero exit code 137. 
> \n","YARN_CONTAINER_FINISHED_TIME":1524704567244},"relatesto":{},"configs":{},"isrelatedto":{},"type":"YARN_CONTAINER"},{"metrics":[{"type":"SINGLE_VALUE","id":"CPU","aggregationOp":"NOP","values":{"1524704565478":0}},{"type":"SINGLE_VALUE","id":"MEMORY","aggregationOp":"NOP","values":{"1524704562467":30953472}}],"events":[{"id":"YARN_CONTAINER_FINISHED","timestamp":1524704567221,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_FINISHED","timestamp":1524704488211,"info":{}},{"id":"YARN_CONTAINER_CREATED","timestamp":1524704483171,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_STARTED","timestamp":1524704483171,"info":{}}],"createdtime":1524704482918,"idprefix":9223370512150292889,"id":"container_e12_1524698886838_0005_01_000002","info":{"YARN_CONTAINER_STATE":"COMPLETE","YARN_CONTAINER_ALLOCATED_HOST":"xxx","YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS":"xxx:8042","YARN_CONTAINER_ALLOCATED_VCORE":1,"FROM_ID":"yarn-cluster!hbase!httpd-docker-config-3!1524704463727!application_1524698886838_0005!YARN_CONTAINER!9223370512150292889!container_e12_1524698886838_0005_01_000002","YARN_CONTAINER_ALLOCATED_PORT":25454,"UID":"yarn-cluster!application_1524698886838_0005!YARN_CONTAINER!9223370512150292889!container_e12_1524698886838_0005_01_000002","YARN_CONTAINER_ALLOCATED_MEMORY":1024,"SYSTEM_INFO_PARENT_ENTITY":{"type":"YARN_APPLICATION_ATTEMPT","id":"appattempt_1524698886838_0005_000001"},"YARN_CONTAINER_EXIT_STATUS":-105,"YARN_CONTAINER_ALLOCATED_PRIORITY":"0","YARN_CONTAINER_DIAGNOSTICS_INFO":"[2018-04-26
>  01:02:34.509]Container killed by the ApplicationMaster.\n[2018-04-26 
> 01:02:45.776]Container killed on request. Exit code is 137\n[2018-04-26 
> 01:02:47.219]Container exited with a non-zero exit code 137. 
> \n","YARN_CONTAINER_FINISHED_TIME":1524704567221},"relatesto":{},"configs":{},"isrelatedto":{},"type":"YARN_CONTAINER"},{"metrics":[{"type":"SINGLE_VALUE","id":"CPU","aggregationOp":"NOP","values":{"1524704571200":0}},{"type":"SINGLE_VALUE","id":"MEMORY","aggregationOp":"NOP","values":{"1524704553076":461410304}}],"events":[{"id":"YARN_CONTAINER_FINISHED","timestamp":1524704571552,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_FINISHED","timestamp":1524704474786,"info":{}},{"id":"YARN_NM_CONTAINER_LOCALIZATION_STARTED","timestamp":1524704464168,"info":{}},{"id":"YARN_CONTAINER_CREATED","timestamp":1524704464158,"info":{}}],"createdtime":1524704463996,"idprefix":9223370512150311811,"id":"container_e12_1524698886838_0005_01_000001","info":{"YARN_CONTAINER_STATE":"COMPLETE","YARN_CONTAINER_ALLOCATED_HOST":"xxx","YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS":"xxx:0","YARN_CONTAINER_ALLOCATED_VCORE":1,"FROM_ID":"yarn-cluster!hbase!httpd-docker-config-3!1524704463727!application_1524698886838_0005!YARN_CONTAINER!9223370512150311811!container_e12_1524698886838_0005_01_000001","YARN_CONTAINER_ALLOCATED_PORT":25454,"UID":"yarn-cluster!application_1524698886838_0005!YARN_CONTAINER!9223370512150311811!container_e12_1524698886838_0005_01_000001","YARN_CONTAINER_ALLOCATED_MEMORY":1024,"SYSTEM_INFO_PARENT_ENTITY":{"type":"YARN_APPLICATION_ATTEMPT","id":"appattempt_1524698886838_0005_000001"},"YARN_CONTAINER_EXIT_STATUS":0,"YARN_CONTAINER_ALLOCATED_PRIORITY":"0","YARN_CONTAINER_DIAGNOSTICS_INFO":"","YARN_CONTAINER_FINISHED_TIME":1524704571552},"relatesto":{},"configs":{},"isrelatedto":{},"type":"YARN_CONTAINER"}]{code}



