[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451435#comment-16451435
 ] 

Eric Yang edited comment on YARN-7939 at 4/25/18 12:01 AM:
-----------------------------------------------------------

[~csingh] Thank you for patch 10.  It resolved the Invalid event: 
CONTAINER_ALLOCATED issue.  However, I am encountering another issue.  After the 
first container instance completed its upgrade, I issued an upgrade for the 
second container instance and got an error message:

{code}
[hbase@eyang-1 hadoop-3.2.0-SNAPSHOT]$ curl --negotiate -u : -d@/tmp/u2.json -H "Content-Type: application/json" -X PUT http://eyang-2.openstacklocal:8088/app/v1/services/abc/components/ping/component-instances/ping-1
{"diagnostics":"The upgrade of service abc has not been initiated."}
{code}

The status of the application shows:

{code}
[hbase@eyang-1 hadoop-3.2.0-SNAPSHOT]$ ./bin/yarn app -status abc
{"name":"abc","id":"application_1524613547912_0001","lifetime":-1,"components":[{"name":"ping","dependencies":[],"resource":{"cpus":1,"memory":"256","additional":{}},"state":"STABLE","configuration":{"properties":{},"env":{},"files":[]},"quicklinks":[],"containers":[{"id":"container_1524613547912_0001_01_000003","ip":"172.26.111.21","hostname":"eyang-4.openstacklocal","state":"NEEDS_UPGRADE","launch_time":1524613644367,"bare_host":"eyang-4.openstacklocal","component_instance_name":"ping-1"},{"id":"container_1524613547912_0001_01_000004","ip":"172.26.111.21","hostname":"eyang-4.openstacklocal","state":"READY","launch_time":1524613717682,"bare_host":"eyang-4.openstacklocal","component_instance_name":"ping-0"}],"launch_command":"sleep 1200000","number_of_containers":2,"run_privileged_container":false}],"configuration":{"properties":{},"env":{},"files":[]},"state":"STABLE","quicklinks":{},"version":"v1","kerberos_principal":{"principal_name":"hbase/_h...@example.com","keytab":"file:///etc/security/keytabs/hbase.service.keytab"}}
{code}

The service state is STABLE instead of UPGRADING, even though ping-1 is still 
in NEEDS_UPGRADE.  At this point, I cannot continue the upgrade or finalize 
it.  It appears that AM transition logic may set service state to STABLE 
prematurely.
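
The status output above (service STABLE while ping-1 is still NEEDS_UPGRADE) 
suggests a condition the AM could check before leaving the UPGRADING state.  
The following is only a minimal sketch under that assumption, not the actual 
service AM code: the enum and class names are hypothetical and model just the 
container states visible in the status output.

```java
import java.util.List;

// Hypothetical model of the container states seen in the status output;
// not the YARN service API.
enum InstanceState { READY, NEEDS_UPGRADE, UPGRADING }

class UpgradeInvariant {
  // Returns true only when no instance is still waiting for, or in the
  // middle of, an upgrade. Whether the service then goes back to STABLE
  // automatically or only on an explicit finalize is left open here.
  static boolean mayLeaveUpgrading(List<InstanceState> instances) {
    for (InstanceState state : instances) {
      if (state != InstanceState.READY) {
        return false;
      }
    }
    return true;
  }
}
```

With the two containers from the status output (ping-1 in NEEDS_UPGRADE, 
ping-0 in READY), this check returns false, so the service would be expected 
to stay in UPGRADING.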

Additional error check logic is recommended to prevent users from calling 
component instance upgrade when service upgrade has not been triggered.
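
A rough sketch of the kind of check meant here, assuming the service state is 
available at the point where the component instance upgrade request is 
handled.  All names are hypothetical and this is not the actual YARN service 
code; the message text simply reuses the diagnostics string shown in the curl 
output above.

```java
// Hypothetical names throughout; a sketch of the recommended check,
// not the YARN service API.
enum ServiceState { STABLE, UPGRADING }

class UpgradeGuard {
  // Rejects a component instance upgrade unless the service is UPGRADING.
  static void checkInstanceUpgradeAllowed(String serviceName, ServiceState state) {
    if (state != ServiceState.UPGRADING) {
      throw new IllegalStateException(
          "The upgrade of service " + serviceName + " has not been initiated.");
    }
  }
}
```

Called with the state from the status output above (STABLE), the guard throws 
and the instance upgrade is never attempted.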


> Yarn Service Upgrade: add support to upgrade a component instance 
> ------------------------------------------------------------------
>
>                 Key: YARN-7939
>                 URL: https://issues.apache.org/jira/browse/YARN-7939
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Major
>         Attachments: YARN-7939.001.patch, YARN-7939.002.patch, 
> YARN-7939.003.patch, YARN-7939.004.patch, YARN-7939.005.patch, 
> YARN-7939.006.patch, YARN-7939.007.patch, YARN-7939.008.patch, 
> YARN-7939.009.patch, YARN-7939.010.patch, serviceam.log, upgrade_logs.tgz
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  


