[
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754447#comment-16754447
]
Hudson commented on YARN-8901:
------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15841 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/15841/])
YARN-8901. Fixed restart policy NEVER/ON_FAILURE with component (eyang: rev
f5a95f7998e110cab81e52acd99b07e13ea9653d)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/component/TestComponentRestartPolicy.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/monitor/TestServiceMonitor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/NeverRestartPolicy.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/OnFailureRestartPolicy.java
> Restart "NEVER" policy does not work with component dependency
> --------------------------------------------------------------
>
> Key: YARN-8901
> URL: https://issues.apache.org/jira/browse/YARN-8901
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.1
> Reporter: Yesha Vora
> Assignee: Suma Shivaprasad
> Priority: Critical
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-8901.1.patch, YARN-8901.2.patch, YARN-8901.3.patch
>
>
> Scenario:
> 1) Launch an application with two components, master and worker, where worker
> depends on master (worker should be launched only after master is
> launched).
> 2) Set restart_policy = NEVER for both master and worker.
> {code:title=sample launch.json}
> {
> "name": "mawo-hadoop-ut",
> "artifact": {
> "type": "DOCKER",
> "id": "xxx"
> },
> "configuration": {
> "env": {
> "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK":
> "hadoop"
> },
> "properties": {
> "docker.network": "hadoop"
> }
> },
> "components": [{
> "dependencies": [],
> "resource": {
> "memory": "2048",
> "cpus": "1"
> },
> "name": "master",
> "run_privileged_container": true,
> "number_of_containers": 1,
> "launch_command": "start master",
> "restart_policy": "NEVER"
> }, {
> "dependencies": ["master"],
> "resource": {
> "memory": "8072",
> "cpus": "1"
> },
> "name": "worker",
> "run_privileged_container": true,
> "number_of_containers": 10,
> "launch_command": "start worker",
> "restart_policy": "NEVER"
> }],
> "lifetime": -1,
> "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker
> component. It gets stuck with the message below.
> {code}
> 2018-10-17 15:11:58,560 [Component dispatcher] INFO component.Component -
> [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO instance.ComponentInstance -
> [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_000002]
> Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO component.Component -
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances
> are ready or the dependent component has not completed
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO component.Component -
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances
> are ready or the dependent component has not completed
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO component.Component -
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances
> are ready or the dependent component has not completed
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO component.Component -
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances
> are ready or the dependent component has not completed
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO component.Component -
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances
> are ready or the dependent component has not completed
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO component.Component -
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances
> are ready or the dependent component has not completed {code}
> The 'NEVER' restart policy expects the master component to finish before
> starting workers, but the master component cannot finish the job without
> workers. Thus, it creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed to allow worker
> components to be launched as soon as the master component is in the READY
> state.
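
To make the deadlock described above concrete, here is a minimal, self-contained sketch of the two dependency checks. It is not the code from the committed patch: the class `DependencyCheckSketch`, the enum `InstanceState`, and both method names are hypothetical, and treating FAILED as "finished" is an assumption made only for illustration. The first check models the behaviour reported in the issue (a dependency is satisfied only once every instance has finished); the second models the requested behaviour (READY instances also count).

```java
import java.util.Arrays;
import java.util.List;

public class DependencyCheckSketch {

    /** Hypothetical instance states, not the real YARN service enum. */
    enum InstanceState { INIT, STARTED, READY, SUCCEEDED, FAILED }

    /**
     * Behaviour described in the report: the dependency is satisfied only
     * when the desired number of instances have finished (assumption:
     * "finished" means SUCCEEDED or FAILED).
     */
    static boolean satisfiedOnlyWhenFinished(List<InstanceState> instances, int desired) {
        long finished = instances.stream()
            .filter(s -> s == InstanceState.SUCCEEDED || s == InstanceState.FAILED)
            .count();
        return finished >= desired;
    }

    /**
     * Requested behaviour: an instance that is READY also counts, so a
     * dependent component can start while the dependency is still running.
     */
    static boolean satisfiedWhenReadyOrFinished(List<InstanceState> instances, int desired) {
        long readyOrFinished = instances.stream()
            .filter(s -> s == InstanceState.READY
                || s == InstanceState.SUCCEEDED
                || s == InstanceState.FAILED)
            .count();
        return readyOrFinished >= desired;
    }

    public static void main(String[] args) {
        // One master instance that is READY but cannot finish without workers.
        List<InstanceState> master = Arrays.asList(InstanceState.READY);

        // Workers are never launched under the first check.
        System.out.println(satisfiedOnlyWhenFinished(master, 1));    // prints "false"
        // Workers may launch under the second check.
        System.out.println(satisfiedWhenReadyOrFinished(master, 1)); // prints "true"
    }
}
```

With a single READY master instance, the first check stays false indefinitely, which matches the repeating "Dependency master not satisfied" log lines, while the second check lets the worker component proceed.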
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)