Juan Rodríguez Hortalá created YARN-6483:
--------------------------------------------

             Summary: Add nodes transitioning to DECOMMISSIONING state to the 
list of updated nodes returned by the Resource Manager as a response to the 
Application Master heartbeat
                 Key: YARN-6483
                 URL: https://issues.apache.org/jira/browse/YARN-6483
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: resourcemanager
    Affects Versions: 2.7.3
            Reporter: Juan Rodríguez Hortalá


The DECOMMISSIONING node state is currently used as part of the graceful 
decommissioning mechanism to give time for tasks to complete in a node that is 
scheduled for decommission, and for reducer tasks to read the shuffle blocks in 
that node. YARN also effectively blacklists nodes in DECOMMISSIONING state by 
assigning them a capacity of 0, which prevents additional containers from being 
launched on those nodes, so no more shuffle blocks are written to them. This 
blacklisting is not effective for applications like Spark, because a Spark 
executor running in a YARN container will keep receiving more tasks after the 
corresponding node has been blacklisted at the YARN level. We would like to 
propose a modification of the YARN heartbeat mechanism so nodes transitioning 
to DECOMMISSIONING are added to the list of updated nodes returned by the 
Resource Manager as a response to the Application Master heartbeat. This way a 
Spark application master would be able to blacklist a DECOMMISSIONING node at 
the Spark level.
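
To illustrate the intended behaviour on the application side, here is a minimal, 
language-neutral sketch in Python. It is not YARN or Spark code: NodeState, 
NodeReport, AppLevelBlacklist, on_heartbeat and can_schedule_on are hypothetical 
names that only model the list of updated nodes an Application Master would 
receive in the heartbeat response. It also assumes that a node reported as 
RUNNING again (for example after being recommissioned) should be removed from 
the application-level blacklist.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeState(Enum):
    # Hypothetical subset of node states, modeled for this sketch only.
    RUNNING = "RUNNING"
    DECOMMISSIONING = "DECOMMISSIONING"
    DECOMMISSIONED = "DECOMMISSIONED"


@dataclass(frozen=True)
class NodeReport:
    # Hypothetical stand-in for one entry of the updated-nodes list.
    host: str
    state: NodeState


@dataclass
class AppLevelBlacklist:
    # Hosts the application should stop assigning new tasks to.
    hosts: set = field(default_factory=set)

    def on_heartbeat(self, updated_nodes):
        """Apply the updated-nodes list from one heartbeat response."""
        for report in updated_nodes:
            if report.state in (NodeState.DECOMMISSIONING,
                                NodeState.DECOMMISSIONED):
                # Node is going away: stop sending it new tasks.
                self.hosts.add(report.host)
            elif report.state is NodeState.RUNNING:
                # Assumption: a node reported RUNNING again is usable.
                self.hosts.discard(report.host)
        return self.hosts

    def can_schedule_on(self, host):
        """Return True if new tasks may still be assigned to this host."""
        return host not in self.hosts
```

With the proposed change, a scheduler built along these lines could call 
on_heartbeat with every heartbeat response and consult can_schedule_on before 
handing a task to an executor. Without the change, the DECOMMISSIONING 
transition never appears in the list, so the blacklist would stay empty until 
the node is gone.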



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
