Vinod Kumar Vavilapalli created YARN-4726:
---------------------------------------------

             Summary: [Umbrella] Allocation reuse for application upgrades
                 Key: YARN-4726
                 URL: https://issues.apache.org/jira/browse/YARN-4726
             Project: Hadoop YARN
          Issue Type: New Feature
            Reporter: Vinod Kumar Vavilapalli


See overview doc at YARN-4692, copying the sub-section to track all related 
efforts.

Once auto-restart of containers is taken care of (YARN-4725), we need to 
address what I believe is the second most important reason for service 
containers to restart: upgrades. Once a service is running on YARN, the way 
the container allocation lifecycle works, any time the container exits, YARN 
will reclaim the resources. During an upgrade, with a multitude of other 
applications running in the system, giving up and getting back the resources 
allocated to the service is hard to manage. Things like NodeLabels in YARN 
help this cause but are not straightforward to use to address the 
app-specific use cases.

We need a first-class way of letting an application reuse the same 
resource allocation for multiple launches of the processes inside the 
container. This is done by decoupling the allocation lifecycle from the 
process lifecycle.

The JIRA YARN-1040 initiated this conversation. We need two things here: 
 - (1) (Task) The ApplicationMaster should be able to use the same 
container allocation and issue multiple startContainer requests to the 
NodeManager.
 - (2) (Task) To support the upgrade of the ApplicationMaster itself, clients 
should be able to inform YARN to restart the AM within the same allocation but 
with new bits.
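
To make the decoupling concrete, here is a minimal toy model of the first 
task: one allocation that survives several process launches. This is only a 
sketch of the idea, not YARN code. The Allocation class, its methods, and the 
upgrade() helper are hypothetical names invented for illustration and do not 
correspond to any existing YARN API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Allocation:
    """Toy stand-in for a container allocation (hypothetical, not a YARN class)."""
    container_id: str
    memory_mb: int
    released: bool = False
    launches: List[str] = field(default_factory=list)  # versions launched so far
    running: Optional[str] = None  # version currently running, if any

    def start_process(self, version: str) -> None:
        # Analogous in spirit to a startContainer request against this allocation.
        if self.released:
            raise RuntimeError("allocation already released")
        if self.running is not None:
            raise RuntimeError("a process is already running in this allocation")
        self.running = version
        self.launches.append(version)

    def process_exited(self) -> None:
        # Decoupled model: the process ending does NOT give the resources back.
        self.running = None

    def release(self) -> None:
        # Resources are returned only when the owner explicitly releases them.
        if self.running is not None:
            raise RuntimeError("stop the process before releasing the allocation")
        self.released = True


def upgrade(alloc: Allocation, new_version: str) -> None:
    """Stop the old process and start the new one inside the same allocation."""
    alloc.process_exited()
    alloc.start_process(new_version)


alloc = Allocation(container_id="container_01", memory_mb=2048)
alloc.start_process("v1")
upgrade(alloc, "v2")
```

In this model the allocation is never released during the upgrade, so the 
service does not have to compete with other applications to get its 
resources back.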

The JIRAs YARN-3417 and YARN-4470 talk about the second task above ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
