Vinod Kumar Vavilapalli created YARN-4726:
---------------------------------------------
Summary: [Umbrella] Allocation reuse for application upgrades
Key: YARN-4726
URL: https://issues.apache.org/jira/browse/YARN-4726
Project: Hadoop YARN
Issue Type: New Feature
Reporter: Vinod Kumar Vavilapalli
See the overview doc at YARN-4692; the relevant sub-section is copied here to
track all related efforts.
Once auto-restart of containers is taken care of (YARN-4725), we need to
address what I believe is the second most important reason for service
containers to restart: upgrades. Once a service is running on YARN, the way
the container allocation lifecycle works, any time the container exits, YARN
will reclaim the resources. During an upgrade, with a multitude of other
applications running in the system, giving up and getting back the resources
allocated to the service is hard to manage. Things like NodeLabels in YARN help
this cause but are not straightforward to use for app-specific use cases.
We need a first-class way of letting an application reuse the same
resource allocation for multiple launches of the processes inside the
container. This is done by decoupling the allocation lifecycle from the process
lifecycle.
The JIRA YARN-1040 initiated this conversation. We need two things here:
- (1) (Task) The ApplicationMaster should be able to use the same
container allocation and issue multiple startContainer requests to the
NodeManager.
- (2) (Task) To support the upgrade of the ApplicationMaster itself, clients
should be able to inform YARN to restart the AM within the same allocation but
with new bits.
The JIRAs YARN-3417 and YARN-4470 talk about the second task above ...
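
As a rough illustration of decoupling the allocation lifecycle from the
process lifecycle, here is a toy model in Java. It is only a sketch and does
not use any YARN API: the names AllocationReuseSketch, Allocation,
startProcess, processExited, release and upgrade are all hypothetical, and the
one-process-per-allocation rule is an assumption made to keep it small.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types or methods are YARN classes or APIs.
final class AllocationReuseSketch {

    /** A granted slice of node resources that stays held until released. */
    static final class Allocation {
        private final String id;
        private final int memoryMb;
        private boolean held = true;
        private String runningVersion = null; // null means no process is running
        private final List<String> launchHistory = new ArrayList<>();

        Allocation(String id, int memoryMb) {
            this.id = id;
            this.memoryMb = memoryMb;
        }

        /** Launch a process inside this allocation (assumed: one at a time). */
        void startProcess(String version) {
            if (!held) {
                throw new IllegalStateException(id + " has been released");
            }
            if (runningVersion != null) {
                throw new IllegalStateException(id + " already runs a process");
            }
            runningVersion = version;
            launchHistory.add(version);
        }

        /** The process exits, but the allocation is NOT given back. */
        void processExited() {
            runningVersion = null;
        }

        /** Only an explicit release returns the resources. */
        void release() {
            runningVersion = null;
            held = false;
        }

        boolean isHeld() { return held; }
        int memoryMb() { return memoryMb; }
        String runningVersion() { return runningVersion; }
        List<String> launchHistory() { return launchHistory; }
    }

    /** Upgrade: stop the old process and start new bits in the same allocation. */
    static void upgrade(Allocation allocation, String newVersion) {
        allocation.processExited();
        allocation.startProcess(newVersion);
    }

    public static void main(String[] args) {
        Allocation allocation = new Allocation("allocation-1", 2048);
        allocation.startProcess("v1");
        upgrade(allocation, "v2");
        System.out.println("held=" + allocation.isHeld()
                + " running=" + allocation.runningVersion()
                + " launches=" + allocation.launchHistory());
    }
}
```

The point of the sketch is that processExited() leaves the allocation held, so
a second startProcess() call can reuse it; in today's model described above,
the exit itself would hand the resources back.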
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)