[ https://issues.apache.org/jira/browse/YARN-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284441#comment-16284441 ]
Subru Krishnan commented on YARN-7591:
--------------------------------------
Thanks [~Tao Yang] for the contribution and [~leftnoteasy] for the
review/commit.
[~leftnoteasy], I see the commit in trunk but not in branch-2/2.9, so are you
planning to cherry-pick it down?
> NPE in async-scheduling mode of CapacityScheduler
> -------------------------------------------------
>
> Key: YARN-7591
> URL: https://issues.apache.org/jira/browse/YARN-7591
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.0.0-alpha4, 2.9.1
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Critical
> Attachments: YARN-7591.001.patch, YARN-7591.002.patch
>
>
> Currently, in the async-scheduling mode of CapacityScheduler, an NPE may be
> raised in the following special scenarios.
> (1) A user is removed after its last application finishes, so an NPE may
> be raised if async-scheduling threads read from the user object without a
> null check.
> (2) An NPE may be raised when trying to fulfill a reservation for a finished
> application in {{CapacityScheduler#allocateContainerOnSingleNode}}.
> {code}
> RMContainer reservedContainer = node.getReservedContainer();
> if (reservedContainer != null) {
>   FiCaSchedulerApp reservedApplication = getCurrentAttemptForContainer(
>       reservedContainer.getContainerId());
>   // NPE here: reservedApplication could be null after this application
>   // finished
>   // Try to fulfill the reservation
>   LOG.info(
>       "Trying to fulfill reservation for application " +
>           reservedApplication.getApplicationId() +
>           " on node: " + node.getNodeID());
> {code}
> (3) If proposal1 (allocate containerX on node1) and proposal2 (reserve
> containerY on node1) are generated by different async-scheduling threads
> at around the same time and proposal2 is submitted before proposal1, an NPE
> is raised when trying to submit proposal2 in
> {{FiCaSchedulerApp#commonCheckContainerAllocation}}.
> {code}
> if (reservedContainerOnNode != null) {
>   // NPE here: allocation.getAllocateFromReservedContainer() should be
>   // null for proposal2 in this case
>   RMContainer fromReservedContainer =
>       allocation.getAllocateFromReservedContainer().getRmContainer();
>   if (fromReservedContainer != reservedContainerOnNode) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug(
>           "Try to allocate from a non-existed reserved container");
>     }
>     return false;
>   }
> }
> {code}
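
For readers following the thread: the scenarios in the quoted description share one pattern, where an async-scheduling thread dereferences something that another thread may already have cleared. Below is a minimal sketch of the null guards for cases (2) and (3); case (1) would follow the same shape. It is not the committed patch. The class AsyncNullGuards, the types App and ReservedRef, the liveAttempts map, and both method names are hypothetical stand-ins, not real YARN classes or APIs.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are NOT the real YARN classes.
class AsyncNullGuards {

    /** Stand-in for an application attempt (the role FiCaSchedulerApp plays above). */
    static final class App {
        final String applicationId;
        App(String applicationId) { this.applicationId = applicationId; }
    }

    /** Stand-in for the wrapper returned by getAllocateFromReservedContainer(). */
    static final class ReservedRef {
        final Object rmContainer;
        ReservedRef(Object rmContainer) { this.rmContainer = rmContainer; }
    }

    // containerId -> live attempt. An entry disappears once its application
    // finishes, possibly while an async-scheduling thread still holds the id.
    static final Map<String, App> liveAttempts = new ConcurrentHashMap<>();

    /** Case (2): build the log message only if the application is still alive. */
    static Optional<String> fulfillReservationMessage(String reservedContainerId,
                                                      String nodeId) {
        App reservedApplication = liveAttempts.get(reservedContainerId);
        if (reservedApplication == null) {
            // The application finished in the meantime: skip instead of throwing an NPE.
            return Optional.empty();
        }
        return Optional.of("Trying to fulfill reservation for application "
            + reservedApplication.applicationId + " on node: " + nodeId);
    }

    /** Case (3): reject a proposal that does not allocate from the reserved container. */
    static boolean checkAllocation(Object reservedContainerOnNode,
                                   ReservedRef allocateFrom) {
        if (reservedContainerOnNode != null) {
            if (allocateFrom == null) {
                // The proposal was not generated from a reserved container.
                return false;
            }
            if (allocateFrom.rmContainer != reservedContainerOnNode) {
                // The proposal refers to a different (stale) reserved container.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        liveAttempts.put("container_Y", new App("application_1"));
        System.out.println(
            fulfillReservationMessage("container_Y", "node1").orElse("skipped"));
        liveAttempts.remove("container_Y"); // the application finishes
        System.out.println(
            fulfillReservationMessage("container_Y", "node1").orElse("skipped"));

        Object reserved = new Object();
        System.out.println(checkAllocation(reserved, null));
        System.out.println(checkAllocation(reserved, new ReservedRef(reserved)));
    }
}
```

In both guards the stale proposal is dropped rather than retried, on the assumption that the next scheduling pass will produce a fresh proposal against current state.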