[
https://issues.apache.org/jira/browse/YARN-11732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885650#comment-17885650
]
ASF GitHub Bot commented on YARN-11732:
---------------------------------------
TaoYang526 commented on code in PR #7065:
URL: https://github.com/apache/hadoop/pull/7065#discussion_r1779890901
##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java:
##########
@@ -757,10 +757,9 @@ private void completeOustandingUpdatesWhichAreReserved(
RMContainer rmContainer, ContainerStatus containerStatus,
RMContainerEventType event) {
N schedulerNode = getSchedulerNode(rmContainer.getNodeId());
- if (schedulerNode != null &&
- schedulerNode.getReservedContainer() != null) {
+ if (schedulerNode != null) {
RMContainer resContainer = schedulerNode.getReservedContainer();
- if (resContainer.getReservedSchedulerKey() != null) {
+ if (resContainer != null && resContainer.getReservedSchedulerKey() !=
null) {
Review Comment:
Thanks @zeekling for the review.
I'm not sure why your environment still reported NPE after changing like
that, `resContainer.getReservedSchedulerKey()` won't throw NPE any more with
the previous not-null check: `if resContainer is null`. Could you please attach
some details of the NPE and your changes?
For this change, I think NPE should be fixed instead of be caught in
general.
> Potential NPE when calling SchedulerNode#reservedContainer for
> CapacityScheduler
> --------------------------------------------------------------------------------
>
> Key: YARN-11732
> URL: https://issues.apache.org/jira/browse/YARN-11732
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.4.0, 3.2.4, 3.3.6, 3.5.0
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Labels: pull-request-available
>
> I found some places calling *SchedulerNode#getReservedContainer* to get
> reservedContainer (returned value) but not do sanity(not-null) check before
> calling internal methods of it, which can have a risk to raise
> NullPointerException if it's null.
> Most of these places have a premise that node has reserved container a few
> moments ago, but may getting null by calling
> *SchedulerNode#getReservedContainer* in the next moment, since the
> reservedContainer can be updated to null concurrently in scheduling and
> monitoring(preemption) thread. So that not-null check should be done before
> calling internal methods of the reservedContainer.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]