[
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tao Yang updated YARN-6629:
---------------------------
Description:
I wrote a test case to reproduce another problem on branch-2 and hit a new NPE.
Log:
{code}
FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in handling event type NODE_UPDATE to the Event Dispatcher
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
at org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
at org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply(<generated>)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
at java.lang.Thread.run(Thread.java:745)
{code}
Steps to reproduce, in chronological order:
1. The AM starts and requests 1 container with schedulerRequestKey#1:
ApplicationMasterService#allocate --> CapacityScheduler#allocate -->
SchedulerApplicationAttempt#updateResourceRequests -->
AppSchedulingInfo#updateResourceRequests
(adds schedulerRequestKey#1 to schedulerKeyToPlacementSets)
2. The scheduler allocates 1 container for this request and accepts the proposal.
3. The AM removes this request:
ApplicationMasterService#allocate --> CapacityScheduler#allocate -->
SchedulerApplicationAttempt#updateResourceRequests -->
AppSchedulingInfo#updateResourceRequests -->
AppSchedulingInfo#addToPlacementSets -->
AppSchedulingInfo#updatePendingResources
(removes schedulerRequestKey#1 from schedulerKeyToPlacementSets)
4. The scheduler applies the proposal and tries to deduct the pending resource:
CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply -->
AppSchedulingInfo#allocate
An NPE is thrown at the call
schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey,
type, node);
because schedulerKeyToPlacementSets.get(schedulerRequestKey) now returns null.
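The sequence above can be sketched in isolation as follows. This is a minimal illustration of the failure pattern only, not the YARN code and not the attached patch: the class PlacementSetNpeSketch, the nested PlacementSet type, the integer keys, and the methods addRequest, removeRequest, allocateUnguarded and allocateGuarded are all hypothetical names introduced here. Only the map name schedulerKeyToPlacementSets and the chained get(...).allocate(...) call shape come from the description. The guarded variant shows one possible way to tolerate a request that was removed before the proposal is applied; whether skipping is the right behavior for the scheduler is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the failure pattern; none of this is real YARN code.
class PlacementSetNpeSketch {

    // Hypothetical stand-in for the per-key placement set.
    static final class PlacementSet {
        private int pending;

        PlacementSet(int pending) {
            this.pending = pending;
        }

        void allocate() {
            pending--;
        }

        int getPending() {
            return pending;
        }
    }

    // Same name as the map in the description; key type simplified to Integer.
    private final Map<Integer, PlacementSet> schedulerKeyToPlacementSets =
        new ConcurrentHashMap<>();

    // Step 1: the AM adds a request for the given key.
    void addRequest(int schedulerKey, int containers) {
        schedulerKeyToPlacementSets.put(schedulerKey, new PlacementSet(containers));
    }

    // Step 3: the AM removes the request, which drops the key from the map.
    void removeRequest(int schedulerKey) {
        schedulerKeyToPlacementSets.remove(schedulerKey);
    }

    // Step 4, unguarded: same call shape as the failing line.
    // Throws NullPointerException if the key is no longer in the map.
    void allocateUnguarded(int schedulerKey) {
        schedulerKeyToPlacementSets.get(schedulerKey).allocate();
    }

    // Step 4, guarded (assumed mitigation): skip when the request is gone.
    // Returns true if the pending count was deducted, false otherwise.
    boolean allocateGuarded(int schedulerKey) {
        PlacementSet ps = schedulerKeyToPlacementSets.get(schedulerKey);
        if (ps == null) {
            return false;
        }
        ps.allocate();
        return true;
    }

    public static void main(String[] args) {
        PlacementSetNpeSketch info = new PlacementSetNpeSketch();
        info.addRequest(1, 1);      // step 1: request added
        // step 2: the proposal is accepted (not modeled here)
        info.removeRequest(1);      // step 3: request removed before apply
        try {
            info.allocateUnguarded(1);  // step 4: apply the stale proposal
        } catch (NullPointerException e) {
            System.out.println("unguarded apply failed with NPE");
        }
        System.out.println("guarded apply deducted: " + info.allocateGuarded(1));
    }
}
```

The guarded method reads the map once into a local variable, so the null check and the allocate call see the same value even if another thread removes the key in between.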
> NPE occurred when container allocation proposal is applied but its resource
> requests are removed before
> -------------------------------------------------------------------------------------------------------
>
> Key: YARN-6629
> URL: https://issues.apache.org/jira/browse/YARN-6629
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0, 3.0.0-alpha2
> Reporter: Tao Yang
> Assignee: Tao Yang
> Attachments: YARN-6629.001.patch