[ https://issues.apache.org/jira/browse/YARN-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907508#comment-16907508 ]
Prabhu Joseph commented on YARN-9290: ------------------------------------- Thanks [~snemeth] for checking this. *Overall design:* When there is a invalid scheduling request made by an application, the allocate will retry for configured retry attempts ({{yarn.resourcemanager.placement-constraints.retry-attempts)}} and then sets the invalid SchedulingRequests in AllocateResponse so that ApplicationMaster is aware. *Changes:* {{AppPlacementAllocator.java}} - Maintains the retry made so far. Added new instance variable placementAttempt to track it. {{SingleConstraintAppPlacementAllocator.java}} - Increments the retry attempt when the scheduling request is invalid. {{AppSchedulingInfo.java}} - Rejects the SchedulingRequest if the retry attempts has exceeded the configured value. {{Allocation.java}} - Maintains the list of rejected SchedulingRequest. Constructors changed to accept the list. {{FiCaSchedulerApp.java}} - Creates Allocation object with the list of rejected scheduling requests. It fetches them from AppSchedulingInfo#getSchedulingRequest. {{DefaultAMSProcessor.java}} - Sets the rejected scheduling requests in AllocateResponse which is passed to AM. {{FairScheduler.java}} - Changed the Allocation constructor call. *Test Cases:* {{TestSchedulingRequestContainerAllocation.java}} - Actual TestCase which verifies the rejection of invalid scheduling request. All Below test classes are added with getYarnConfiguration in the mock RMContext object as AppSchedulingInfo is changed to get the YarnConfigiration from RMContext to read configured {{yarn.resourcemanager.placement-constraints.retry-attempts}}. {{TestAppSchedulingInfo.java}} {{TestSchedulerApplicationAttempt.java}} {{TestLeafQueue.java}} {{TestUtils.java}} {{TestFSAppAttempt.java}} {{TestMaxRunningAppsEnforcer.java}} {{TestQueueManager.java}} {{TestFifoScheduler.java}} > Invalid SchedulingRequest not rejected in Scheduler > PlacementConstraintsHandler > -------------------------------------------------------------------------------- > > Key: YARN-9290 > URL: https://issues.apache.org/jira/browse/YARN-9290 > Project: Hadoop YARN > Issue Type: Bug > Affects Versions: 3.2.0 > Reporter: Prabhu Joseph > Assignee: Prabhu Joseph > Priority: Major > Attachments: YARN-9290-001.patch, YARN-9290-002.patch, > YARN-9290-003.patch, YARN-9290-004.patch, YARN-9290-005.patch, > YARN-9290-006.patch > > > SchedulingRequest with Invalid namespace is not rejected in Scheduler > PlacementConstraintsHandler. RM keeps on trying to allocateOnNode with > logging the exception. This is rejected in case of placement-processor > handler. > {code} > 2019-02-08 16:51:27,548 WARN > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator: > Failed to query node cardinality: > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.InvalidAllocationTagsQueryException: > Invalid namespace prefix: notselfi, valid values are: > all,not-self,app-id,app-tag,self > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.fromString(TargetApplicationsNamespace.java:277) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.parse(TargetApplicationsNamespace.java:234) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.AllocationTags.createAllocationTags(AllocationTags.java:93) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraintExpression(PlacementConstraintsUtil.java:78) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraint(PlacementConstraintsUtil.java:240) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyAndConstraint(PlacementConstraintsUtil.java:272) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:365) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.checkCardinalityAndPending(SingleConstraintAppPlacementAllocator.java:355) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.precheckNode(SingleConstraintAppPlacementAllocator.java:395) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.precheckNode(AppSchedulingInfo.java:779) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.preCheckForNodeCandidateSet(RegularContainerAllocator.java:145) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:837) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:890) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:977) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1173) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:795) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1630) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1624) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1727) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1476) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1312) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1785) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:171) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org