[ https://issues.apache.org/jira/browse/YARN-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Susheel Gupta reassigned YARN-7548:
-----------------------------------

    Assignee: Susheel Gupta  (was: Szilard Nemeth)

> TestCapacityOverTimePolicy.testAllocation is flaky
> --------------------------------------------------
>
>                 Key: YARN-7548
>                 URL: https://issues.apache.org/jira/browse/YARN-7548
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: reservation system
>    Affects Versions: 3.0.0-beta1
>            Reporter: Haibo Chen
>            Assignee: Susheel Gupta
>            Priority: Major
>
> *Reported at: 15/Nov/18 20:32*
> The test failed in the Jenkins jobs for both YARN-7337 and YARN-6921.
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration 90,000,000, height 0.25, numSubmission 1, periodic 86400000)]
> *Stacktrace*
> {code:java}
> junit.framework.AssertionFailedError: null
>  at junit.framework.Assert.fail(Assert.java:55)
>  at junit.framework.Assert.fail(Assert.java:64)
>  at junit.framework.TestCase.fail(TestCase.java:235)
>  at org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
>  at org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
> {code}
> *Standard Output*
> {code:java}
> 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore (RMStateStore.java:transition(538)) - Storing reservation allocation.reservation_-9026698577416205920_6337917439559340517
> 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore (MemoryRMStateStore.java:storeReservationState(247)) - Storing reservationallocation for reservation_-9026698577416205920_6337917439559340517 for plan dedicated
> 2017-11-20 23:57:03,760 INFO [main] reservation.InMemoryPlan (InMemoryPlan.java:addReservation(373)) - Successfully added reservation: reservation_-9026698577416205920_6337917439559340517 to plan.
>  In-memory Plan: Parent Queue: dedicatedTotal Capacity: <memory:1024000, vCores:1000>Step: 1000reservation_-9026698577416205920_6337917439559340517 user:u1 startTime: 0 endTime: 86400000 Periodiciy: 86400000 alloc:
>  [Period: 86400000
>  0: <memory:256000, vCores:250>
>  3423748: <memory:0, vCores:0>
>  86223748: <memory:256000, vCores:250>
>  86400000: <memory:0, vCores:0>
>  9223372036854775807: null
>  ]
> {code}
> *Reported at: 21/Feb/24*
> Ran the TestCapacityOverTimePolicy test case locally 100 times in a row; it 
> failed 5 times with the error below:
> [INFO] Running org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
> [ERROR] Tests run: 30, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.503 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
> [ERROR] testAllocation[Duration 60,000, height 0.25, numSubmission 3, periodic 7200000)](org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy)  Time elapsed: 0.009 s  <<< ERROR!
> org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningQuotaException: Integral (avg over time) quota capacity 0.25 over a window of 86400 seconds, would be exceeded by accepting reservation: reservation_-7619846766601560789_3793931544284185119
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy.validate(CapacityOverTimePolicy.java:206)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.InMemoryPlan.addReservation(InMemoryPlan.java:348)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:141)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>         at org.junit.runners.Suite.runChild(Suite.java:128)
>         at org.junit.runners.Suite.runChild(Suite.java:27)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>         at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>         at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>         at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>         at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>         at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>         at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>         at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>         at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> Caused by: org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException: RLESparseResourceAllocation: merge failed as the resulting RLESparseResourceAllocation would be negative, when testing: (-9223372036768375809=<memory:545778, vCores:533>) > (-172800000=<memory:256000, vCores:250>)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.RLESparseResourceAllocation.combineValue(RLESparseResourceAllocation.java:462)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.RLESparseResourceAllocation.merge(RLESparseResourceAllocation.java:353)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.RLESparseResourceAllocation.merge(RLESparseResourceAllocation.java:312)
>         at org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy.validate(CapacityOverTimePolicy.java:197)
>         ... 40 more
>  
>  
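A note on the keys in the "Caused by" message above, offered as an observation rather than a confirmed root cause: the first key, -9223372036768375809, is exactly the value Java produces when 86400000 (the period and window length in milliseconds) is added to Long.MAX_VALUE, the sentinel that appears as "9223372036854775807: null" in the earlier plan dump. The second key, -172800000, equals -2 * 86400000. The sketch below only reproduces that arithmetic. The class and method names are hypothetical and are not taken from the Hadoop source; whether CapacityOverTimePolicy or RLESparseResourceAllocation actually performs such an addition is an assumption that would need to be checked against the code.

```java
public class WindowEndOverflow {
    // Values copied from the log output in this issue.
    static final long PERIOD_MS = 86_400_000L;   // period / validation window
    static final long SENTINEL = Long.MAX_VALUE; // 9223372036854775807

    /** Plain long addition: silently wraps around on overflow. */
    static long uncheckedEnd(long start, long window) {
        return start + window;
    }

    /** Saturating addition: clamps to the long range instead of wrapping. */
    static long saturatingEnd(long start, long window) {
        try {
            return Math.addExact(start, window);
        } catch (ArithmeticException e) {
            // Overflow direction follows the sign of the window.
            return window > 0 ? Long.MAX_VALUE : Long.MIN_VALUE;
        }
    }

    public static void main(String[] args) {
        // Wraps to the negative key reported in the PlanningException.
        System.out.println(uncheckedEnd(SENTINEL, PERIOD_MS));
        // Stays at Long.MAX_VALUE.
        System.out.println(saturatingEnd(SENTINEL, PERIOD_MS));
    }
}
```

If the overflow hypothesis holds, a saturating addition such as the hypothetical saturatingEnd would keep the key at Long.MAX_VALUE instead of producing a large negative timestamp. This does not explain why only some runs fail, which may depend on the randomized reservation inputs.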



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
