[ 
https://issues.apache.org/jira/browse/YARN-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-5092:
-----------------------------
    Attachment: YARN-5092.001.patch

There are two unrelated problems here.

The class cast exception occurs when the two tests run in a particular 
order.  The setup method configures the fair scheduler, but one of the tests 
ignores the setup conf and uses a default one.  As a result, one test was 
running with the fair scheduler and the other with the capacity scheduler.  If 
the capacity scheduler test runs first, the QueueMetrics will be initialized 
with a CSQueueMetrics.  Later, when the fair scheduler tries to reuse the 
already existing queue metric, the cast fails because it's the wrong type.  
I fixed this by having both tests use the same base config, and I also cleared 
out the queue metrics between tests just for good measure.
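
To illustrate the ordering problem described above, here is a minimal sketch 
of a statically cached metrics object being reused under the wrong type.  All 
class and method names below are hypothetical stand-ins, not the actual Hadoop 
classes; the only assumption taken from the description is that metrics are 
cached by queue name and survive from one test to the next.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for a shared metrics cache; not the Hadoop classes.
class QueueMetricsCacheSketch {
    static class BaseMetrics { }
    static class CapacityMetrics extends BaseMetrics { }
    static class FairMetrics extends BaseMetrics { }

    // Static cache keyed by queue name; it outlives any single test.
    static final Map<String, BaseMetrics> CACHE = new HashMap<>();

    static BaseMetrics capacityForQueue(String name) {
        return CACHE.computeIfAbsent(name, n -> new CapacityMetrics());
    }

    // Reuses whatever is cached and casts it, which fails if another
    // scheduler type registered the entry first.
    static FairMetrics fairForQueue(String name) {
        return (FairMetrics) CACHE.computeIfAbsent(name, n -> new FairMetrics());
    }

    static void clear() {
        CACHE.clear();
    }

    // "Capacity test" runs first, then the "fair test" hits the stale entry.
    static boolean fairLookupFailsAfterCapacity() {
        clear();
        capacityForQueue("root");
        try {
            fairForQueue("root");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Clearing the cache between tests removes the order dependence.
    static boolean fairLookupWorksAfterClear() {
        clear();
        capacityForQueue("root");
        clear();
        return fairForQueue("root") != null;
    }

    public static void main(String[] args) {
        System.out.println("fails without clear: " + fairLookupFailsAfterCapacity());
        System.out.println("works after clear:   " + fairLookupWorksAfterClear());
    }
}
```

In this sketch, either fix on its own (one scheduler type for both tests, or 
clearing the cache between tests) is enough to avoid the exception.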

The rolling master key failure is triggered because there's a small, benign 
window of time in the AbstractDelegationTokenSecretManager where a master key 
has had its expiry updated but the update is not in the state store yet.  The 
test occasionally catches this window, and because DelegationKey uses the 
expiration date in its hashcode and equals methods, the contains method on the 
set of delegation keys fails to find it.  As [~daryn] pointed out to me 
offline, arguably DelegationKey should not be using the expiration date for 
hashcode/equals.  However, there's a lot of code using and deriving from 
DelegationKey, so removing it is a somewhat risky change.  Instead I 
updated the unit test to check for a matching key ID in the state store rather 
than relying on the contains method.
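
The effect of including a mutable expiration date in hashcode/equals can be 
reproduced in isolation.  The sketch below uses a hypothetical Key class, not 
the real DelegationKey, and assumes the state store holds a copy with the old 
expiry while the secret manager's copy has already been updated.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key type whose equality includes a mutable expiry date.
class ExpiryInEqualsSketch {
    static final class Key {
        final int keyId;
        long expiryDate;

        Key(int keyId, long expiryDate) {
            this.keyId = keyId;
            this.expiryDate = expiryDate;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return keyId == other.keyId && expiryDate == other.expiryDate;
        }

        @Override
        public int hashCode() {
            return Objects.hash(keyId, expiryDate);
        }
    }

    // Lookup by ID only, ignoring the expiry date.
    static boolean containsKeyId(Set<Key> keys, int keyId) {
        for (Key k : keys) {
            if (k.keyId == keyId) {
                return true;
            }
        }
        return false;
    }

    // The store still has the old expiry; the manager's copy was updated.
    static boolean containsSeesUpdatedKey() {
        Set<Key> store = new HashSet<>();
        store.add(new Key(1, 1000L));
        Key inManager = new Key(1, 1000L);
        inManager.expiryDate = 2000L;
        return store.contains(inManager);
    }

    static boolean idLookupSeesUpdatedKey() {
        Set<Key> store = new HashSet<>();
        store.add(new Key(1, 1000L));
        Key inManager = new Key(1, 1000L);
        inManager.expiryDate = 2000L;
        return containsKeyId(store, inManager.keyId);
    }

    public static void main(String[] args) {
        System.out.println("contains():   " + containsSeesUpdatedKey());
        System.out.println("match by ID:  " + idLookupSeesUpdatedKey());
    }
}
```

Matching on the key ID is insensitive to the window in which the two copies 
disagree about the expiry, which is why the test check was changed that way.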

If we don't update the DelegationKey hashcode/equals then there should be a 
follow-up JIRA to fix the MemoryRMStateStore, as it currently leaks delegation 
keys as they roll.  The key's expiration date gets updated, and the state store 
can then no longer find the key in its set of keys to remove it.  A simple fix 
is to store the keys by key ID, as the other RM state store implementations and 
the secret manager itself already do.
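
A rough sketch of the leak and of the suggested fix, again with a hypothetical 
Key class rather than the real DelegationKey or MemoryRMStateStore code: 
removal from a set silently does nothing once the expiry has changed, while a 
map keyed by key ID is unaffected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical comparison of set-based and ID-keyed storage of rolling keys.
class KeyStoreByIdSketch {
    static final class Key {
        final int keyId;
        final long expiryDate;

        Key(int keyId, long expiryDate) {
            this.keyId = keyId;
            this.expiryDate = expiryDate;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return keyId == other.keyId && expiryDate == other.expiryDate;
        }

        @Override
        public int hashCode() {
            return Objects.hash(keyId, expiryDate);
        }
    }

    // Set-based store: the removal request carries the updated expiry,
    // so it does not equal the stored entry and nothing is removed.
    static int sizeAfterRemoveFromSet() {
        Set<Key> keys = new HashSet<>();
        keys.add(new Key(7, 1000L));
        Key expired = new Key(7, 2000L);
        keys.remove(expired);
        return keys.size();
    }

    // ID-keyed store: removal depends only on the key ID.
    static int sizeAfterRemoveFromMap() {
        Map<Integer, Key> keys = new HashMap<>();
        keys.put(7, new Key(7, 1000L));
        Key expired = new Key(7, 2000L);
        keys.remove(expired.keyId);
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println("set size after remove: " + sizeAfterRemoveFromSet());
        System.out.println("map size after remove: " + sizeAfterRemoveFromMap());
    }
}
```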

> TestRMDelegationTokens fails intermittently 
> --------------------------------------------
>
>                 Key: YARN-5092
>                 URL: https://issues.apache.org/jira/browse/YARN-5092
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: test
>            Reporter: Rohith Sharma K S
>         Attachments: YARN-5092.001.patch
>
>
> In build 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/11476/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_101.txt]
>  , TestRMDelegationTokens fails for 2 test cases
> # TestRMDelegationTokens.testRMDTMasterKeyStateOnRollingMasterKey
> # TestRMDelegationTokens.testRemoveExpiredMasterKeyInRMStateStore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

