[ https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844223#comment-16844223 ]
Steve Loughran commented on YARN-9568:
--------------------------------------
I can make this "go away" by rm'ing everything in /tmp/hadoop-yarn-stevel/*
That is: the state of a single shared path can break all unit tests running
locally. Presumably, in production, it can also cause RM startup to fail with
not very meaningful error text.
Proposed:
* init code handles unreadable files somehow
* for the minicluster we don't use a fixed location for the files, as with
parallel test runs it's inevitable that eventually they will end up in a
corrupted state
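
A rough sketch of what the two proposals could look like, using only the JDK.
Everything here is hypothetical: the class IsolatedStoreDir and its methods
create() and readOrEmpty() are illustrative names, not Hadoop APIs, and the
"treat an unreadable file as empty" policy is just one possible reading of
"handles unreadable files somehow".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop; names and policy are assumptions.
final class IsolatedStoreDir {

  private IsolatedStoreDir() {
  }

  /**
   * Create a fresh, uniquely named directory under java.io.tmpdir, so that
   * parallel or repeated test runs never share store files.
   */
  static Path create(String testName) {
    // Keep the prefix free of path separators and other awkward characters.
    String safeName = testName.replaceAll("[^A-Za-z0-9_.-]", "_");
    try {
      return Files.createTempDirectory("yarn-store-" + safeName + "-");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Read a store file line by line. A missing, unreadable or undecodable
   * file yields an empty list instead of an exception, so startup can
   * continue with an empty store.
   */
  static List<String> readOrEmpty(Path file) {
    if (!Files.isReadable(file)) {
      return Collections.emptyList();
    }
    try {
      return Files.readAllLines(file, StandardCharsets.UTF_8);
    } catch (IOException e) {
      // Covers I/O failures and malformed input; real code would log this.
      return Collections.emptyList();
    }
  }
}
```

A test would call IsolatedStoreDir.create(...) once per cluster and hand the
resulting path to whatever configuration property sets the store root before
starting the minicluster. Silently discarding a corrupt file may not be right
for production; logging it and moving the file aside is another option.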
> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> ------------------------------------------------------------------
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager, test
> Affects Versions: 3.3.0
> Environment: macos
> Reporter: Steve Loughran
> Priority: Major
> Attachments: npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's
> surfacing in the S3A committer tests which are trying to create
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't
> come up, failing with an NPE in {{FileSystemNodeAttributeStore.recover}}.
> I'm not sure why node labels are needed on test clusters; the default implies
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to
> say "this is causing the problem".