[ https://issues.apache.org/jira/browse/YARN-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916290#comment-16916290 ]
Tao Yang commented on YARN-9714:
--------------------------------
The TestZKRMStateStore#testZKRootPathAcls UT failure is caused by the test
itself: the stateStore (ZKRMStateStore instance) used for verification is not
updated after the RM HA transition. I will attach a v4 patch to fix this UT
problem.
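To illustrate the kind of test problem described above, here is a minimal sketch. It assumes the RM hands out a new state store instance after an HA transition, so a reference captured before the transition goes stale. FakeRm and FakeStore are hypothetical stand-ins, not Hadoop classes, and this is not the actual v4 patch.

```java
public class StaleStoreReferenceSketch {

    /** Hypothetical stand-in for a state store that can be closed. */
    static class FakeStore {
        private boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    /** Hypothetical stand-in for an RM that replaces its store on failover. */
    static class FakeRm {
        private FakeStore store = new FakeStore();
        FakeStore getStore() { return store; }
        // Assumption: going to standby closes the current store.
        void transitionToStandby() { store.close(); }
        // Assumption: going back to active creates a brand-new store.
        void transitionToActive() { store = new FakeStore(); }
    }

    public static void main(String[] args) {
        FakeRm rm = new FakeRm();
        FakeStore stale = rm.getStore();   // captured before the transition
        rm.transitionToStandby();
        rm.transitionToActive();
        FakeStore fresh = rm.getStore();   // re-fetched after the transition

        System.out.println("stale closed: " + stale.isClosed()); // true
        System.out.println("fresh closed: " + fresh.isClosed()); // false
    }
}
```

A test that keeps verifying against the stale reference would fail once the old store is closed on standby, so the reference has to be re-fetched after the transition.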
> ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby
> -----------------------------------------------------------------------------
>
> Key: YARN-9714
> URL: https://issues.apache.org/jira/browse/YARN-9714
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Labels: memory-leak
> Attachments: YARN-9714.001.patch, YARN-9714.002.patch,
> YARN-9714.003.patch
>
>
> Recently the RM ran into full GCs in one of our clusters. After investigating
> the memory dump and jstack output, I found two places in the RM that may leak
> memory after the RM transitions to standby:
> # The release cache cleanup timer in AbstractYarnScheduler is never canceled.
> # The ZooKeeper connection in ZKRMStateStore is never closed.
> To fix these leaks, we should close the connection and cancel the timer when
> the services are stopping.
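The cleanup proposed above can be sketched as follows. This is a minimal illustration only: LeakFreeService and FakeConnection are hypothetical names standing in for a service that owns a timer and a ZooKeeper-like connection, and the code is not taken from the attached patches.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class StandbyCleanupSketch {

    /** Hypothetical stand-in for a ZooKeeper client handle. */
    static class FakeConnection implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        @Override public void close() { closed.set(true); }
        boolean isClosed() { return closed.get(); }
    }

    /** Hypothetical service that owns a cleanup timer and a connection. */
    static class LeakFreeService {
        private Timer cleanupTimer;
        private FakeConnection connection;

        void start() {
            // Daemon timer, so it never blocks JVM exit on its own.
            cleanupTimer = new Timer("cleanup-timer", true);
            cleanupTimer.schedule(new TimerTask() {
                @Override public void run() {
                    // Periodic cache cleanup would go here.
                }
            }, 1000L, 1000L);
            connection = new FakeConnection();
        }

        /** Releases both resources; safe to call more than once. */
        void stop() {
            if (cleanupTimer != null) {
                cleanupTimer.cancel();   // terminates the timer thread
                cleanupTimer = null;
            }
            if (connection != null) {
                connection.close();      // releases the connection
                connection = null;
            }
        }

        Timer timer() { return cleanupTimer; }
        FakeConnection connection() { return connection; }
    }
}
```

Without the cancel() and close() calls in stop(), each start/stop cycle would leave behind one live timer thread and one open connection, which matches the leak pattern described in the report.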
--
This message was sent by Atlassian Jira
(v8.3.2#803003)