[
https://issues.apache.org/jira/browse/YARN-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720512#comment-16720512
]
Chandni Singh commented on YARN-9040:
-------------------------------------
[~tarunparimi] the change looks good to me.
There aren't any existing tests for {{KeyValueBasedTimelineStore}} so any
changes made to it cannot be verified by unit tests. We should create tests for
{{KeyValueBasedTimelineStore}} but that doesn't have to be part of this change.
[~rohithsharma] [~eyang] Could you please help review?
> LevelDBCacheTimelineStore in ATS 1.5 leaks native memory
> --------------------------------------------------------
>
> Key: YARN-9040
> URL: https://issues.apache.org/jira/browse/YARN-9040
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
> Affects Versions: 2.8.0
> Reporter: Tarun Parimi
> Assignee: Tarun Parimi
> Priority: Major
> Attachments: YARN-9040.001.patch, YARN-9040.002.patch
>
>
> When LevelDBCacheTimelineStore from YARN-4219 is used as the ATS 1.5 entity
> caching storage, we observe a native memory leak caused by leveldb files,
> even after the fix in YARN-5368.
> Top output shows 0.024TB (about 25GB) RES, even though the heap size is only 8GB.
>
>
> {code:java}
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 25519 yarn 20 0 33.024g 0.024t 41468 S 6.2 26.0 21:07.39 /usr/java/default/bin/java -Dproc_timelineserver -Xmx8192m
> {code}
>
> Lsof shows many open timeline-cache.ldb files that are still referenced by
> ATS even though they have been deleted (DEL); they no longer appear when
> the directory is listed.
>
> {code:java}
> java 25519 yarn DEL REG 253,28 9438452 /var/yarn/timeline/timelineEntityGroupId_1542280269959_55569_dag_1542280269959_55569_2-timeline-cache.ldb/000007.sst
> java 25519 yarn DEL REG 253,28 9438438 /var/yarn/timeline/timelineEntityGroupId_1542280269959_55569_dag_1542280269959_55569_2-timeline-cache.ldb/000007.sst
> java 25519 yarn DEL REG 253,28 9438437 /var/yarn/timeline/timelineEntityGroupId_1542280269959_55569_dag_1542280269959_55569_2-timeline-cache.ldb/000005.sst
> {code}
>
> Looks like LevelDBCacheTimelineStore is not closing these files as the
> LevelDB DBIterator is not closed.
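
As an illustration of the failure mode described above, here is a minimal sketch of an iterator that holds a resource until it is closed. It does not use LevelDB and is not the code from the attached patches: TrackedIterator, openHandles, countLeaky and countSafe are hypothetical names, and the counter only stands in for the native handles and open .sst files that a real DBIterator would pin. The sketch assumes the iterator type is closeable, so that try-with-resources applies.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class IteratorCloseSketch {
    // Hypothetical stand-in for native resources held by open iterators.
    static final AtomicInteger openHandles = new AtomicInteger();

    // Hypothetical iterator that "acquires" a handle on creation and
    // releases it only when close() is called.
    static class TrackedIterator implements Iterator<String>, AutoCloseable {
        private final Iterator<String> delegate;
        private boolean closed;

        TrackedIterator(List<String> data) {
            this.delegate = data.iterator();
            openHandles.incrementAndGet();
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public String next() {
            return delegate.next();
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                openHandles.decrementAndGet();
            }
        }
    }

    // Leaky pattern: the iterator is used and then dropped without close(),
    // so its handle stays open after the method returns.
    static int countLeaky(List<String> data) {
        TrackedIterator it = new TrackedIterator(data);
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    // Safe pattern: try-with-resources calls close() on every exit path,
    // including when an exception is thrown inside the loop.
    static int countSafe(List<String> data) {
        try (TrackedIterator it = new TrackedIterator(data)) {
            int n = 0;
            while (it.hasNext()) {
                it.next();
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c");
        countLeaky(keys);
        System.out.println("open handles after leaky scan: " + openHandles.get());
        int before = openHandles.get();
        countSafe(keys);
        System.out.println("handles added by safe scan: " + (openHandles.get() - before));
    }
}
```

In this sketch every call to countLeaky leaves one more handle open, while countSafe leaves the count unchanged. This matches the symptom reported above only by analogy: the real leak is in native memory and file descriptors, which the Java heap limit does not bound.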