[ https://issues.apache.org/jira/browse/YARN-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902757#comment-14902757 ]
Jason Lowe commented on YARN-4199:
----------------------------------
Have you looked at the rolling leveldb implementation from YARN-3448? One of
its design goals was to solve this same problem.
> Minimize lock time in LeveldbTimelineStore.discardOldEntities
> -------------------------------------------------------------
>
> Key: YARN-4199
> URL: https://issues.apache.org/jira/browse/YARN-4199
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: timelineserver, yarn
> Reporter: Shiwei Guo
>
> In the current implementation, LeveldbTimelineStore.discardOldEntities holds
> the write lock on deleteLock, which blocks other put operations and eventually
> blocks the execution of YARN jobs (e.g. TEZ). When there are many history jobs
> in the timeline store, the blocking time is very long. In our observation, it
> blocked all the TEZ jobs for several hours or longer.
> The possible solutions are:
> - Optimize the leveldb configuration so that a full scan does not take a long
> time.
> - Take a snapshot of leveldb and scan the snapshot, so we only need to hold the
> lock during getSnapshot. One open question is whether taking a snapshot is
> itself slow, because I have no experience with leveldb.
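
The snapshot idea in the second bullet could be sketched roughly as follows. This is only an illustration under stated assumptions: SnapshotDiscardSketch and its methods are hypothetical, an in-memory map stands in for leveldb, and nothing here is the actual LeveldbTimelineStore code. The stand-in "snapshot" is a full copy of the map, which is not how a real leveldb snapshot behaves, so the sketch says nothing about the cost of getSnapshot itself. It only shows the locking shape: hold the write lock while capturing the snapshot, scan without the lock, then delete in small batches that each hold the lock briefly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory stand-in for the timeline store (assumption, not the real API). */
class SnapshotDiscardSketch {
    // entity id -> start time in milliseconds
    private final ConcurrentSkipListMap<String, Long> entities = new ConcurrentSkipListMap<>();
    private final ReentrantReadWriteLock deleteLock = new ReentrantReadWriteLock();

    /** Writers share the read lock, so they only wait while a deleter holds the write lock. */
    void put(String id, long startTime) {
        deleteLock.readLock().lock();
        try {
            entities.put(id, startTime);
        } finally {
            deleteLock.readLock().unlock();
        }
    }

    int size() {
        return entities.size();
    }

    boolean contains(String id) {
        return entities.containsKey(id);
    }

    /** Removes entities older than cutoff; returns how many were removed. */
    int discardOldEntities(long cutoff, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }

        // Step 1: hold the write lock only while capturing the snapshot.
        // Here the snapshot is a copy; that is an artifact of the stand-in.
        Map<String, Long> snapshot;
        deleteLock.writeLock().lock();
        try {
            snapshot = new TreeMap<>(entities);
        } finally {
            deleteLock.writeLock().unlock();
        }

        // Step 2: the full scan runs without any lock, so puts are not blocked.
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : snapshot.entrySet()) {
            if (e.getValue() < cutoff) {
                expired.add(e.getKey());
            }
        }

        // Step 3: delete in batches, taking the write lock briefly for each one.
        int deleted = 0;
        for (int i = 0; i < expired.size(); i += batchSize) {
            List<String> batch = expired.subList(i, Math.min(i + batchSize, expired.size()));
            deleteLock.writeLock().lock();
            try {
                for (String id : batch) {
                    // Re-check: a put after the snapshot may have refreshed this entity.
                    Long current = entities.get(id);
                    if (current != null && current < cutoff) {
                        entities.remove(id);
                        deleted++;
                    }
                }
            } finally {
                deleteLock.writeLock().unlock();
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        SnapshotDiscardSketch store = new SnapshotDiscardSketch();
        store.put("old-1", 100L);
        store.put("new-1", 900L);
        System.out.println("deleted: " + store.discardOldEntities(500L, 10));
    }
}
```

The re-check inside each batch matters because the scan works on a stale view: an entity written after the snapshot was taken must not be deleted on the basis of its old timestamp. The batch size trades lock hold time against the number of lock acquisitions.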
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)