[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648235#comment-16648235 ]
Misha Dmitriev commented on YARN-8872:
--------------------------------------
I would leave this decision to [~haibochen].
> Optimize collections used by Yarn JHS to reduce its memory
> ----------------------------------------------------------
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Misha Dmitriev
> Assignee: Misha Dmitriev
> Priority: Major
> Attachments: YARN-8872.01.patch, jhs-bad-collections.png
>
>
> We used jxray (www.jxray.com) to analyze a heap dump of a JHS running with a
> large heap in a large cluster that handles large MapReduce jobs. The heap is
> over 32GB, and 21.4% of it is wasted on suboptimal Java collections, mostly
> maps and lists that are either empty or contain only one element. In such
> under-populated collections, a considerable amount of memory is still
> consumed by the internal implementation objects alone. See the attached
> excerpt from the jxray report for details. Collections that are almost
> always empty should be initialized lazily. Collections that almost always
> hold just 1 or 2 elements should be created with an initial capacity of
> 1 or 2 (the default capacity is 16 for HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
> # {{FileSystemCounterGroup.map}} - initialize lazily
> # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks
> only have one or two attempts
> # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity
> # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it
> contains one diagnostic message most of the time
> # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to
> use the more wasteful LinkedList here) and initialize with capacity 1.
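
The three techniques listed above (lazy initialization, a small explicit initial capacity, and ArrayList in place of LinkedList) can be sketched roughly as follows. This is only an illustration under assumptions: TaskRecordSketch and all of its fields and methods are hypothetical names, not the actual Hadoop classes or the contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in class for illustration; not taken from Hadoop.
class TaskRecordSketch {
    // Lazy initialization: the field stays null until the first write,
    // so records that never store a counter allocate no map at all.
    private Map<String, Long> counters;

    // Most records are assumed to hold one or two attempts, so request
    // a capacity of 2 instead of relying on the default.
    private final List<String> attempts = new ArrayList<>(2);

    // ArrayList with capacity 1 instead of LinkedList: no per-element
    // node objects, and room for the single message that is typical.
    private final List<String> diagnostics = new ArrayList<>(1);

    void setCounter(String name, long value) {
        if (counters == null) {
            // Allocate only when the first counter is actually stored.
            counters = new HashMap<>(2);
        }
        counters.put(name, value);
    }

    Long getCounter(String name) {
        return counters == null ? null : counters.get(name);
    }

    // Readers get a shared immutable empty map while nothing is stored.
    Map<String, Long> getCounters() {
        if (counters == null) {
            return Collections.<String, Long>emptyMap();
        }
        return Collections.unmodifiableMap(counters);
    }

    // Exposed only so the sketch can show when allocation happens.
    boolean countersAllocated() {
        return counters != null;
    }

    void addAttempt(String attemptId) {
        attempts.add(attemptId);
    }

    void addDiagnostic(String message) {
        diagnostics.add(message);
    }

    List<String> getAttempts() {
        return Collections.unmodifiableList(attempts);
    }

    List<String> getDiagnostics() {
        return Collections.unmodifiableList(diagnostics);
    }
}
```

The lists still grow automatically if a record ends up with more elements than expected, so the small capacity only changes the starting footprint. The cost of the lazy map is a null check on every access path.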