[
https://issues.apache.org/jira/browse/YARN-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855155#comment-16855155
]
Ahmed Hussein commented on YARN-9597:
-------------------------------------
Previous Jira reports make it clear that memory leaks are a recurring
problem.
My concern is how many Hadoop modules contain data structures that silently
grow without bound. If this is a common issue, would it be useful to
investigate test cases or tools to detect memory leaks in Hadoop?
[[email protected]] are you aware of ongoing efforts in that direction?
> Memory efficiency in speculator
> --------------------------------
>
> Key: YARN-9597
> URL: https://issues.apache.org/jira/browse/YARN-9597
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Priority: Minor
>
> The data structures in the speculator and runtime estimator grow without
> bound. Data elements (taskID, TA-ID, task stats, tasks speculated, tasks
> finished, etc.) are added to the concurrent maps but never removed.
> For long-running jobs, this causes a couple of issues:
> # memory leakage: the speculator's memory usage increases over time.
> # performance: keeping large structures in the heap degrades performance
> due to poor locality and cache misses.
> *Suggested Fixes:*
> - When a TA transitions to {{MoveContainerToSucceededFinishingTransition}},
> the TA notifies the speculator. The latter handles the event by cleaning up
> its internal structures accordingly.
> - When a task transitions to failed/killed, the speculator is notified to
> clean up its internal data structures.
>
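
The suggested fixes above could be sketched roughly as follows. This is only
an illustration of the cleanup idea, not the actual speculator code: the class
name, the method names, and the use of plain strings as task and attempt IDs
are all hypothetical, and the real speculator tracks more state than shown.
The point is that every map entry added when an attempt starts has a matching
removal when the attempt or its task reaches a terminal state.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: names and types are illustrative, not Hadoop's classes.
class SpeculatorStateSketch {
    // Attempt ID -> start time; stands in for per-attempt statistics.
    private final Map<String, Long> attemptStartTimes = new ConcurrentHashMap<>();
    // Task ID -> IDs of its currently tracked attempts.
    private final Map<String, Set<String>> attemptsByTask = new ConcurrentHashMap<>();
    // Tasks for which a speculative attempt has been launched.
    private final Set<String> speculatedTasks = ConcurrentHashMap.newKeySet();

    void onAttemptStarted(String taskId, String attemptId, long startTimeMs) {
        attemptStartTimes.put(attemptId, startTimeMs);
        attemptsByTask
            .computeIfAbsent(taskId, k -> ConcurrentHashMap.newKeySet())
            .add(attemptId);
    }

    void onTaskSpeculated(String taskId) {
        speculatedTasks.add(taskId);
    }

    // Called when a single attempt reaches a terminal state.
    void onAttemptFinished(String taskId, String attemptId) {
        attemptStartTimes.remove(attemptId);
        // Returning null from the remapping function drops the task entry
        // once its last tracked attempt is gone.
        attemptsByTask.computeIfPresent(taskId, (k, attempts) -> {
            attempts.remove(attemptId);
            return attempts.isEmpty() ? null : attempts;
        });
    }

    // Called when the whole task succeeds, fails, or is killed.
    void onTaskFinished(String taskId) {
        Set<String> attempts = attemptsByTask.remove(taskId);
        if (attempts != null) {
            for (String attemptId : attempts) {
                attemptStartTimes.remove(attemptId);
            }
        }
        speculatedTasks.remove(taskId);
    }

    // Total number of tracked elements; useful for checking that state
    // shrinks back after tasks complete.
    int trackedEntries() {
        int count = attemptStartTimes.size() + speculatedTasks.size();
        for (Set<String> attempts : attemptsByTask.values()) {
            count += attempts.size();
        }
        return count;
    }
}
```

A counter such as trackedEntries() could also serve as a simple leak check in
a unit test: after all tasks of a simulated job have finished, the count
should return to zero.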
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)