[ 
https://issues.apache.org/jira/browse/YARN-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855155#comment-16855155
 ] 

Ahmed Hussein commented on YARN-9597:
-------------------------------------

Looking at previous Jira reports, it is clear that memory leaks are a 
recurring problem.

My concern is how many Hadoop modules contain data structures that grow 
without bound. If this is a common issue, would it be useful to investigate 
test cases or tools for detecting memory leaks in Hadoop?

[[email protected]] are you aware of ongoing efforts in that direction?

> Memory efficiency in speculator 
> --------------------------------
>
>                 Key: YARN-9597
>                 URL: https://issues.apache.org/jira/browse/YARN-9597
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ahmed Hussein
>            Priority: Minor
>
> The data structures in the speculator and runtime-estimator are bloating. Data 
> elements such as (taskID, TA-ID, task stats, tasks speculated, tasks 
> finished, etc.) are added to the concurrent maps but never removed.
> For long-running jobs, there are a couple of issues:
>  # memory leakage: the speculator's memory usage increases over time. 
>  # performance: keeping large structures in the heap degrades performance 
> due to poor locality and cache misses.
> *Suggested Fixes:*
> - When a TA transitions to {{MoveContainerToSucceededFinishingTransition}}, 
> the TA notifies the speculator. The latter handles the event by cleaning up 
> its internal structures accordingly.
> - When a task transitions to failed/killed, the speculator is notified to 
> clean up its internal data structures.
>  
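
The suggested fixes above could be sketched roughly as follows. This is a minimal illustration only, not the actual Hadoop speculator code: the class SpeculatorStateSketch, its two maps, and the method names onAttemptStatusUpdate, onAttemptFinished and trackedEntries are all hypothetical stand-ins, and task/attempt IDs are plain strings for simplicity. It shows entries being removed from the concurrent maps when an attempt reaches a terminal state, instead of being kept for the lifetime of the job.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the speculator's bookkeeping; not the Hadoop class. */
public class SpeculatorStateSketch {
    // attempt ID -> last reported progress (assumed structure, for illustration)
    private final Map<String, Float> attemptProgress = new ConcurrentHashMap<>();
    // task ID -> IDs of attempts that are still running (assumed structure)
    private final Map<String, Set<String>> runningAttempts = new ConcurrentHashMap<>();

    /** Called on every status update: this is where the maps grow. */
    public void onAttemptStatusUpdate(String taskId, String attemptId, float progress) {
        attemptProgress.put(attemptId, progress);
        // compute() makes "create the set if missing, then add" atomic per task key
        runningAttempts.compute(taskId, (key, attempts) -> {
            Set<String> result = (attempts == null) ? ConcurrentHashMap.newKeySet() : attempts;
            result.add(attemptId);
            return result;
        });
    }

    /** Called when an attempt succeeds, fails, or is killed: this is the cleanup step. */
    public void onAttemptFinished(String taskId, String attemptId) {
        attemptProgress.remove(attemptId);
        // Returning null from computeIfPresent removes the task entry once it is empty
        runningAttempts.computeIfPresent(taskId, (key, attempts) -> {
            attempts.remove(attemptId);
            return attempts.isEmpty() ? null : attempts;
        });
    }

    /** Total number of map entries still held; useful for a leak-detecting test. */
    public int trackedEntries() {
        return attemptProgress.size() + runningAttempts.size();
    }

    public static void main(String[] args) {
        SpeculatorStateSketch state = new SpeculatorStateSketch();
        for (int i = 0; i < 1000; i++) {
            String taskId = "task_" + i;
            String attemptId = "attempt_" + i + "_0";
            state.onAttemptStatusUpdate(taskId, attemptId, 1.0f);
            state.onAttemptFinished(taskId, attemptId);
        }
        System.out.println("entries still tracked: " + state.trackedEntries());
    }
}
```

A unit test along these lines could assert that trackedEntries() returns to zero once every attempt has finished, which is one simple way to catch this kind of unbounded growth.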



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

