[
https://issues.apache.org/jira/browse/YARN-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939946#comment-13939946
]
Rajesh Balamohan commented on YARN-1775:
----------------------------------------
[~kkambatl]
- The ProcfsBasedProcessTree implementation relies on the /proc/<pid>/stat file
for computing the RSS of a process. The RSS reported by the "stat" file is not
very accurate: it does not account for whether some of a process's RAM is
shared with other processes, so every process counts shared pages in full.
This can cause problems for Hadoop applications that use features like mmap;
their containers can get killed by YARN even though they have not exceeded the
configured physical memory limits.
- The purpose of SMAPBasedProcessTree is to compute a realistic RSS (Resident
Set Size) for the process by looking at the memory mappings populated in the
/proc/<pid>/smaps file. It excludes the read-only shared memory mappings of
the process (i.e. r--s, r-xs) and computes the RSS as
RSS = PRIVATE_CLEAN + PRIVATE_DIRTY + min(SHARED_DIRTY, PSS), where
    - PRIVATE_CLEAN = pages that are mapped by the process and not modified
    - PRIVATE_DIRTY = pages that are mapped by the process and modified
    - SHARED_DIRTY = pages that are shared with other processes and modified
    - PSS = the count of all pages mapped uniquely by the process, plus a
fraction of each shared page, the fraction being proportional to the number of
processes which have mapped the page.
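The RSS formula above can be illustrated with a short sketch. The class, method
name, and values here are purely illustrative, not YARN's actual code; smaps
reports these counters per mapping, in kB:

```java
// Illustrative sketch of the per-mapping RSS formula:
// RSS = PRIVATE_CLEAN + PRIVATE_DIRTY + min(SHARED_DIRTY, PSS).
// Not the actual SMAPBasedProcessTree implementation.
public class SmapsRss {
    // All inputs are kilobyte counts for a single memory mapping,
    // as reported by /proc/<pid>/smaps.
    static long rssForMapping(long privateClean, long privateDirty,
                              long sharedDirty, long pss) {
        // Taking min(SHARED_DIRTY, PSS) caps the shared-dirty charge at this
        // process's proportional share of the shared pages.
        return privateClean + privateDirty + Math.min(sharedDirty, pss);
    }

    public static void main(String[] args) {
        // A mapping with 4 kB clean and 8 kB dirty private pages, plus a
        // 100 kB shared dirty region of which this process's PSS share is 25 kB.
        long rss = rssForMapping(4, 8, 100, 25);
        System.out.println(rss); // 4 + 8 + min(100, 25) = 37
    }
}
```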
It would be a good idea to restrict the implementations of
ResourceCalculatorProcessTree to Procfs-based, SMAP-based, and cgroup
(memory.stat) based. The SMAP implementation can be used in scenarios where
cgroups are not enabled.
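For reference, the counters used above come from /proc/<pid>/smaps, which
reports each field as a "Name:  N kB" line per mapping. A minimal parsing
sketch (illustrative only, not the actual SMAPBasedProcessTree parser):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of summing smaps counters across mappings.
// Field names follow the smaps format; this is not YARN's actual code.
public class SmapsParser {
    // Returns total kB per counter name, summed over all mappings in the text.
    static Map<String, Long> parseKb(String smapsText) {
        Map<String, Long> kb = new HashMap<>();
        for (String line : smapsText.split("\n")) {
            // Counter lines look like "Private_Dirty:        8 kB".
            // Mapping header lines (address range, permissions, path) are
            // skipped because they do not match this "name: N kB" shape.
            String[] parts = line.split(":");
            if (parts.length == 2 && parts[1].trim().endsWith("kB")) {
                String value = parts[1].trim().replace("kB", "").trim();
                kb.merge(parts[0].trim(), Long.parseLong(value), Long::sum);
            }
        }
        return kb;
    }

    public static void main(String[] args) {
        String sample =
            "Private_Clean:        4 kB\n" +
            "Private_Dirty:        8 kB\n" +
            "Shared_Dirty:       100 kB\n" +
            "Pss:                 25 kB\n";
        System.out.println(parseKb(sample).get("Private_Dirty")); // prints 8
    }
}
```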
[~cnauroth]
For testing, a large 1 GB file (also tried with a 1.5 GB file) was memory
mapped in the mapper, i.e. using
RandomAccessFile...getChannel().map(FileChannel.MapMode.READ_ONLY, 0,
fileSize).load();
- mapreduce.map.memory.mb and mapreduce.reduce.memory.mb were set to 2048 MB.
- Max heap size in mapred.map.child.java.opts and mapred.reduce.child.java.opts
was set to 1500 MB.
- io.sort.mb=600
- With ProcfsBasedProcessTree, the tasks were getting killed, as the RSS
computation included the mapped pages of the 1 GB file.
- With SMAPBasedProcessTree, the RSS computation did not include the mapped
pages and the tasks ran successfully. No performance degradation was observed.
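The mapping call used in the test can be written out as a self-contained
sketch. It uses a small temp file for illustration (the actual test used a
1 GB file); class and method names are assumptions, not the test's real code:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the mapper's memory-mapping step: map a file read-only and
// fault its pages into RAM with load(), as in the test above.
public class MmapLoad {
    // Maps the whole file and loads it; returns the mapped size in bytes.
    static long mapAndLoad(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            MappedByteBuffer buf = raf.getChannel()
                .map(FileChannel.MapMode.READ_ONLY, 0, raf.length());
            buf.load(); // brings the mapped pages into physical memory
            return buf.capacity();
        }
    }

    // Creates a small temp file, maps it, and returns the mapped size.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("mmap-demo", ".bin");
            Files.write(tmp, new byte[4096]); // 4 kB stand-in for the 1 GB file
            long mapped = mapAndLoad(tmp);
            Files.deleteIfExists(tmp);
            return mapped;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 4096
    }
}
```

Because the mapping is read-only, its pages show up in smaps as shared/clean
rather than private dirty, which is why the SMAP-based RSS excludes them.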
> Create SMAPBasedProcessTree to get PSS information
> --------------------------------------------------
>
> Key: YARN-1775
> URL: https://issues.apache.org/jira/browse/YARN-1775
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Minor
> Fix For: 2.5.0
>
> Attachments: yarn-1775-2.4.0.patch
>
>
> Create SMAPBasedProcessTree (by extending ProcfsBasedProcessTree), which will
> make use of PSS for computing the memory usage.
--
This message was sent by Atlassian JIRA
(v6.2#6252)