[ 
https://issues.apache.org/jira/browse/YARN-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939946#comment-13939946
 ] 

Rajesh Balamohan commented on YARN-1775:
----------------------------------------

[~kkambatl]
- The ProcfsBasedProcessTree implementation relies on the /proc/<pid>/stat file 
for computing the RSS of a process.  The RSS reported by the "stat" file is not 
very accurate: it does not account for whether any of the process's RAM is 
shared with other processes, so every process counts shared pages in full.  
This can cause problems for applications that use features like mmap in 
Hadoop; their containers can get killed by YARN even though they have not 
exceeded the configured physical memory limits.  
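The stat-based computation above can be sketched by extracting the 24th field 
(rss, in pages) from a /proc/<pid>/stat line.  The class name and sample line 
below are illustrative, not taken from the actual implementation:

```java
public class ProcStatRss {
    // Returns RSS in pages from a /proc/<pid>/stat line.  The comm field
    // (2nd) may contain spaces or parentheses, so we skip past the last ')'
    // before splitting on spaces.
    static long rssPages(String statLine) {
        int close = statLine.lastIndexOf(')');
        String[] rest = statLine.substring(close + 2).split(" ");
        // rest[0] is field 3 (state), so field 24 (rss) is rest[21].
        return Long.parseLong(rest[21]);
    }

    public static void main(String[] args) {
        // Illustrative stat line: 262144 pages mapped, i.e. 1 GB at 4 kB/page.
        String sample = "1234 (java) S 1 1234 1234 0 -1 4202496 "
            + "500 0 0 0 10 5 0 0 20 0 30 0 100 104857600 262144 "
            + "18446744073709551615";
        long pages = rssPages(sample);
        // RSS is reported in pages; multiply by the page size (typically
        // 4096) for bytes.
        System.out.println(pages * 4096L); // prints 1073741824
    }
}
```

Note that every mapped page of an mmapped file shows up in this count, which 
is exactly the inaccuracy described above.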
- The purpose of SMAPBasedProcessTree is to compute a realistic RSS (Resident 
Set Size) for the process by looking at the memory mappings listed in the 
/proc/<pid>/smaps file.  It excludes the read-only shared memory mappings of 
the process (i.e. r--s, r-xs) and computes the RSS as  RSS = 
PRIVATE_CLEAN + PRIVATE_DIRTY + Min(SHARED_DIRTY, PSS), where 
                - PRIVATE_CLEAN = pages that are mapped by the process and not 
modified
                - PRIVATE_DIRTY = pages that are mapped by the process and 
modified
                - SHARED_DIRTY = pages that are shared with other processes and 
modified
                - PSS = the count of all pages mapped uniquely by the process, 
plus each shared page divided by the number of processes which have mapped it.
It would be a good idea to restrict the implementations of 
ResourceCalculatorProcessTree to ProcfsBased, SmapBased, and Cgroup 
(memory.stat) based ones.  The SMAP implementation can be used in scenarios 
where cgroups are not enabled.
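A minimal sketch of the formula above, applied per smaps mapping.  The Mapping 
class and the per-mapping numbers are hypothetical (not the patch's actual 
types); field values are in kB, as /proc/<pid>/smaps reports them:

```java
public class SmapsRss {
    static final class Mapping {
        final String perms;          // e.g. "rw-p", "r--s"
        final long privateClean, privateDirty, sharedDirty, pss;
        Mapping(String perms, long pc, long pd, long sd, long pss) {
            this.perms = perms;
            this.privateClean = pc; this.privateDirty = pd;
            this.sharedDirty = sd; this.pss = pss;
        }
    }

    // Read-only shared mappings (r--s, r-xs) are excluded; every other
    // mapping contributes PRIVATE_CLEAN + PRIVATE_DIRTY
    // + min(SHARED_DIRTY, PSS).
    static long rssKb(Mapping[] mappings) {
        long total = 0;
        for (Mapping m : mappings) {
            if (m.perms.equals("r--s") || m.perms.equals("r-xs")) continue;
            total += m.privateClean + m.privateDirty
                   + Math.min(m.sharedDirty, m.pss);
        }
        return total;
    }

    public static void main(String[] args) {
        Mapping[] ms = {
            // Ordinary anonymous mapping: contributes 100 + 200 + 40.
            new Mapping("rw-p", 100, 200, 50, 40),
            // Read-only shared mapping of an mmapped 1 GB file: skipped.
            new Mapping("r--s", 0, 0, 1048576, 524288),
        };
        System.out.println(rssKb(ms)); // prints 340
    }
}
```

The second mapping shows why this helps mmap-heavy containers: its gigabyte of 
file-backed pages does not count toward the container's RSS.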

[~cnauroth]
For testing, a large 1 GB file (a 1.5 GB file was also tried) was memory 
mapped in the mapper, i.e. using 
RandomAccessFile...getChannel().map(FileChannel.MapMode.READ_ONLY, 0, 
fileSize).load();
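For reference, the elided setup around that call can be filled in as below; the 
temp file, its size, and the class name are assumptions standing in for the 
actual 1 GB test file:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapLoad {
    // Maps the whole file read-only, loads it into memory, and returns the
    // first byte.  The mapped pages appear in /proc/<pid>/smaps as a
    // read-only shared mapping, which SMAPBasedProcessTree excludes.
    static int firstMappedByte(File f) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            long fileSize = raf.length();
            MappedByteBuffer buf = raf.getChannel()
                .map(FileChannel.MapMode.READ_ONLY, 0, fileSize)
                .load();   // touch every page to fault it in
            return buf.get(0);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical small stand-in for the 1 GB test file.
        File f = File.createTempFile("mmap-demo", ".bin");
        f.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(new byte[]{42, 7, 7, 7});
        }
        System.out.println(firstMappedByte(f)); // prints 42
    }
}
```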
- mapreduce.map.memory.mb and mapreduce.reduce.memory.mb were set to 2048 MB. 
- Max heap size in mapred.map.child.java.opts and mapred.reduce.child.java.opts 
was set to 1500 MB.
- io.sort.mb=600
- With ProcfsBasedProcessTree, the tasks were getting killed, as the RSS 
computation included the mapped pages of the 1 GB file.
- With SMAPBasedProcessTree, the RSS computation did not include the mapped 
pages and the tasks ran successfully.  No performance degradation was 
observed.  
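The test settings above would look roughly like this in mapred-site.xml 
(property names as given in this comment; the -Xmx values are an assumed 
rendering of the 1500 MB heaps):

```xml
<!-- Sketch of the test configuration described above. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1500m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1500m</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>600</value>
</property>
```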

> Create SMAPBasedProcessTree to get PSS information
> --------------------------------------------------
>
>                 Key: YARN-1775
>                 URL: https://issues.apache.org/jira/browse/YARN-1775
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>             Fix For: 2.5.0
>
>         Attachments: yarn-1775-2.4.0.patch
>
>
> Create SMAPBasedProcessTree (by extending ProcfsBasedProcessTree), which will 
> make use of PSS for computing the memory usage. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
