[ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626529#comment-15626529
 ] 

Jason Lowe commented on YARN-5368:
----------------------------------

bq. Recently I noticed the same issue with the NodeManager when recovery is enabled. NM 
RES keeps growing, which makes resource localization slow.

We have not seen that on our clusters.  Three minutes is a _really_ long time.  
Do you have GC logging enabled for the NodeManager JVM?  It would be 
interesting to know if it was trying to run one or more GC cycles during that 
time.  If GC was not the cause, then I'm not sure how increased off-heap memory 
would directly contribute to slower resource localization unless the machine 
was near or at the point where it started swapping.
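
For reference, a minimal sketch of turning on GC logging for the NodeManager on a JDK 8 JVM. It assumes the options are appended in yarn-env.sh through YARN_NODEMANAGER_OPTS; the GC_LOG variable and the nm-gc.log file name are hypothetical, and the directory is simply the log directory shown in the report.

```shell
# Assumption: this fragment is added to yarn-env.sh on the NodeManager host.
# GC_LOG is a hypothetical name; point it at any writable location.
GC_LOG="/var/log/hadoop-yarn/yarn/nm-gc.log"

# Standard JDK 8 GC logging flags, appended to whatever options are already set.
export YARN_NODEMANAGER_OPTS="${YARN_NODEMANAGER_OPTS:-} -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:${GC_LOG}"
```

After a NodeManager restart, timestamps in that log can be lined up against the slow localization window to see whether collections were running at the time.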

As for the timeline server memory usage, it looks like the rolling LevelDB 
instances are starting to pile up, accumulating a lot of off-heap memory.  
Pinging [~jeagles] since I vaguely remember something like this occurring in 
the past, and there may be a known fix for that issue.
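
To put a number on the off-heap growth, something like the following could help. It is only a sketch: rss_over_heap_mib is a hypothetical helper, not part of Hadoop, and it assumes a ps that reports rss in KiB (as on Linux). It prints how many MiB of resident memory a process holds beyond its configured heap cap.

```shell
# Hypothetical helper: resident set size minus the configured Java heap, in MiB.
rss_over_heap_mib() {
  pid="$1"        # process id, e.g. the timeline server pid from top
  xmx_mib="$2"    # heap cap in MiB, e.g. 1024 for -Xmx1024m

  # "rss=" suppresses the header; strip padding so the value is a bare number.
  rss_kib=$(ps -o rss= -p "$pid" | tr -d ' ')

  # No output means the pid does not exist (or ps failed).
  [ -n "$rss_kib" ] || return 1

  echo $(( rss_kib / 1024 - xmx_mib ))
}
```

For example, `rss_over_heap_mib 90577 1024` would apply it to the process from the report. A large value that keeps climbing between samples would be consistent with memory accumulating outside the Java heap.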

> memory leak at timeline server
> ------------------------------
>
>                 Key: YARN-5368
>                 URL: https://issues.apache.org/jira/browse/YARN-5368
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>    Affects Versions: 2.7.1
>         Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>            Reporter: Wataru Yukawa
>
> Memory usage of the timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please look at the period since April.
> According to my investigation, the timeline server was using about 25GB.
> top command result
> {code}
> 90577 yarn      20   0 28.4g  25g  12m S  0.0 40.1   5162:53 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?        Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java 
> -Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir= 
> -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA 
> -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -Dyarn.policy.file=hadoop-policy.xml 
> -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver 
> -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop 
> -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -classpath 
> /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
>  
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, actual memory usage is 25GB.
> After I restarted the timeline server, memory usage of the timeline server 
> machine decreased.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now the timeline server uses less than 1GB of memory.
> top command result
> {code}
>  6163 yarn      20   0 3959m 783m  46m S  0.3  1.2   3:37.60 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
