[ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Naganarasimha G R updated YARN-3127:
------------------------------------
    Description:
1. Start RM with HA and ATS configured and run some YARN applications.
2. Once the applications have finished successfully, start the timeline server.
3. Fail over HA from active to standby.
4. Access the timeline server URL <IP>:<PORT>/applicationhistory

//Note: Earlier, an exception was thrown when this URL was accessed.
Incomplete information is shown in the ATS web UI, i.e. attempt, container and other information is not displayed.

Also, even if the timeline server is started together with the RM, on RM restart/recovery the ATS events for applications that already exist in ATS are resent, which is not required.

  was:
1. Start RM with HA and ATS configured and run some YARN applications.
2. Once the applications have finished successfully, start the timeline server.
3. Fail over HA from active to standby.
4. Access the timeline server URL <IP>:<PORT>/applicationhistory

Result: The application history URL fails with the error below.

{quote}
2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications.
java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
	at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
	at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
	...
Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_000001 doesn't exist in the timeline store
	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151)
	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499)
	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108)
	at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84)
	at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	... 51 more
2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory
org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
{quote}

Behaviour with AHS using the file-based history store:
- The application history URL works.
- No attempt entries are shown for any application.

Based on initial analysis, when the RM switches over, only applications are replayed from the state store; application attempts are not.
So when the /applicationhistory URL is accessed, it looks up every attempt ID and fails.


> Avoid timeline events during RM recovery or restart
> ---------------------------------------------------
>
>                 Key: YARN-3127
>                 URL: https://issues.apache.org/jira/browse/YARN-3127
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, timelineserver
>    Affects Versions: 2.6.0
>         Environment: RM HA with ATS
>            Reporter: Bibin A Chundatt
>            Assignee: Naganarasimha G R
>            Priority: Critical
>         Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch
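
The goal stated in the issue title, avoiding timeline events during RM recovery or restart, could be sketched as a guard in front of the publisher. What follows is a minimal, self-contained illustration under stated assumptions, not the attached patch: the classes InMemoryTimelineStore and GuardedPublisher, their methods, and the event label APP_CREATED are all hypothetical and do not correspond to the Hadoop API, and the second application ID is made up for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecoveryPublishSketch {

    /** Hypothetical stand-in for the timeline store: records every event it is sent. */
    static final class InMemoryTimelineStore {
        final List<String> receivedEvents = new ArrayList<>();

        void put(String event) {
            receivedEvents.add(event);
        }
    }

    /** Hypothetical publisher that drops events for applications still being recovered. */
    static final class GuardedPublisher {
        private final InMemoryTimelineStore store;
        private final Set<String> recoveringApps = new HashSet<>();

        GuardedPublisher(InMemoryTimelineStore store) {
            this.store = store;
        }

        /** Called when the RM starts replaying an application from its state store. */
        void markRecovering(String appId) {
            recoveringApps.add(appId);
        }

        /** Called once replay of the application has finished. */
        void markRecovered(String appId) {
            recoveringApps.remove(appId);
        }

        /** Sends the event unless the application is under recovery; returns true if sent. */
        boolean publish(String appId, String eventType) {
            if (recoveringApps.contains(appId)) {
                return false; // the store already holds this history, so do not resend it
            }
            store.put(appId + ":" + eventType);
            return true;
        }
    }

    public static void main(String[] args) {
        InMemoryTimelineStore store = new InMemoryTimelineStore();
        GuardedPublisher publisher = new GuardedPublisher(store);

        // Normal operation: the event reaches the store.
        publisher.publish("application_1422972608379_0001", "APP_CREATED");

        // Failover: the same application is replayed, and its event is suppressed.
        publisher.markRecovering("application_1422972608379_0001");
        publisher.publish("application_1422972608379_0001", "APP_CREATED");
        publisher.markRecovered("application_1422972608379_0001");

        // An application submitted after recovery is published as usual.
        publisher.publish("application_1422972608379_0002", "APP_CREATED");

        System.out.println(store.receivedEvents);
    }
}
```

The sketch keys the guard on the application ID so that only replayed applications are silenced. Whether the real fix should suppress per application, per event type, or for the whole recovery window is a design question this example does not settle.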