[
https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529606#comment-14529606
]
Naganarasimha G R commented on YARN-3127:
-----------------------------------------
Thanks for reviewing, [~gtCarrera9].
The main cause of the issue reported here has already been addressed in another
jira by [~xgong], but when we test this way we still see null in the web UI.
More importantly, this jira still needs to be addressed because events are
published for every app (start and finished) on RM failover. So if 10000 apps
are maintained, that many additional unnecessary events are triggered, and we
need to address this. As for the issue pointed out by [~xgong], I had asked for
suggestions on the approach being taken and am waiting for a reply.
AFAIK we need to ensure that the ATS events are sent first and only then store
the final application state in the RM state store in the FINAL_SAVING
transition (and also handle the other possible cases where an app is created
and killed before an attempt is created, in which case FINAL_SAVING is not
called). If this approach is fine, I will update the patch and the description.
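
The ordering proposed above can be sketched roughly as follows. This is only an
illustration under stated assumptions: FinalSavingSketch and its methods are
hypothetical stand-ins, not the real RMAppImpl, SystemMetricsPublisher or
RMStateStore code, and the log list exists only to make the ordering visible.

```java
import java.util.List;

/**
 * Illustrative sketch only. All names here are hypothetical and do not
 * correspond to actual YARN classes or methods.
 */
class FinalSavingSketch {
    // Records side effects in the order they happen.
    private final List<String> log;

    FinalSavingSketch(List<String> log) {
        this.log = log;
    }

    /** App reaches a final state: send the ATS event first, then persist. */
    void onAppFinished(String appId, String finalState) {
        log.add("ats:" + appId + ":" + finalState);   // 1. publish to ATS
        log.add("store:" + appId + ":" + finalState); // 2. save final state
    }

    /**
     * Decides, for an app recovered after RM failover, whether its finish
     * event is still pending. storedFinalState is null when no final state
     * was saved before the failover.
     */
    boolean finishEventStillPending(String storedFinalState) {
        // With the ordering above, a saved final state implies the ATS event
        // was already sent, so nothing needs to be republished on recovery.
        return storedFinalState == null;
    }

    /** Recovery path: no event is published for already-finished apps. */
    void onAppRecovered(String appId, String storedFinalState) {
        if (!finishEventStillPending(storedFinalState)) {
            return; // already published before the failover
        }
        // Otherwise the app is still running and will publish its finish
        // event later through onAppFinished.
    }
}
```

With this ordering, recovering 10000 finished apps would add nothing to the
log, since each of them already has a final state in the store. The case where
an app is killed before any attempt is created is not covered by this sketch.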
> Apphistory url crashes when RM switches with ATS enabled
> --------------------------------------------------------
>
> Key: YARN-3127
> URL: https://issues.apache.org/jira/browse/YARN-3127
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager, timelineserver
> Affects Versions: 2.6.0
> Environment: RM HA with ATS
> Reporter: Bibin A Chundatt
> Assignee: Naganarasimha G R
> Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch
>
>
> 1. Start RM with HA and ATS configured and run some YARN applications
> 2. Once the applications have finished successfully, start the timeline server
> 3. Now fail over HA from active to standby
> 4. Access the timeline server URL <IP>:<PORT>/applicationhistory
> Result: the application history URL fails with the info below
> {quote}
> 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications.
> java.lang.reflect.UndeclaredThrowableException
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
> at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80)
> at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
> at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
> at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> ...
> Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_000001 doesn't exist in the timeline store
> at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151)
> at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499)
> at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108)
> at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84)
> at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> ... 51 more
> 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory
> org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5
> at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
> {quote}
> Behaviour with AHS with file-based history store:
> - Apphistory URL is working
> - No attempt entries are shown for each application.
>
> Based on initial analysis, when the RM switches, application attempts from the
> state store are not replayed; only applications are.
> So when the /applicationhistory URL is accessed, it looks up every attempt id
> and fails.