[
https://issues.apache.org/jira/browse/YARN-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812459#comment-15812459
]
Naganarasimha G R commented on YARN-6054:
-----------------------------------------
Thanks for the patch [~raviprakashu],
bq. Also, as pointed out by Jason, (e.g. in the case of NM) graceful
degradation would be a very hard thing to achieve. More likely, the state is
corrupt and will cause undefined behavior.
Agreed, but maybe we can provide some kind of tool and a set of steps that can
be taken to overcome it, as we faced this once too. But I agree it's not within
this jira's scope!
The changes look good. I will wait for the Jenkins report and, if there are no
further comments, commit it tomorrow.
> TimelineServer fails to start when some LevelDb state files are missing.
> ------------------------------------------------------------------------
>
> Key: YARN-6054
> URL: https://issues.apache.org/jira/browse/YARN-6054
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha2
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: YARN-6054.01.patch, YARN-6054.02.patch,
> YARN-6054.03.patch
>
>
> We encountered an issue recently where the TimelineServer failed to start
> because some state files went missing.
> {code}
> 2016-11-21 20:46:43,134 INFO org.apache.hadoop.service.AbstractService:
> Service
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> failed in state INITED; cause:
> org.apache.hadoop.service.ServiceStateException:
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9
> missing files; e.g.:
> <levelDbStorePath>/timelineserver/leveldb-timeline-store.ldb/127897.sst
> org.apache.hadoop.service.ServiceStateException:
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9
> missing files; e.g.:
> <levelDbStorePath>/timelineserver/leveldb-timeline-store.ldb/127897.sst
> 2016-11-21 20:46:43,135 FATAL
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
> Error starting ApplicationHistoryServer
> org.apache.hadoop.service.ServiceStateException:
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9
> missing files; e.g.:
> <levelDbStorePath>/timelineserver/leveldb-timeline-store.ldb/127897.sst
> at
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
> at
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:172)
> at
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:182)
> Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException:
> Corruption: 9 missing files; e.g.:
> <levelDbStorePath>/timelineserver/leveldb-timeline-store.ldb/127897.sst
> at
> org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
> at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
> at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
> at
> org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.serviceInit(LeveldbTimelineStore.java:229)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> ... 5 more
> 2016-11-21 20:46:43,136 INFO org.apache.hadoop.util.ExitUtil: Exiting with
> status -1
> {code}
> Ideally we shouldn't have any missing state files. However I'd posit that the
> TimelineServer should have graceful degradation instead of failing to start
> at all.
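
One way to read "graceful degradation" here is a repair-then-retry step around the store open. The following is only a minimal sketch of that idea, not the attached patch: StoreFactory, FlakyFactory and openWithRepair are hypothetical names standing in for a LevelDB-style factory, and the real store may need extra steps (for example a backup before repair, since a repair may not recover every entry).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RepairOnOpenSketch {

    /** Hypothetical stand-in for a LevelDB-style factory; not the leveldbjni API. */
    interface StoreFactory {
        String open(String path) throws IOException;   // returns an opaque handle
        void repair(String path) throws IOException;   // best-effort salvage
    }

    /**
     * Tries to open the store; on failure, attempts one repair and retries.
     * A failure of the repair or of the second open propagates to the caller.
     */
    static String openWithRepair(StoreFactory factory, String path, List<String> log)
            throws IOException {
        try {
            return factory.open(path);
        } catch (IOException first) {
            log.add("open failed: " + first.getMessage());
            factory.repair(path);
            log.add("repair attempted");
            return factory.open(path);
        }
    }

    /** Fake factory for the demo: open() fails until repair() has been called. */
    static final class FlakyFactory implements StoreFactory {
        private boolean repaired = false;

        @Override
        public String open(String path) throws IOException {
            if (!repaired) {
                throw new IOException("Corruption: missing files");
            }
            return "handle:" + path;
        }

        @Override
        public void repair(String path) {
            repaired = true;
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> log = new ArrayList<>();
        String handle = openWithRepair(new FlakyFactory(), "leveldb-timeline-store.ldb", log);
        System.out.println(handle);
        log.forEach(System.out::println);
    }
}
```

Retrying only once keeps the failure mode unchanged when the repair does not help: the second open throws and the service still fails to start, as it does today.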
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)