[
https://issues.apache.org/jira/browse/YARN-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395762#comment-15395762
]
Junping Du commented on YARN-5432:
----------------------------------
bq. So, instead of using refcounts and handle corner cases like the one
reported here, another solution is to insist on the app cache size.
+1. KISS (keep it simple & stupid) rules still works here.
003 patch LGTM. Will commit it later today if not further comments from others.
> Lock already held by another process while LevelDB cache store creation for
> dag
> -------------------------------------------------------------------------------
>
> Key: YARN-5432
> URL: https://issues.apache.org/jira/browse/YARN-5432
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Karam Singh
> Assignee: Li Lu
> Attachments: YARN-5432-trunk.001.patch, YARN-5432-trunk.002.patch,
> YARN-5432-trunk.003.patch
>
>
> While running ATS stress tests, 15 concurrent ATS reads (python thread
> which gives ws/v1/time/TEZ_DAG_ID,
> ws/v1/time/TEZ_VERTEX_DI?primaryFilter=TEZ_DAG_ID:<dag_id> etc) calls.
> Note: Summary store for ATSv1.5 is RLD, but as we for each dag/application
> ATS also creates leveldb cache when vertex/task/taskattempts information is
> queried from ATS.
>
> Getting following type of excpetion very frequently in ATS logs :-
> 2016-07-23 00:01:56,089 [1517798697@qtp-1198158701-850] INFO
> org.apache.hadoop.service.AbstractService: Service
> LeveldbCache.timelineEntityGroupId_1469090881194_4832_application_1469090881194_4832
> failed in state INITED; cause:
> org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock
> /grid/4/yarn_ats/atsv15_rld/timelineEntityGroupId_1469090881194_4832_application_1469090881194_4832-timeline-cache.ldb/LOCK:
> already held by process
> org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock
> /grid/4/yarn_ats/atsv15_rld/timelineEntityGroupId_1469090881194_4832_application_1469090881194_4832-timeline-cache.ldb/LOCK:
> already held by process
> at
> org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
> at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
> at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
> at
> org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore.serviceInit(LevelDBCacheTimelineStore.java:108)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at
> org.apache.hadoop.yarn.server.timeline.EntityCacheItem.refreshCache(EntityCacheItem.java:113)
> at
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getCachedStore(EntityGroupFSTimelineStore.java:1021)
> at
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getTimelineStoresFromCacheIds(EntityGroupFSTimelineStore.java:936)
> at
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getTimelineStoresForRead(EntityGroupFSTimelineStore.java:989)
> at
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1041)
> at
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
> at
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
> at
> org.apache.hadoop.yarn.server.timeline.webapp.TimelineWebServices.getEntities(TimelineWebServices.java:117)
> at sun.reflect.GeneratedMethodAccessor82.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
> at
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
> at
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
> at
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
> at
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
> at
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
> at
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
> at
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
> at
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
> at
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
> at
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
> at
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
> at
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
> at
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
> at
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
> at
> com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
> at
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
> at
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
> at
> com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
> at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]