[
https://issues.apache.org/jira/browse/YARN-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Li Lu updated YARN-4265:
------------------------
Attachment: YARN-4265-trunk.002.patch
Thanks [~djp] for the review! In the 002 patch I addressed most of the
checkstyle problems, as well as most existing comments. Please feel free to add
more. Some comments:
bq. I noticed that we are setting 1 minute as the default scan interval but the
original patch in YARN-3942 uses 5 minutes. Why shall we do any update here?
For now I increased the default frequency of scanning HDFS and pulling timeline
data. With a 5-minute interval, users are unlikely to see any running status
for apps that finish within 5 minutes. Right now I'm setting this value to
1 minute to reduce the reader's reaction time.
bq. The same question on "app-cache-size": the default value in YARN-3942 is 5
but here it is 10. Any reason to update the value?
In YARN-3942, caching is performed at the application level. In this patch,
caching is performed per entity group. Each application may have a few to tens
of entity groups. Normally, there are slightly more active entity groups than
active applications in the system. For now, I'm increasing this default value
to hold slightly more entity groups in cache.
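As a rough illustration of why the size matters, here is a minimal sketch of a
bounded, least-recently-used cache keyed by entity group id. It is not the code
in the patch: the class name EntityGroupCacheSketch and its constructor
parameter are hypothetical, and it only assumes that the cache evicts the
least recently used group once the configured size is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a bounded LRU cache keyed by entity group id. */
class EntityGroupCacheSketch<V> extends LinkedHashMap<String, V> {
    private final int maxGroups;

    EntityGroupCacheSketch(int maxGroups) {
        // accessOrder = true keeps entries ordered from least to most
        // recently accessed, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.maxGroups = maxGroups;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Called after each insertion; returning true drops the least
        // recently used entity group.
        return size() > maxGroups;
    }
}
```

With one cache slot per entity group rather than per application, a limit of
5 fills up after only a handful of groups, so a slightly larger default
such as 10 keeps more active groups resident.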
bq. Why don't we have any default value specified for the property
"yarn.timeline-service.entity-group-fs-store.group-id-plugin-classes"?
Plugins are provided by third-party applications such as Tez. Right now we
cannot assume which exact entity group plugin the user is using, so we have to
conservatively leave this config empty.
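To make the empty default concrete, here is a small sketch of how a
comma-separated class list could be read so that an unset or empty value simply
means "no plugins". The class PluginClassListSketch and its parse method are
hypothetical and are not taken from the patch; the sketch only assumes the
property holds a comma-separated list of class names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: reading a comma-separated plugin class list. */
final class PluginClassListSketch {
    private PluginClassListSketch() {
    }

    /** Returns the configured class names; null or blank yields an empty list. */
    static List<String> parse(String configured) {
        List<String> names = new ArrayList<>();
        if (configured == null) {
            return names;
        }
        for (String part : configured.split(",")) {
            String name = part.trim();
            // Skip empty fragments such as the one produced by an empty value
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

Under this reading, a deployment that uses a plugin would list that plugin's
class name explicitly, and everyone else gets an empty list.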
bq. For EmptyTimelineEntityGroupPlugin.java, why do we need this class? I
didn't see it provide any help, even in tests. We should remove it if useless.
Ah, nice catch. Removed it.
bq. Can we optimize the synchronization logic here? For example, in the
synchronized method refreshCache, we initialize/start/stop TimelineDataManager
(and MemoryTimelineStore), which is quite expensive and unnecessarily blocks
other synchronized operations. Shall we move these operations out of the
synchronized block?
It's certainly doable. I have not optimized this part yet because it's a
little bit tricky to fine-tune synchronization performance before we have a
relatively stable starting point. Also, we're using fine-grained locking for
each cached item in the reader cache, and cache refresh only happens
infrequently (~10 secs by default), so maybe we'd like to stabilize the whole
synchronization story before fine-tuning this part?
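For reference, the kind of change being suggested could look roughly like the
sketch below: build the replacement store without holding the lock, and hold
the lock only for the reference swap. This is not the code in the patch. The
class CacheItemSketch is hypothetical, and a plain HashMap stands in for
TimelineDataManager/MemoryTimelineStore.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-item locking with the rebuild done unlocked. */
final class CacheItemSketch {
    private final Object lock = new Object();
    // Stand-in for the in-memory store backing this cached item
    private Map<String, String> store = new HashMap<>();

    /** Rebuilds the store outside the lock, then swaps it in under the lock. */
    void refresh(Map<String, String> freshData) {
        // Expensive part (stand-in for init/start of a new store): no lock held
        Map<String, String> rebuilt = new HashMap<>(freshData);
        Map<String, String> old;
        synchronized (lock) {
            // Only the reference swap is guarded, so readers block briefly
            old = store;
            store = rebuilt;
        }
        // Stand-in for stopping the old store, again without the lock
        old.clear();
    }

    /** Reads one value; the whole read happens under the item's lock. */
    String get(String key) {
        synchronized (lock) {
            return store.get(key);
        }
    }
}
```

The sketch leaves out the hard parts, such as concurrent refreshes of the same
item and readers that need a consistent view across several calls, which is
why it seems safer to settle the overall locking design first.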
> Provide new timeline plugin storage to support fine-grained entity caching
> --------------------------------------------------------------------------
>
> Key: YARN-4265
> URL: https://issues.apache.org/jira/browse/YARN-4265
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Li Lu
> Assignee: Li Lu
> Attachments: YARN-4265-trunk.001.patch, YARN-4265-trunk.002.patch,
> YARN-4265.YARN-4234.001.patch, YARN-4265.YARN-4234.002.patch
>
>
> To support the newly proposed APIs in YARN-4234, we need to create a new
> plugin timeline store. The store may have similar behavior to the
> EntityFileTimelineStore proposed in YARN-3942, but cache data at cache id
> granularity, instead of application id granularity. Let's have this storage
> as a standalone one, instead of updating EntityFileTimelineStore, to keep the
> existing store (EntityFileTimelineStore) stable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)