[
https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149562#comment-15149562
]
Li Lu commented on YARN-4696:
-----------------------------
Thanks for the work [[email protected]]! My main question is: what is the
assumed use case for the "non-RM" mode of the reader, other than unit tests? If
it's only for unit tests, is there any way we can clearly restrict this?
IIUC, if the reader is detached from the RM, every app's state will be unknown
and will eventually be marked completed. That status is not accurate, however,
because it only reflects the unknownActiveMillis timeout expiring.
For unit tests, is it possible to have a mock RM do the same job? If that is
too much trouble then having this looks fine, but we need to clearly
restrict the use case.
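To make the timeout concern concrete, here is a minimal sketch of the behaviour
I am describing. It is not the code in the patch: the class, the RmProbe
interface and the resolve() method are hypothetical names made up for
illustration, and only unknownActiveMillis comes from the store. It assumes the
reader falls back to the age of the log when no RM answer is available, and it
also shows how a stub probe could stand in for the RM in a unit test.

```java
import java.util.Optional;

public class AppStateSketch {
    enum AppState { ACTIVE, COMPLETED, UNKNOWN }

    /** Hypothetical stand-in for an RM lookup; empty means no RM is available. */
    interface RmProbe {
        Optional<AppState> probe(String appId);
    }

    /**
     * Resolve an app's state. Without an RM answer the app stays UNKNOWN until
     * its log has been idle for longer than unknownActiveMillis, after which it
     * is assumed to be COMPLETED. That result is a guess based on age only.
     */
    static AppState resolve(String appId, RmProbe rm, long lastModifiedMillis,
                            long nowMillis, long unknownActiveMillis) {
        Optional<AppState> fromRm = rm.probe(appId);
        if (fromRm.isPresent()) {
            return fromRm.get();
        }
        long idleMillis = nowMillis - lastModifiedMillis;
        return idleMillis > unknownActiveMillis ? AppState.COMPLETED : AppState.UNKNOWN;
    }

    public static void main(String[] args) {
        RmProbe noRm = appId -> Optional.empty();               // "non-RM" mode
        RmProbe mockRm = appId -> Optional.of(AppState.ACTIVE); // stub for unit tests
        long now = 1_000_000L;
        long unknownActiveMillis = 60_000L;

        // Recently written log, no RM: still UNKNOWN.
        System.out.println(resolve("app_1", noRm, now - 10_000L, now, unknownActiveMillis));
        // Idle past the timeout, no RM: reported COMPLETED even if the app is alive.
        System.out.println(resolve("app_1", noRm, now - 120_000L, now, unknownActiveMillis));
        // Same idle log, but the stub RM says the app is still running.
        System.out.println(resolve("app_1", mockRm, now - 120_000L, now, unknownActiveMillis));
    }
}
```

The second call is the case I am worried about: the app is reported as
completed purely because of the timeout, while the third call shows that a stub
probe is enough to give a test an accurate answer.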
nits:
- Line 46, EntityGroupFSTimelineStore: I think we'd prefer to avoid wildcard
imports (import *)?
- There is a findbugs warning about an inconsistent synchronization condition
for LevelDBCacheTimelineStore, where we may want to synchronize on the
constructor? This is an unrelated failure, so feel free to skip it. However, if
you happen to have time, a quick fix would also be helpful.
[~xgong] to double check the logic on the writer side. Exception handling looks
fine but I would like to double check the logic on the flush.
> EntityGroupFSTimelineStore to work in the absence of an RM
> ----------------------------------------------------------
>
> Key: YARN-4696
> URL: https://issues.apache.org/jira/browse/YARN-4696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Attachments: YARN-4696-001.patch
>
>
> {{EntityGroupFSTimelineStore}} now depends on an RM being up and running, and
> on the configuration pointing to it. This is a new change, and it impacts
> testing, where you have historically been able to test without an RM running.
> The sole purpose of the probe is to automatically determine if an app is
> running; it falls back to "unknown" if not. If the RM connection was
> optional, the "unknown" codepath could be called directly, relying on the age
> of the file as a metric of completion.
> Options
> # add a flag to disable RM connect
> # skip automatically if RM not defined/set to 0.0.0.0
> # disable retries on yarn client IPC; if it fails, tag app as unknown.