[ 
https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149562#comment-15149562
 ] 

Li Lu commented on YARN-4696:
-----------------------------

Thanks for the work [[email protected]]! My main question is: what is the 
assumed use case for the "non-RM" mode of the reader, other than unit tests? If 
it's only for unit tests, is there any way we can clearly restrict it to that? 
IIUC, if the reader is detached from the RM, all app states will be unknown and 
the apps will eventually be marked completed. However, that status is not 
accurate, because it's only the result of a timeout from unknownActiveMillis. 
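
To make the concern concrete, here is a minimal sketch of the behaviour as I 
understand it. It is not the code from the patch: the class name, the method 
classifyWithoutRM and its parameters are hypothetical, and only the 
unknownActiveMillis timeout is taken from the store. Without an RM there is 
nothing to confirm that an app is still running, so the only signal left is 
how long ago its files were last modified.

```java
// Hypothetical illustration only; not the EntityGroupFSTimelineStore code.
class UnknownAppTimeout {
    // Assumed set of states for this sketch.
    enum AppState { ACTIVE, UNKNOWN, COMPLETED }

    /**
     * Classify an app when no RM can be queried.
     * The result is never ACTIVE: an app is UNKNOWN until its files have
     * been idle for longer than unknownActiveMillis, then COMPLETED.
     */
    static AppState classifyWithoutRM(long lastModifiedMillis,
                                      long nowMillis,
                                      long unknownActiveMillis) {
        long idleMillis = nowMillis - lastModifiedMillis;
        return idleMillis > unknownActiveMillis
            ? AppState.COMPLETED
            : AppState.UNKNOWN;
    }

    public static void main(String[] args) {
        long timeout = 60_000L; // assumed value, for illustration only
        System.out.println(classifyWithoutRM(0L, 30_000L, timeout));
        System.out.println(classifyWithoutRM(0L, 90_000L, timeout));
    }
}
```

In this sketch an app that is still running but has not written anything for 
longer than the timeout would be reported as completed, which is the 
inaccuracy I am worried about. 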

For unit tests, is it possible to have a mock RM do the same job? If that is 
too much trouble then having this looks fine, but we need to clearly restrict 
the use case. 

nits:
- Line 46, EntityGroupFSTimelineStore: I think we'd prefer to avoid wildcard 
imports (import \*)? 
- There is a findbugs warning about an inconsistent synchronization condition 
for LevelDBCacheTimelineStore, where we may want to synchronize on the 
constructor? This is an unrelated failure, so feel free to skip it. However, if 
you happen to have time, a quick fix would also be helpful. 

[~xgong] to double check the logic on the writer side. Exception handling looks 
fine but I would like to double check the logic on the flush. 

> EntityGroupFSTimelineStore to work in the absence of an RM
> ----------------------------------------------------------
>
>                 Key: YARN-4696
>                 URL: https://issues.apache.org/jira/browse/YARN-4696
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>         Attachments: YARN-4696-001.patch
>
>
> {{EntityGroupFSTimelineStore}} now depends on an RM being up and running, and 
> on the configuration pointing to it. This is a new change, and impacts testing 
> where you have historically been able to test without an RM running.
> The sole purpose of the probe is to automatically determine if an app is 
> running; it falls back to "unknown" if not. If the RM connection were 
> optional, the "unknown" codepath could be called directly, relying on the age 
> of the file as a measure of completion.
> Options
> # add a flag to disable RM connect
> # skip automatically if RM not defined/set to 0.0.0.0
> # disable retries on yarn client IPC; if it fails, tag app as unknown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
