[ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984945#comment-13984945
 ] 

Wangda Tan commented on YARN-1885:
----------------------------------

[~jianhe], Thanks for your review!
bq. some places exceed the 80 column limit, like the RMAppImpl transitions.
Will correct this later
bq. app.isAppFinalStateStored() better use isAppInFinalState instead ?
Agreed, using isAppFinalStateStored() here is a bug.
bq. sleeping for a fixed amount time is not deterministic, test may fail 
randomly. it’s better doing it in a while loop with heartbeats, and exit out of 
the loop if condition meets.
Agree
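
A minimal sketch of the loop suggested above, not the code in the patch. The 
class WaitUtil and its waitFor method are hypothetical names; "heartbeat" 
stands in for whatever the test uses to send an NM heartbeat. It assumes Java 8 
for the functional interfaces.

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
  /**
   * Sends heartbeats until the condition holds or the timeout expires.
   * Returns true if the condition was met, false on timeout or interrupt.
   * (Hypothetical helper, for illustration only.)
   */
  public static boolean waitFor(BooleanSupplier condition, Runnable heartbeat,
      long intervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up: condition never became true
      }
      heartbeat.run(); // e.g. an NM heartbeat in the test
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve interrupt status
        return false;
      }
    }
    return true;
  }
}
```

The test then asserts on the return value, so it exits as soon as the condition 
is met and only fails after the full timeout.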
bq. timeout = 600000, timeout too long.
Sorry for this typo :)
bq. these two transitions cannot happen? Generally, we should not add events to 
states where the transitions can never happen, that’ll hide bugs.
Agreed. I think the SUBMITTED state cannot happen either, because an app in 
the SUBMITTED state has not launched any containers, so NMs will not have the 
app in their runningApplication list. Do you agree? 
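
To illustrate the point about not registering transitions that can never 
happen, here is a toy sketch. It is not YARN's StateMachineFactory or the 
RMAppImpl code; StrictTransitions, its states, and its event are hypothetical 
names chosen for the example. An event arriving in an unregistered state fails 
loudly instead of being silently accepted.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

public class StrictTransitions {
  // Hypothetical, simplified states and event for illustration only.
  public enum State { SUBMITTED, RUNNING, FINISHED }
  public enum Event { APP_RUNNING_ON_NODE }

  private final Map<State, EnumSet<Event>> allowed =
      new EnumMap<>(State.class);

  public StrictTransitions() {
    // Register the event only for states in which a container may have run.
    // SUBMITTED is deliberately left out.
    allowed.put(State.RUNNING, EnumSet.of(Event.APP_RUNNING_ON_NODE));
    allowed.put(State.FINISHED, EnumSet.of(Event.APP_RUNNING_ON_NODE));
  }

  /** Throws if the event is not registered for the current state. */
  public void handle(State current, Event event) {
    EnumSet<Event> events = allowed.get(current);
    if (events == null || !events.contains(event)) {
      throw new IllegalStateException(
          "Invalid event " + event + " at " + current);
    }
    // A real state machine would run the transition hook here.
  }
}
```

If the event ever does arrive at SUBMITTED, the exception exposes the bug 
rather than hiding it behind a no-op transition.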
bq. These two loops may block the register RPC call for a while, I think we may 
send them as the payload of RMNodeStartEvent and handle them in 
RMNodeAddTransition ?
IMO, this shouldn't be a big problem, because there are no blocking calls in 
handleRunningAppOnNode/handleContainerStatus, so the additional microseconds 
of latency (just looping over an array) should be fine. Do you agree?
Attached a new patch.

> RM may not send the finished signal to some nodes where the application ran 
> after RM restarts
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-1885
>                 URL: https://issues.apache.org/jira/browse/YARN-1885
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Wangda Tan
>         Attachments: YARN-1885.patch, YARN-1885.patch, YARN-1885.patch
>
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM 
> was also being restarted in the background while the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
