Jian He commented on YARN-1365:

Thanks for updating the patch

1.how about UnregisteredApplicationMasterException -> 
ApplicationMasterNotRegisteredException ?
please also add comments that this exception can happen even if application has 
registered before because RM may have restarted and the expectation to handle 
this exception is to re-register.

2.This newly added constructor is not used anywhere?  we can just use 
“app.handler.handle” to send the scheduler event in RMAppRecoverdTransition 
instead of refactoring the transition. 
    public void transition(RMAppImpl app, RMAppEvent event,
                           boolean shouldSchedulerNotifyAppAdded) {
      transitionImplementation(app, event, shouldSchedulerNotifyAppAdded);

3. the following code format in FifoScheduler can be consolidated to 2 lines. 
  public synchronized void addApplication(ApplicationId applicationId,
                                          String queue, String user, boolean
      shouldNotifyAppAccepted) {

4. some minor comments on testRMRestartWorkPreservingAppReregister:
this conf.set is not needed, it’s already enabled globally.
We can use MockRM.launchAndRegisterAM instead of changing 
TestRMRestart.launchAM to be static
MockAM am0 = TestRMRestart.launchAM(app0, rm1, nm1);
If using the global variable rm1,rm2, the following two statements are not 

> ApplicationMasterService to allow Register and Unregister of an app that was 
> running before restart
> ---------------------------------------------------------------------------------------------------
>                 Key: YARN-1365
>                 URL: https://issues.apache.org/jira/browse/YARN-1365
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1365.001.patch, YARN-1365.002.patch, 
> YARN-1365.003.patch, YARN-1365.004.patch, YARN-1365.005.patch, 
> YARN-1365.005.patch, YARN-1365.006.patch, YARN-1365.initial.patch
> For an application that was running before restart, the 
> ApplicationMasterService currently throws an exception when the app tries to 
> make the initial register or final unregister call. These should succeed and 
> the RMApp state machine should transition to completed like normal. 
> Unregistration should succeed for an app that the RM considers complete since 
> the RM may have died after saving completion in the store but before 
> notifying the AM that the AM is free to exit.

This message was sent by Atlassian JIRA

Reply via email to