[
https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645288#comment-13645288
]
Vinod Kumar Vavilapalli commented on YARN-599:
----------------------------------------------
Hm, it isn't straightforward to tell whether failures during
RMAppManager.submitApplication() are properly recorded in the audit logs. But
they are; I just verified.
The latest patch looks good to me. +1, checking it in.
> Refactoring submitApplication in ClientRMService and RMAppManager
> -----------------------------------------------------------------
>
> Key: YARN-599
> URL: https://issues.apache.org/jira/browse/YARN-599
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Zhijie Shen
> Assignee: Zhijie Shen
> Attachments: YARN-599.1.patch, YARN-599.2.patch
>
>
> Currently, ClientRMService#submitApplication calls RMAppManager#handle, which
> in turn calls RMAppManager#submitApplication directly, even though the code
> looks as if it schedules an APP_SUBMIT event.
> In addition, the validation code that runs before an RMApp instance is
> created is not well organized. Ideally, the dynamic validation, which depends
> on the RM's configuration, should be put in RMAppManager#submitApplication.
> RMAppManager#submitApplication is called by both
> ClientRMService#submitApplication and RMAppManager#recover. Since the
> configuration may change after the RM restarts, this validation needs to be
> done again even in recovery mode. Therefore, resource request validation,
> which is based on the min/max resource limits, should be moved from
> ClientRMService#submitApplication to RMAppManager#submitApplication. On the
> other hand, the static validation, which is independent of the RM's
> configuration, should be put in ClientRMService#submitApplication, because it
> only needs to be done once, during the first submission.
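The split described above can be sketched roughly as follows. This is only an
illustration, not the code in the patch: Submission, Limits, and the method
names are hypothetical stand-ins, and the "non-empty application name" check is
an invented example of a configuration-independent rule.

```java
// Hypothetical sketch of the static/dynamic validation split; none of these
// types are the real YARN classes.
class ValidationSplitSketch {

    // Stand-in for a submission request.
    static final class Submission {
        final String appName;
        final int memoryMb;

        Submission(String appName, int memoryMb) {
            this.appName = appName;
            this.memoryMb = memoryMb;
        }
    }

    // Stand-in for RM limits, which may differ after a restart.
    static final class Limits {
        final int minMb;
        final int maxMb;

        Limits(int minMb, int maxMb) {
            this.minMb = minMb;
            this.maxMb = maxMb;
        }
    }

    // Static validation: independent of configuration, so it runs only on
    // the client-facing path (the first submission).
    static void validateStatic(Submission s) {
        if (s.appName == null || s.appName.isEmpty()) {
            throw new IllegalArgumentException("application name is required");
        }
    }

    // Dynamic validation: depends on the current limits, so it runs on
    // every path, including recovery.
    static void validateDynamic(Submission s, Limits limits) {
        if (s.memoryMb < limits.minMb || s.memoryMb > limits.maxMb) {
            throw new IllegalArgumentException(
                "requested memory " + s.memoryMb + " MB is outside ["
                    + limits.minMb + ", " + limits.maxMb + "]");
        }
    }

    // Plays the role of the client-facing submit: static checks, then delegate.
    static void clientSubmit(Submission s, Limits limits) {
        validateStatic(s);
        managerSubmit(s, limits);
    }

    // Plays the role of the shared submit reached from both entry points.
    static void managerSubmit(Submission s, Limits limits) {
        validateDynamic(s, limits);
        // ... create and register the application here ...
    }

    // Plays the role of recovery: static checks are skipped, dynamic ones rerun.
    static void recover(Submission s, Limits limits) {
        managerSubmit(s, limits);
    }
}
```

With this layout, a submission accepted under the old limits is checked again
against whatever limits are in effect when recover runs.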
> Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw.
> RMAppManager#submitApplication is not synchronized. If two application
> submissions with the same application ID enter the function, one may only
> have finished instantiating its RMApp while the other has already put its
> RMApp instance into rmContext. The slower submission will then cause an
> exception due to the duplicate application ID. However, with the current code
> flow, that exception causes the RMApp instance already in rmContext (which
> belongs to the faster submission) to be rejected.
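One way to keep a duplicate submission from affecting the instance that is
already registered is an atomic insert-if-absent. The sketch below assumes a
plain ConcurrentMap keyed by a string ID; the App class and the submit method
are hypothetical, and this is not necessarily how the patch handles it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a duplicate submission fails on its own and never
// touches the entry that the faster submission already registered.
class DuplicateSubmissionSketch {

    // Stand-in for an application instance.
    static final class App {
        final String id;
        final String submitter;
        volatile boolean rejected;

        App(String id, String submitter) {
            this.id = id;
            this.submitter = submitter;
        }
    }

    private final ConcurrentMap<String, App> apps = new ConcurrentHashMap<>();

    // Returns true if this call registered the application, false if the ID
    // was already present.
    boolean submit(String id, String submitter) {
        App candidate = new App(id, submitter);
        // putIfAbsent is atomic: exactly one of two racing callers sees null.
        App existing = apps.putIfAbsent(id, candidate);
        if (existing != null) {
            // Duplicate ID: fail only this submission. The error path never
            // looks the ID up again, so `existing` is left untouched.
            return false;
        }
        return true;
    }

    App get(String id) {
        return apps.get(id);
    }
}
```

The design point is that the failing path holds a reference only to its own
candidate, so it cannot reject an instance it did not create.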