[ https://issues.apache.org/jira/browse/YARN-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127102#comment-16127102 ]
Manikandan R commented on YARN-65:
----------------------------------
[~rohithsharma] [~bibinchundatt] [~Naganarasimha] Thanks for taking a closer
look and for the suggestions.
Since the ACLs are already stored in {{ApplicationACLManager}} as part of
{{RMAppManager#createAndPopulateNewRMApp}}, we are setting {{AMContainerSpec}}
to null; the attached patch does this. With that change, test cases using
{{MemoryRMStateStore}} failed with an NPE during the recovery process.
Stack trace:
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:432)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:347)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:537)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1403)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:767)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1156)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1196)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
To fix this NPE and get these test cases passing, the tests now save the
{{AMContainerSpec}} from {{MemoryRMStateStore}} after the app is submitted to
the running RM and restore it into {{MemoryRMStateStore}} before the RM is
started again. The attached patch contains these test case changes as well.
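
To illustrate the failure and the test-side workaround, here is a minimal,
self-contained sketch. It is not the patch itself and uses no Hadoop classes:
every name in it ({{AmSpecRecoverySketch}}, {{ContainerSpec}},
{{SubmissionContext}}, {{InMemoryStore}}, {{AclManager}}, {{releaseSpec}},
{{runScenario}}) is hypothetical. It also assumes that the in-memory store
holds the same object reference as the running RM, and that the spec is
released at some point after the ACLs have been stored; the exact timing in
the patch may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the recovery NPE and the test workaround. */
final class AmSpecRecoverySketch {

    /** Stand-in for the AM container spec; only the ACLs are modeled. */
    static final class ContainerSpec {
        final Map<String, String> acls;
        ContainerSpec(Map<String, String> acls) { this.acls = acls; }
    }

    /** Stand-in for the submission context whose spec may be released. */
    static final class SubmissionContext {
        ContainerSpec amContainerSpec;
        SubmissionContext(ContainerSpec spec) { this.amContainerSpec = spec; }
    }

    /** Stand-in for an in-memory state store (assumed to share references). */
    static final class InMemoryStore {
        final Map<String, SubmissionContext> apps = new HashMap<>();
    }

    /** Stand-in for the component that keeps the ACLs per application. */
    static final class AclManager {
        final Map<String, Map<String, String>> aclsByApp = new HashMap<>();
    }

    /** Models app creation: the ACLs are copied out of the spec. */
    static void createApp(String appId, SubmissionContext ctx, AclManager mgr) {
        // Throws NullPointerException if the spec has already been released.
        mgr.aclsByApp.put(appId, new HashMap<>(ctx.amContainerSpec.acls));
    }

    /** Models the memory saving: drop the spec once the ACLs live elsewhere. */
    static void releaseSpec(SubmissionContext ctx) {
        ctx.amContainerSpec = null;
    }

    /** Models recovery: a restarted RM re-creates every stored app. */
    static void recover(InMemoryStore store, AclManager mgr) {
        for (Map.Entry<String, SubmissionContext> e : store.apps.entrySet()) {
            createApp(e.getKey(), e.getValue(), mgr);
        }
    }

    /** Returns true if recovery succeeds, false if it fails with an NPE. */
    static boolean runScenario(boolean restoreBeforeRestart) {
        InMemoryStore store = new InMemoryStore();
        Map<String, String> acls = new HashMap<>();
        acls.put("VIEW_APP", "alice");
        SubmissionContext ctx = new SubmissionContext(new ContainerSpec(acls));

        // Submission: the store keeps the same object the running RM uses.
        store.apps.put("app_1", ctx);
        createApp("app_1", ctx, new AclManager());

        // Test side: save the spec from the store after submission.
        ContainerSpec saved = store.apps.get("app_1").amContainerSpec;

        // Running RM releases the spec; the store sees the null as well.
        releaseSpec(ctx);

        if (restoreBeforeRestart) {
            // Test side: put the spec back before the RM is started again.
            store.apps.get("app_1").amContainerSpec = saved;
        }
        try {
            recover(store, new AclManager()); // restarted RM rebuilds ACL state
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("without restore: " + runScenario(false));
        System.out.println("with restore: " + runScenario(true));
    }
}
```

In this model, recovery fails only when the spec is not restored into the
store before the restart, which mirrors the reasoning behind the test changes.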
> Reduce RM app memory footprint once app has completed
> -----------------------------------------------------
>
> Key: YARN-65
> URL: https://issues.apache.org/jira/browse/YARN-65
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 0.23.3
> Reporter: Jason Lowe
> Assignee: Manikandan R
> Attachments: YARN-65.001.patch, YARN-65.002.patch, YARN-65.003.patch,
> YARN-65.004.patch, YARN-65.005.patch, YARN-65.006.patch, YARN-65.007.patch,
> YARN-65.008.patch
>
>
> The ResourceManager holds onto a configurable number of completed
> applications (yarn.resourcemanager.max-completed-applications, defaults to 10000),
> and the memory footprint of these completed applications can be significant.
> For example, the {{submissionContext}} in RMAppImpl contains references to
> protocol buffer objects and other items that probably aren't necessary to keep
> around once the application has completed. We could significantly reduce the
> memory footprint of the RM by releasing objects that are no longer necessary
> once an application completes.