[
https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695037#comment-16695037
]
Manikandan R commented on YARN-6523:
------------------------------------
[~jlowe] Thanks for your detailed explanation.
I've changed the patch so that SystemCredentialsForAppsProto is computed only
once, in NodeHeartbeatResponsePBImpl.setSystemCredentialsForApps itself,
whenever the credentials change. This avoids recomputing
SystemCredentialsForAppsProto on every NodeHeartbeatResponsePBImpl.getProto()
call. Thoughts?
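To illustrate the idea, here is a minimal, self-contained sketch of doing the conversion in the setter and reusing the result in getProto(). It is not the patch itself: CachedCredentialsSketch, HeartbeatResponse, AppCredentialsEntry and the conversions counter are hypothetical stand-ins, and the Map<String, byte[]> input is an assumption used in place of the real credential types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CachedCredentialsSketch {

    /** Hypothetical stand-in for one SystemCredentialsForAppsProto entry. */
    static final class AppCredentialsEntry {
        final String appId;
        final byte[] credentials;

        AppCredentialsEntry(String appId, byte[] credentials) {
            this.appId = appId;
            this.credentials = credentials;
        }
    }

    /** Hypothetical stand-in for the heartbeat response implementation. */
    static final class HeartbeatResponse {
        // Converted entries, built once per credentials update.
        private List<AppCredentialsEntry> cachedEntries = Collections.emptyList();
        // Counts per-app conversions so the effect of caching is observable.
        private int conversions = 0;

        /** Converts the credentials once, at the time they are set. */
        void setSystemCredentialsForApps(Map<String, byte[]> credentials) {
            List<AppCredentialsEntry> entries = new ArrayList<>(credentials.size());
            for (Map.Entry<String, byte[]> e : credentials.entrySet()) {
                entries.add(new AppCredentialsEntry(e.getKey(), e.getValue().clone()));
                conversions++;
            }
            cachedEntries = Collections.unmodifiableList(entries);
        }

        /** Returns the cached entries; no conversion happens here. */
        List<AppCredentialsEntry> getProto() {
            return cachedEntries;
        }

        int conversions() {
            return conversions;
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> creds = new TreeMap<>();
        creds.put("application_1", new byte[] {1, 2, 3});
        creds.put("application_2", new byte[] {4, 5});

        HeartbeatResponse response = new HeartbeatResponse();
        response.setSystemCredentialsForApps(creds);

        // Simulate many heartbeats reading the same response.
        for (int i = 0; i < 1000; i++) {
            response.getProto();
        }
        System.out.println("per-app conversions: " + response.conversions());
    }
}
```

In this sketch the per-app conversion count depends only on how often the credentials are set, not on how often getProto() is called.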
{quote}Checking for hasTokenSequenceNo and returning zero if not present is
redundant.{quote}
Taken care of.
{quote}My concerns about the unit test duration have not been addressed. This
single unit test takes almost two minutes to execute{quote}
Sorry, I had intended to configure an appropriate token expiry time but missed
it in the earlier patch. It is taken care of now, and it has reduced the
overall unit test duration significantly.
> Newly retrieved security Tokens are sent as part of each heartbeat to each
> node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: RM
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Naganarasimha G R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-6523.001.patch, YARN-6523.002.patch,
> YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch,
> YARN-6523.006.patch
>
>
> Currently, as part of the heartbeat response, the RM sets the tokens of all
> applications, even though not all applications may be active on the node. On
> top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app
> into SystemCredentialsForAppsProto. Hence, for each node and each heartbeat,
> too many SystemCredentialsForAppsProto objects were being created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with
> 8GB RAM configured for the RM.