[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725927#comment-16725927 ]
Jason Lowe commented on YARN-6523:
----------------------------------
Thanks for updating the patch! I think it is really close now.
NodeHeartbeatResponsePBImpl can handle the system credentials for apps
collection more efficiently. Rather than making a copy of it and delaying
until mergeLocalToBuilder is called to set it on the builder (which will also
make a copy), it can be handled like the token sequence number: the getter
reads it straight from the proto/builder, and the setter writes it straight
to the builder.
For example:
{code}
@Override
public void setSystemCredentialsForApps(
    Collection<SystemCredentialsForAppsProto> systemCredentialsForAppsProto) {
  maybeInitBuilder();
  builder.clearSystemCredentialsForApps();
  if (systemCredentialsForAppsProto != null) {
    builder.addAllSystemCredentialsForApps(systemCredentialsForAppsProto);
  }
}

@Override
public Collection<SystemCredentialsForAppsProto>
    getSystemCredentialsForApps() {
  NodeHeartbeatResponseProtoOrBuilder p = viaProto ? proto : builder;
  return p.getSystemCredentialsForAppsList();
}
{code}
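To make the copy count concrete, here is a minimal, self-contained sketch that contrasts the two approaches. It does not use Hadoop or protobuf: FakeBuilder, BufferedResponse, DirectResponse and CopyCountDemo are hypothetical stand-ins invented for illustration, and plain strings stand in for SystemCredentialsForAppsProto.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical demo driver; not part of the patch.
class CopyCountDemo {
    public static void main(String[] args) {
        List<String> creds = Arrays.asList("app1-creds", "app2-creds");

        BufferedResponse buffered = new BufferedResponse();
        buffered.setCreds(creds);
        buffered.mergeLocalToBuilder();

        DirectResponse direct = new DirectResponse();
        direct.setCreds(creds);

        System.out.println("buffered copies: "
            + (buffered.localCopies + buffered.builder.copies));
        System.out.println("direct copies: " + direct.builder.copies);
    }
}

// Hypothetical stand-in for the generated protobuf builder.
final class FakeBuilder {
    private final List<String> creds = new ArrayList<>();
    int copies = 0; // how many times a collection was copied into the builder

    void clearCreds() { creds.clear(); }

    void addAllCreds(Collection<String> c) {
        creds.addAll(c);
        copies++;
    }

    List<String> getCredsList() { return Collections.unmodifiableList(creds); }
}

// Approach 1: copy into a local field, then copy again at merge time.
final class BufferedResponse {
    final FakeBuilder builder = new FakeBuilder();
    private List<String> local;
    int localCopies = 0; // copies made into the local field

    void setCreds(Collection<String> c) {
        local = (c == null) ? null : new ArrayList<>(c);
        if (c != null) {
            localCopies++;
        }
    }

    void mergeLocalToBuilder() {
        builder.clearCreds();
        if (local != null) {
            builder.addAllCreds(local);
        }
    }
}

// Approach 2: write through to the builder on set, read from it on get.
final class DirectResponse {
    final FakeBuilder builder = new FakeBuilder();

    void setCreds(Collection<String> c) {
        builder.clearCreds();
        if (c != null) {
            builder.addAllCreds(c);
        }
    }

    Collection<String> getCreds() { return builder.getCredsList(); }
}
```

In this sketch the buffered variant copies the collection twice (once into the local field, once into the builder), while the write-through variant copies it once.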
Other than that the patch looks good to me.
> Newly retrieved security Tokens are sent as part of each heartbeat to each
> node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: RM
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Naganarasimha G R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-6523.001.patch, YARN-6523.002.patch,
> YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch,
> YARN-6523.006.patch, YARN-6523.007.patch, YARN-6523.008.patch,
> YARN-6523.009.patch, YARN-6523.010.patch, YARN-6523.011.patch
>
>
> Currently, as part of the heartbeat response, the RM sets the tokens of all
> applications, even though not all applications may be active on the node. On
> top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into
> SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too
> many SystemCredentialsForAppsProto objects were being created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with
> 8GB RAM configured for the RM.