[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725927#comment-16725927 ]
Jason Lowe commented on YARN-6523:
----------------------------------

Thanks for updating the patch!  I think it is really close now.

NodeHeartbeatResponsePBImpl can handle the system credentials for apps collection more efficiently.  Rather than making a copy of it and delaying until mergeLocalToBuilder is called to set it on the builder (which will also make a copy), it can be handled more like the token sequence number, where we just read it from or set it on the proto/builder whenever it is accessed on the PBImpl.  For example:
{code}
  @Override
  public void setSystemCredentialsForApps(
      Collection<SystemCredentialsForAppsProto> systemCredentialsForAppsProto) {
    // Write straight through to the builder; no local copy is kept.
    maybeInitBuilder();
    builder.clearSystemCredentialsForApps();
    if (systemCredentialsForAppsProto != null) {
      builder.addAllSystemCredentialsForApps(systemCredentialsForAppsProto);
    }
  }

  @Override
  public Collection<SystemCredentialsForAppsProto> getSystemCredentialsForApps() {
    // Read from whichever of proto/builder currently holds the data.
    NodeHeartbeatResponseProtoOrBuilder p = viaProto ? proto : builder;
    return p.getSystemCredentialsForAppsList();
  }
{code}

Other than that the patch looks good to me.
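
To make the copy counts concrete, here is a minimal standalone sketch that contrasts the two styles. It does not use the real protobuf classes: FakeBuilder, BufferedImpl, WriteThroughImpl and the String elements are all hypothetical stand-ins, and the only thing it models is how many element copies each style performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class WriteThroughSketch {

  /** Hypothetical stand-in for a generated protobuf builder; counts copied elements. */
  static class FakeBuilder {
    private final List<String> creds = new ArrayList<>();
    int elementsCopied = 0;

    void clearCreds() {
      creds.clear();
    }

    void addAllCreds(Collection<String> c) {
      creds.addAll(c);
      elementsCopied += c.size();
    }

    List<String> getCredsList() {
      return Collections.unmodifiableList(creds);
    }
  }

  /** Buffered style: keep a local copy, then copy again into the builder at merge time. */
  static class BufferedImpl {
    final FakeBuilder builder = new FakeBuilder();
    private List<String> local;
    private int localCopied = 0;

    void setCreds(Collection<String> c) {
      local = (c == null) ? null : new ArrayList<>(c); // first copy
      localCopied += (c == null) ? 0 : c.size();
    }

    void mergeLocalToBuilder() {
      builder.clearCreds();
      if (local != null) {
        builder.addAllCreds(local); // second copy
      }
    }

    int totalCopied() {
      return localCopied + builder.elementsCopied;
    }
  }

  /** Write-through style: the builder is the only place the collection lives. */
  static class WriteThroughImpl {
    final FakeBuilder builder = new FakeBuilder();

    void setCreds(Collection<String> c) {
      builder.clearCreds();
      if (c != null) {
        builder.addAllCreds(c); // the only copy
      }
    }

    List<String> getCreds() {
      return builder.getCredsList();
    }

    int totalCopied() {
      return builder.elementsCopied;
    }
  }

  public static void main(String[] args) {
    List<String> creds = Arrays.asList("app1", "app2", "app3");

    BufferedImpl buffered = new BufferedImpl();
    buffered.setCreds(creds);
    buffered.mergeLocalToBuilder();

    WriteThroughImpl direct = new WriteThroughImpl();
    direct.setCreds(creds);

    System.out.println("buffered copies:      " + buffered.totalCopied());
    System.out.println("write-through copies: " + direct.totalCopied());
  }
}
```

In this model the buffered path copies every element twice (once on set, once on merge), while the write-through path copies it once.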

> Newly retrieved security Tokens are sent as part of each heartbeat to each
> node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6523
>                 URL: https://issues.apache.org/jira/browse/YARN-6523
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: RM
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Naganarasimha G R
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-6523.001.patch, YARN-6523.002.patch,
> YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch,
> YARN-6523.006.patch, YARN-6523.007.patch, YARN-6523.008.patch,
> YARN-6523.009.patch, YARN-6523.010.patch, YARN-6523.011.patch
>
>
> Currently, as part of the heartbeat response, RM sets all applications' tokens
> even though not all applications may be active on the node. On top of that,
> NodeHeartbeatResponsePBImpl converts the tokens for each app into
> SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too
> many SystemCredentialsForAppsProto objects were getting created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with
> 8GB RAM configured for RM

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org