[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001009#comment-16001009 ]
Naganarasimha G R commented on YARN-6523:
-----------------------------------------
[~jlowe], in our offline discussion you mentioned:
bq. believe there's still some optimization that can be done given that once a
token is retrieved by the RM on behalf of an application that token is sent for
every heartbeat to every node in the cluster until that application completes.
That's very wasteful. Doing a sequence number version thing as I suggested
earlier with a precomputed system credentials would drastically cut down on the
traffic and garbage created for every heartbeat. However I agree in light of
the custom release findings that the priority of fixing this is far lower than
before.
I agree that for long-running apps, tokens will keep being exchanged
unnecessarily after 7 days, which causes needless traffic and memory
reclamation. The {{sequence number version thing}} seems to be a good fit; I
will try to work on it further.
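To make the idea concrete, here is a minimal sketch of how a sequence-numbered,
precomputed credentials snapshot might look. It is only an illustration under
my own assumptions, not the existing YARN code: the class name
SystemCredentialsCache, its methods, and the use of a plain byte[] for the
serialized tokens are all hypothetical. The RM rebuilds the snapshot only when
an application's tokens change, and a heartbeat response carries it only when
the node reports a different sequence number.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical sketch, not the YARN implementation: one precomputed snapshot
 * of per-application credentials, versioned by a sequence number that is
 * bumped whenever the snapshot changes.
 */
final class SystemCredentialsCache {

  /** Immutable snapshot shared by all heartbeat responses. */
  static final class Snapshot {
    final long sequenceNo;
    final Map<String, byte[]> credentialsByApp;

    Snapshot(long sequenceNo, Map<String, byte[]> credentialsByApp) {
      this.sequenceNo = sequenceNo;
      this.credentialsByApp = credentialsByApp;
    }
  }

  private final Map<String, byte[]> current = new HashMap<>();
  private Snapshot snapshot = new Snapshot(0L, Collections.emptyMap());

  /** Called when the RM obtains or renews tokens for an application. */
  synchronized void putCredentials(String appId, byte[] serializedTokens) {
    current.put(appId, serializedTokens.clone());
    rebuild();
  }

  /** Called when an application completes. */
  synchronized void removeApplication(String appId) {
    if (current.remove(appId) != null) {
      rebuild();
    }
  }

  /** Rebuilds the shared snapshot once per change, not once per heartbeat. */
  private void rebuild() {
    snapshot = new Snapshot(snapshot.sequenceNo + 1,
        Collections.unmodifiableMap(new HashMap<>(current)));
  }

  /**
   * Returns the snapshot only if the node's last seen sequence number differs
   * from the current one; otherwise the heartbeat carries no credentials.
   */
  synchronized Optional<Snapshot> snapshotIfChanged(long nodeSequenceNo) {
    return nodeSequenceNo == snapshot.sequenceNo
        ? Optional.empty()
        : Optional.of(snapshot);
  }
}
```

With something like this, a node that already holds the latest sequence number
would get an empty credentials section, and all nodes that are behind would
share the same precomputed object instead of a fresh copy per heartbeat.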
> RM requires large memory in sending out security tokens as part of Node
> Heartbeat in large cluster
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
> Issue Type: Bug
> Components: RM
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Naganarasimha G R
> Assignee: Naganarasimha G R
> Priority: Critical
>
> Currently, as part of the heartbeat response, the RM sets the tokens of all
> applications, even though not all applications may be active on the node. On
> top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into
> SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too
> many SystemCredentialsForAppsProto objects were being created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with
> 8GB RAM configured for the RM.