[
https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16699188#comment-16699188
]
Jason Lowe commented on YARN-6523:
----------------------------------
Thanks for updating the patch! The whitespace and ASF warnings are related to
the patch. There are a lot of checkstyle warnings that should also be
addressed like unused imports, lacking braces on {{if}} statements, etc. The
unit test failure appears to be unrelated and tracked by YARN-8937.
I'm not sure all this PBImpl stuff is worth the boilerplate, especially in this
instance, and especially because it is not exposed to user code -- this is an
internal API between the RM and the NM. All we need to do here is cache the
list of SystemCredentialsForAppsProto values and have NodeHearbeatResponse take
that list of protos rather than a Map<String,ByteBuffer> for the system
credentials. NodeHeartbeatResponsePBImpl can then call
addAllSystemCredentialsForApps on the builder when it builds its protocol
buffer. IMHO adding more abstract classes and PBImpls here just adds
boilerplate for no real benefit and just adds room for errors. It's not like
we support anything other than protocol buffers -- there's blind downcasting to
PBImpl all over the place, including in this patch. I say ditch all that and
just cache the list of SystemCredentialsForAppsProto directly. No new
interfaces or classes needed, especially because we're not sending anything we
weren't already sending before.
TestYarnServerApiClasses has a commented import.
TestYarnServerApiClasses#testNodeHeartbeatResponsePBImpl has two "// create
token2" comments, and I'm assuming only one of them is accurate.
ResourceTrackerService#populateTokenSequenceNo has a few debug statement that
build log strings before calling the logger. These either need to be protected
by log debug enabled checks or use the SLF4J positional argument brace syntax
to avoid constructing the log string for no benefit when debug logging is not
enabled (i.e.: the common case).
The very long unit test was removed but equivalent tests were not added. The
only tests that now exist are simple ones that verify
NodeHeartbeatReponsePBImpl properly preserves the system credentials for apps
payload. There should be two additional tests as I described earlier:
bq. One approach is to split the testing into two, separate unit tests for the
two prime units requiring testing as part of this feature. One test can verify
the DelegationTokenRenewer updates the sequence number when new delegation
tokens are requested, and another test can verify that ResourceTrackerService
sends system credentials in the heartbeat response only when the sequence
numbers mismatch.
> Newly retrieved security Tokens are sent as part of each heartbeat to each
> node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: RM
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Naganarasimha G R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-6523.001.patch, YARN-6523.002.patch,
> YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch,
> YARN-6523.006.patch, YARN-6523.007.patch
>
>
> Currently as part of heartbeat response RM sets all application's tokens
> though all applications might not be active on the node. On top of it
> NodeHeartbeatResponsePBImpl converts tokens for each app into
> SystemCredentialsForAppsProto. Hence for each node and each heartbeat too
> many SystemCredentialsForAppsProto objects were getting created.
> We hit a OOM while testing for 2000 concurrent apps on 500 nodes cluster with
> 8GB RAM configured for RM
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]