[
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763470#comment-13763470
]
Karthik Kambatla commented on YARN-1027:
----------------------------------------
I did some testing, transitioning between Standby and Active back and forth
several times, and ran MR jobs while in Active mode.
# The Standby mode (389719 objects worth 46661952 bytes) indeed has fewer
objects and uses less memory compared to the Active mode (399819 objects worth
50104584 bytes).
# The applicationId keeps the timestamp from when the RM started, but the RM
starts issuing ids from 1 again. This leads to issues ranging from client-side
failures due to leftover entries in .staging/ to jobs hanging. Once enough jobs
are killed, subsequent jobs can be run as usual. To address this, I think it is
safe to reset the timestamp to the time at which the RM becomes Active.
# The WebUI behaves as expected.
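To illustrate item 2, here is a minimal, self-contained sketch of the collision.
It is not the RM's actual code: the class AppIdSketch, its methods, and the
timestamps 1000 and 2000 are hypothetical. It only assumes that an id is built
from the cluster timestamp plus a counter that restarts at 1 on becoming Active.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of id issuing; not the ResourceManager implementation.
final class AppIdSketch {
    private long clusterTimestamp;
    private int nextId = 1;

    AppIdSketch(long startTime) {
        this.clusterTimestamp = startTime;
    }

    // Models a transition to Active: the counter always restarts at 1,
    // and the timestamp is refreshed only if resetTimestamp is true.
    void becomeActive(long now, boolean resetTimestamp) {
        nextId = 1;
        if (resetTimestamp) {
            clusterTimestamp = now;
        }
    }

    // Same shape as YARN ids: application_<clusterTimestamp>_<4-digit id>.
    String newApplicationId() {
        return String.format("application_%d_%04d", clusterTimestamp, nextId++);
    }

    // Issues 3 ids, goes through one transition, issues 3 more,
    // and counts how many of the later ids repeat an earlier one.
    static int countCollisions(boolean resetTimestamp) {
        AppIdSketch rm = new AppIdSketch(1000L);
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < 3; i++) {
            seen.add(rm.newApplicationId());
        }
        rm.becomeActive(2000L, resetTimestamp);
        int collisions = 0;
        for (int i = 0; i < 3; i++) {
            if (!seen.add(rm.newApplicationId())) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        System.out.println(countCollisions(false)); // prints 3
        System.out.println(countCollisions(true));  // prints 0
    }
}
```

With the old timestamp kept, every id issued after the transition repeats an
earlier one; with the timestamp reset, none do.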
Regarding more involved tests, I was thinking of writing a
MiniYARNCluster-based one that checks whether the RPC servers are shut down in
Standby mode. We can check whether a client can request an applicationId etc.
Is it okay for these tests to live in hadoop-yarn-client? Or would it make
sense to create a separate module for such end-to-end tests, including future
HA tests, stress tests etc.?
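As a rough sketch of the kind of check I mean, the snippet below probes whether
anything is accepting connections on an address, before and after a listener is
closed. It uses plain java.net sockets as a stand-in; the class PortCheckSketch
and its methods are hypothetical, and a real test would instead start a
MiniYARNCluster and probe the RM's configured RPC addresses.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper; a plain ServerSocket stands in for an RM RPC server.
final class PortCheckSketch {

    // Returns true if a TCP connection to addr succeeds within timeoutMs.
    static boolean isAccepting(InetSocketAddress addr, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(addr, timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Opens a listener on an ephemeral loopback port, probes it, closes it
    // (standing in for "transition to Standby"), and probes again.
    // Returns { acceptingWhileOpen, acceptingAfterClose }.
    static boolean[] probeBeforeAndAfterClose() {
        try {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            ServerSocket server = new ServerSocket(0, 50, loopback);
            InetSocketAddress addr =
                new InetSocketAddress(loopback, server.getLocalPort());
            boolean whileOpen = isAccepting(addr, 1000);
            server.close();
            boolean afterClose = isAccepting(addr, 1000);
            return new boolean[] { whileOpen, afterClose };
        } catch (IOException e) {
            throw new IllegalStateException("could not set up listener", e);
        }
    }
}
```

In an actual test the second probe would run after transitioning the RM to
Standby, and a client request for a new applicationId would be expected to fail.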
> Implement RMHAServiceProtocol
> -----------------------------
>
> Key: YARN-1027
> URL: https://issues.apache.org/jira/browse/YARN-1027
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Karthik Kambatla
> Attachments: test-yarn-1027.patch, yarn-1027-1.patch,
> yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch,
> yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch
>
>
> Implement existing HAServiceProtocol from Hadoop common. This protocol is the
> single point of interaction between the RM and HA clients/services.