[ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768492#comment-13768492
 ] 

Karthik Kambatla commented on YARN-1027:
----------------------------------------

{quote}
// TODO (YARN-1192): Update this post addition of STOPPING state to
Is this still needed since I see that HADOOP-9945 is already committed?
{quote}
Included in the latest patch.

bq. Looks like not all the memory was reclaimed upon Active->Standby. One thing 
to check would be if the memory keeps increasing after every transition for 
multiple transitions. 

The previous set of numbers was taken immediately after the transitions. I 
suspect it takes a little while for GC to reclaim the memory. I have run a new 
set of experiments:
# Started the RM, transitioned to Active (active-1), and ran 10 pi jobs.
# Transitioned to Standby (standby-1) -> Active (active-2) -> Standby 
(standby-2) -> Active (active-3) -> Standby (standby-3), waiting for some time 
after transitioning to standby to let the heap reach steady state.

The memory consumption is shown below. The first number is the number of 
objects, and the second number is the size in bytes.
{noformat}
active-1  417253 50697304
standby-1  387763 46063712
active-2  404628 48989688
standby-2  391583 46265168
active-3  381511 47552168
standby-3  379597 45919224
{noformat}
Interestingly, standby-3 has the smallest byte count. I think the objects are 
being reclaimed; however, the actual heap size depends on whether, when, and 
which GC kicks in. A more thorough evaluation can be done as part of YARN-1125 
and YARN-1139.
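
As a rough cross-check of the table above, the change between successive 
standby snapshots can be computed with a few lines of Python. This is only a 
sketch over the six totals reported here; the names {{samples}} and 
{{deltas}} are hypothetical and not part of the patch or of any Hadoop tool.

```python
# Totals copied from the table above: (label, object count, bytes).
samples = [
    ("active-1", 417253, 50697304),
    ("standby-1", 387763, 46063712),
    ("active-2", 404628, 48989688),
    ("standby-2", 391583, 46265168),
    ("active-3", 381511, 47552168),
    ("standby-3", 379597, 45919224),
]


def deltas(rows):
    """Return (from_label, to_label, object_delta, byte_delta) for each
    pair of consecutive rows."""
    return [
        (a[0], b[0], b[1] - a[1], b[2] - a[2])
        for a, b in zip(rows, rows[1:])
    ]


# Compare only the standby snapshots, which were taken after the heap
# had time to settle.
standby = [row for row in samples if row[0].startswith("standby")]
standby_deltas = deltas(standby)

for src, dst, d_objects, d_bytes in standby_deltas:
    print(f"{src} -> {dst}: {d_objects:+d} objects, {d_bytes:+d} bytes")
```

A leak would show up as consistently positive deltas between standby 
snapshots. Here the first delta is slightly positive and the second is 
negative, which fits the reading that objects are reclaimed and the 
remaining differences are GC timing.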

bq. Do you plan to test this patch under secure cluster?
Filed YARN-1202 to verify RM HA in secure clusters. Given we haven't made any 
security-related changes, I don't think we are causing any regressions.
                
> Implement RMHAProtocolService
> -----------------------------
>
>                 Key: YARN-1027
>                 URL: https://issues.apache.org/jira/browse/YARN-1027
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: test-yarn-1027.patch, yarn-1027-1.patch, 
> yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, 
> yarn-1027-6.patch, yarn-1027-7.patch, yarn-1027-7.patch, yarn-1027-8.patch, 
> yarn-1027-9.patch, yarn-1027-including-yarn-1098-3.patch, 
> yarn-1027-in-rm-poc.patch
>
>
> Implement existing HAServiceProtocol from Hadoop common. This protocol is the 
> single point of interaction between the RM and HA clients/services.

