[ 
https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320001#comment-15320001
 ] 

Naganarasimha G R commented on YARN-4464:
-----------------------------------------

Thanks for the comments [~vinodkv],
bq.  I remember that Jian He did some benchmarking to demonstrate that recovery 
of 10K apps takes only 10 seconds. We need to understand the root-cause here.
You are right. Although there was some initial discussion of the probable 
cause of the delay, the conversation later shifted to just changing the default 
value. Initially I thought it might be because of YARN-3104 (as mentioned by 
[~kasha]) or YARN-4041, but I am not sure about it.

Having said that, I was thinking more along the lines of whether it is 
necessary to store so many finished apps when we already support ATS. Apart 
from adding to the startup time (nominal, but unnecessary when we have many 
running apps in a large cluster), it was also adding a lot of unnecessary logs 
and publishing of ATS events. Hence I was more inclined to reduce the default 
value.
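
As an illustration only (the value 1000 below is an assumed example, not a 
default proposed in this thread), lowering the limit on a given cluster would 
look roughly like this, in the same property=value shorthand used in the issue 
description; in practice the property is set in yarn-site.xml:
{code}
# illustrative value only; pick one that suits the cluster
yarn.resourcemanager.state-store.max-completed-applications=1000
{code}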



> default value of yarn.resourcemanager.state-store.max-completed-applications 
> should lower.
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-4464
>                 URL: https://issues.apache.org/jira/browse/YARN-4464
>             Project: Hadoop YARN
>          Issue Type: Wish
>          Components: resourcemanager
>            Reporter: KWON BYUNGCHANG
>            Assignee: Daniel Templeton
>            Priority: Blocker
>         Attachments: YARN-4464.001.patch, YARN-4464.002.patch, 
> YARN-4464.003.patch, YARN-4464.004.patch
>
>
> My cluster has 120 nodes.
> I configured the RM Restart feature.
> {code}
> yarn.resourcemanager.recovery.enabled=true
> yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
> yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore
> {code}
> Unfortunately, I did not configure 
> {{yarn.resourcemanager.state-store.max-completed-applications}}, 
> so the property took its default value of 10,000.
> I restarted the RM after changing another configuration.
> I expected the RM to restart immediately, 
> but the recovery process was very slow; I waited about 20 minutes.
> Then I realized I had not set 
> {{yarn.resourcemanager.state-store.max-completed-applications}}.
> Its default value is very large.
> The default should be lowered, or a notice should be added to the [RM Restart 
> page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
