Hi YARN devs,

I am working on the ZKRMStateStore, and had a very basic question - on RM
failure, how long does the AM fail before crashing, or more importantly
what controls it.

Looking into the code, I see the following two parameters:

   1. yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms - set to
   1 min
   2. Fix configs
   
yarn.resourcemanager.resourcemanager.connect.{max.wait.secs|retry_interval.secs}
   - set by default to 15 mins and 30 seconds respectively

The AM crashes only after 20 minutes.

Are there any other configs that influence this?

Thanks
Karthik

Reply via email to