[ https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822026#comment-13822026 ]
Omkar Vinit Joshi commented on YARN-674:
----------------------------------------

Thanks [~vinodkv]

bq. RMAppManager.submitApplication: Put a comment where you move apps to finish state saying we are doing this before token-renewal so that we don't renew tokens for finished apps.
Added a comment.

bq. isServiceStarted needs to be volatile?
No.. it is updated only once just when service starts.

bq. handleDTRenewerEvent -> handleDTRenewerAppSubmitEvent
done..

bq. Add a comment in handleDTRenewerEvent to indicate why DTRenewer is starting the app as opposed to RMAppManager.
added one..

bq. Instead of putting renewerCount in the main code path, you can access the thread count from ThreadPoolExecutor.getPoolSize() in the tests directly ?
moved this to test code.

bq. DelegationTokenRenewerAppSubmitEvent can be nested class inside DelegationTokenRenewer? This is not an event from outside the renewer. Similarly DelegationTokenRenewerEventType. Either nest them in, or create a separate package.
moved the events and eventType inside DTTokenRenewer.

bq. testInvalidDelegationTokenApplicationSubmit, testInvalidDTWithAddApplication: Seem similar but test different things. May be rename one or both?
renamed both..

bq. The other point is the default number of threads in the renewer. 5 is too small, may be bump it up to existing number of RPC threads - 50 or something in that range?
using thread pool with core pool size = 5 and max pool size = 50 (configurable).
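For reference, a minimal sketch of how a test could read the thread count from ThreadPoolExecutor.getPoolSize() on a pool with core pool size 5 and max pool size 50. This is not the code from the patch: the class name RenewerPoolSketch, the helper names, the 3 second keep-alive, and the choice of SynchronousQueue are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the YARN-674 patch.
public class RenewerPoolSketch {
  // Sizes taken from the comment above; in the patch they are configurable.
  static final int CORE_POOL_SIZE = 5;
  static final int MAX_POOL_SIZE = 50;

  static ThreadPoolExecutor newRenewerPool() {
    // Assumption: a SynchronousQueue (direct hand-off), so the pool starts a
    // new thread whenever all existing threads are busy, up to MAX_POOL_SIZE.
    return new ThreadPoolExecutor(CORE_POOL_SIZE, MAX_POOL_SIZE,
        3L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
  }

  // Submits `tasks` blocking tasks (tasks must not exceed MAX_POOL_SIZE, or
  // execute() throws RejectedExecutionException) and returns the pool size
  // observed while all of them are still running.
  static int poolSizeAfterSubmitting(int tasks) throws InterruptedException {
    ThreadPoolExecutor pool = newRenewerPool();
    CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(() -> {
          try {
            release.await(); // stands in for a slow token renewal
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      // The test reads the count from the executor; no counter is needed
      // in the main code path.
      return pool.getPoolSize();
    } finally {
      release.countDown();
      pool.shutdown();
      pool.awaitTermination(5, TimeUnit.SECONDS);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(poolSizeAfterSubmitting(8));
  }
}
```

Note that the queue type matters for these sizes: with an unbounded LinkedBlockingQueue, a ThreadPoolExecutor never grows beyond its core pool size, so the max pool size of 50 would have no effect. Which queue the patch uses is not stated in this comment.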
> Slow or failing DelegationToken renewals on submission itself make RM unavailable
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-674
>                 URL: https://issues.apache.org/jira/browse/YARN-674
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Omkar Vinit Joshi
>         Attachments: YARN-674.1.patch, YARN-674.2.patch, YARN-674.3.patch, YARN-674.4.patch, YARN-674.5.patch, YARN-674.5.patch, YARN-674.6.patch
>
>
> This was caused by YARN-280. A slow or down NameNode will make it look like the RM is unavailable, as it may run out of RPC handlers due to blocked client submissions.

--
This message was sent by Atlassian JIRA
(v6.1#6144)