[ https://issues.apache.org/jira/browse/YARN-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395402#comment-14395402 ]
Junping Du commented on YARN-402:
---------------------------------

Thanks [~lohit] for reporting this issue. I think warning when the queue is half full could be a little too sensitive. By default, the capacity of a LinkedBlockingQueue is Integer.MAX_VALUE, which is 2^31-1. Half full means there are still ~2^30 slots available, so it could be too early.

Do we want a configurable value here? I think that could be a little overkill. If it is, we may need to pick a more reasonable fixed value instead.

IMO, rmDispatcher could be the busiest AsyncDispatcher in YARN today: RMNodeEvent, SchedulerEvent, RMAppEvent, RMAppAttemptEvent, NodeListManagerEvent, AMLauncherEvent, etc. are all dispatched on this single dispatcher. Among these, SchedulerEvent seems to be the most active. Let's assume that in a large cluster thousands of node events and thousands of application attempt events are generated every second (1 second being the default interval for the NM-RM heartbeat and for the AMRMClientAsync heartbeat to the RM). We can then assume about 10*1000 scheduler events per second on rmDispatcher, and estimate up to 10*(10*1000) events per second in total (including events other than SchedulerEvent).

Based on this assumption, if we want to warn 10 seconds before the queue gets full (assuming peek operations get slow), then maybe 10 (seconds) * 10 (event types on rmDispatcher) * (10*1000) (scale of nodes and apps / interval) sounds like a reasonable value here?
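To make the estimate above concrete, here is a minimal sketch of the arithmetic and of a remaining-capacity check built on it. The class name QueueWarnSketch and the helpers warnThreshold and shouldWarn are hypothetical names used only for illustration; they are not part of AsyncDispatcher, and the input numbers are the rough assumptions from this comment, not measured values.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only; not the actual AsyncDispatcher code.
class QueueWarnSketch {

    // Hypothetical helper: number of events expected to arrive during the
    // desired lead time, i.e. seconds * event types * events per type per second.
    static long warnThreshold(int leadSeconds, int eventTypes, int eventsPerTypePerSecond) {
        return (long) leadSeconds * eventTypes * eventsPerTypePerSecond;
    }

    // Hypothetical helper: warn once the free space in the queue drops
    // below the threshold.
    static boolean shouldWarn(BlockingQueue<?> queue, long threshold) {
        return queue.remainingCapacity() < threshold;
    }

    public static void main(String[] args) {
        // Assumed inputs from the comment: 10 seconds, 10 event types,
        // 10*1000 events per type per second.
        long threshold = warnThreshold(10, 10, 10 * 1000);
        System.out.println("fixed threshold = " + threshold);

        // An unbounded LinkedBlockingQueue has capacity Integer.MAX_VALUE,
        // so a half-full warning would fire with about 2^30 slots still free.
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        System.out.println("default capacity = " + queue.remainingCapacity());
        System.out.println("half of capacity = " + (Integer.MAX_VALUE / 2));
        System.out.println("warn now? " + shouldWarn(queue, threshold));
    }
}
```

Under these assumptions the fixed threshold works out to 1,000,000 free slots, compared with the current 1000 and with roughly 2^30 for a half-full warning.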

In addition, I think we should fix a tiny issue in the code below: the check (qSize % 1000 == 0) doesn't make sense, as the queue capacity defaults to 2^31-1:

{code}
int qSize = eventQueue.size();
if (qSize != 0 && qSize % 1000 == 0) {
  LOG.info("Size of event-queue is " + qSize);
}
int remCapacity = eventQueue.remainingCapacity();
if (remCapacity < 1000) {
  LOG.warn("Very low remaining capacity in the event-queue: " + remCapacity);
}
{code}

> Dispatcher warn message is too late
> -----------------------------------
>
>                 Key: YARN-402
>                 URL: https://issues.apache.org/jira/browse/YARN-402
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Lohit Vijayarenu
>            Priority: Minor
>
> AsyncDispatcher throws out Warn when capacity remaining is less than 1000
> {noformat}
> if (remCapacity < 1000) {
>   LOG.warn("Very low remaining capacity in the event-queue: "
>       + remCapacity);
> }
> {noformat}
> What would be useful is to warn much before that, may be half full instead of
> when queue is completely full. I see that eventQueue capacity is int value.
> So, if one warn's queue has only 1000 capacity left, then service definitely
> has serious problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)