[ 
https://issues.apache.org/jira/browse/YARN-9979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973932#comment-16973932
 ] 

zhoukang commented on YARN-9979:
--------------------------------

I think we can add throttling logic to ContainerAllocationExpirer.
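
As a rough illustration of the idea, here is a minimal sketch of one way throttling could work: cap the number of entries expired in a single monitor pass, so that each pass hands the scheduler a bounded number of events. This is not the actual ContainerAllocationExpirer or AbstractLivelinessMonitor code. The class name ThrottledExpirySketch, the maxExpiredPerPass limit and the expireHandler callback are hypothetical names chosen for this example, and the clock is passed in explicitly to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch of a liveliness monitor that throttles expirations. */
class ThrottledExpirySketch<O> {
  // Registration time per tracked object, in insertion order.
  private final Map<O, Long> running = new LinkedHashMap<>();
  private final long expireIntervalMs;
  // Assumed throttle knob: upper bound on expirations per pass.
  private final int maxExpiredPerPass;
  // Stand-in for expire(key), which would enqueue a scheduler event.
  private final Consumer<O> expireHandler;

  ThrottledExpirySketch(long expireIntervalMs, int maxExpiredPerPass,
      Consumer<O> expireHandler) {
    this.expireIntervalMs = expireIntervalMs;
    this.maxExpiredPerPass = maxExpiredPerPass;
    this.expireHandler = expireHandler;
  }

  synchronized void register(O key, long nowMs) {
    running.put(key, nowMs);
  }

  /** Runs one monitor pass and returns how many entries were expired. */
  synchronized int checkOnce(long nowMs) {
    int expired = 0;
    Iterator<Map.Entry<O, Long>> iterator = running.entrySet().iterator();
    // Stop early once the per-pass budget is used up; the remaining
    // timed-out entries stay in the map and are handled by later passes.
    while (iterator.hasNext() && expired < maxExpiredPerPass) {
      Map.Entry<O, Long> entry = iterator.next();
      if (nowMs > entry.getValue() + expireIntervalMs) {
        O key = entry.getKey();
        iterator.remove();
        expireHandler.accept(key);
        expired++;
      }
    }
    return expired;
  }

  synchronized int pending() {
    return running.size();
  }

  public static void main(String[] args) {
    List<String> events = new ArrayList<>();
    ThrottledExpirySketch<String> monitor =
        new ThrottledExpirySketch<>(1000L, 100, events::add);
    for (int i = 0; i < 250; i++) {
      monitor.register("container_" + i, 0L);
    }
    // All 250 entries have timed out at t=2000, but each pass expires
    // at most 100 of them.
    System.out.println(monitor.checkOnce(2000L)); // prints 100
    System.out.println(monitor.checkOnce(2000L)); // prints 100
    System.out.println(monitor.checkOnce(2000L)); // prints 50
    System.out.println(monitor.pending());        // prints 0
  }
}
```

The trade-off in this sketch is that entries beyond the per-pass limit expire one or more monitor intervals late, so the limit would have to be tuned against how quickly the scheduler can drain its event queue.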

> When an app expired with many containers, scheduler event size will be huge
> ---------------------------------------------------------------------------
>
>                 Key: YARN-9979
>                 URL: https://issues.apache.org/jira/browse/YARN-9979
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>            Reporter: zhoukang
>            Assignee: zhoukang
>            Priority: Major
>
> When an app with many containers expires, the scheduler event queue grows
> very large, as the ResourceManager log shows:
> {code:java}
> 2019-11-11,21:39:49,690 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 9000
> 2019-11-11,21:39:49,695 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 10000
> 2019-11-11,21:39:49,700 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 11000
> 2019-11-11,21:39:49,705 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 12000
> 2019-11-11,21:39:49,710 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 13000
> 2019-11-11,21:39:49,715 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 14000
> 2019-11-11,21:39:49,720 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Discarded 1 
> messages due to full event buffer including: Size of scheduler event-queue is 
> 15000
> 2019-11-11,21:39:49,724 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 16000
> 2019-11-11,21:39:49,729 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 17000
> 2019-11-11,21:39:49,733 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 18000
> 2019-11-11,21:40:14,953 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 19000
> 2019-11-11,21:43:09,743 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 19000
> 2019-11-11,21:43:09,750 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 20000
> 2019-11-11,21:43:09,758 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 21000
> 2019-11-11,21:43:09,766 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 22000
> 2019-11-11,21:43:09,775 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 23000
> 2019-11-11,21:43:09,783 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 24000
> 2019-11-11,21:43:09,792 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 25000
> 2019-11-11,21:43:09,800 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 26000
> 2019-11-11,21:43:09,807 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 27000
> 2019-11-11,21:43:09,814 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 28000
> 2019-11-11,21:46:29,830 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 29000
> 2019-11-11,21:46:29,841 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 30000
> 2019-11-11,21:46:29,850 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 31000
> 2019-11-11,21:46:29,862 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 32000
> 2019-11-11,21:49:49,875 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 33000
> 2019-11-11,21:49:49,875 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 34000
> 2019-11-11,21:49:49,876 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 35000
> 2019-11-11,21:49:49,882 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 36000
> 2019-11-11,21:49:49,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 37000
> 2019-11-11,21:49:49,891 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 38000
> 2019-11-11,21:49:49,896 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 39000
> 2019-11-11,21:49:49,900 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 40000
> 2019-11-11,21:49:49,905 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 41000
> 2019-11-11,21:49:49,910 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 42000
> 2019-11-11,21:49:49,914 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 43000
> 2019-11-11,21:49:49,919 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 44000
> 2019-11-11,21:49:49,923 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 45000
> 2019-11-11,21:49:49,927 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 46000
> 2019-11-11,21:49:49,932 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 47000
> 2019-11-11,21:49:49,938 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 48000
> 2019-11-11,21:49:49,943 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 49000
> 2019-11-11,21:49:49,947 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 50000
> 2019-11-11,21:49:49,951 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 51000
> 2019-11-11,21:49:49,956 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 52000
> 2019-11-11,21:49:49,961 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 53000
> 2019-11-11,21:49:49,967 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 54000
> 2019-11-11,21:49:49,972 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 55000
> 2019-11-11,21:49:49,976 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 56000
> 2019-11-11,21:49:49,980 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 57000
> 2019-11-11,21:49:49,983 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 58000
> 2019-11-11,21:49:49,988 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 59000
> 2019-11-11,21:49:49,991 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 60000
> 2019-11-11,21:49:49,996 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 61000
> 2019-11-11,21:53:10,004 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 61000
> 2019-11-11,21:53:10,014 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 62000
> 2019-11-11,21:53:10,022 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 63000
> 2019-11-11,21:53:10,032 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 64000
> 2019-11-11,21:53:10,034 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 65000
> 2019-11-11,21:53:10,040 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 66000
> 2019-11-11,21:53:10,046 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 67000
> 2019-11-11,21:56:30,056 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 68000
> 2019-11-11,21:56:30,067 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 69000
> 2019-11-11,21:56:30,077 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 70000
> 2019-11-11,21:56:30,086 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 71000
> 2019-11-11,21:56:30,094 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 72000
> 2019-11-11,21:56:30,102 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 73000
> 2019-11-11,21:56:30,107 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 74000
> 2019-11-11,21:56:30,111 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 75000
> 2019-11-11,21:56:30,116 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 76000
> 2019-11-11,21:56:30,122 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 77000
> 2019-11-11,21:59:50,128 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 78000
> 2019-11-11,21:59:50,135 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 79000
> 2019-11-11,21:59:50,140 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 80000
> 2019-11-11,21:59:50,145 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 81000
> 2019-11-11,21:59:50,149 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 82000
> 2019-11-11,21:59:50,154 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 83000
> 2019-11-11,21:59:50,159 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 84000
> 2019-11-11,21:59:50,164 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 85000
> 2019-11-11,21:59:50,168 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 86000
> 2019-11-11,21:59:52,305 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 87000
> 2019-11-11,22:03:10,175 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 87000
> 2019-11-11,22:03:10,181 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 88000
> 2019-11-11,22:03:10,186 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 89000
> 2019-11-11,22:03:10,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 90000
> 2019-11-11,22:03:10,196 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 91000
> 2019-11-11,22:03:10,201 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 92000
> 2019-11-11,22:03:10,206 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Discarded 1 
> messages due to full event buffer including: Size of scheduler event-queue is 
> 93000
> 2019-11-11,22:03:10,211 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 94000
> 2019-11-11,22:03:10,215 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Discarded 1 
> messages due to full event buffer including: Size of scheduler event-queue is 
> 95000
> 2019-11-11,22:06:30,221 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 96000
> 2019-11-11,22:06:30,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 97000
> 2019-11-11,22:06:30,234 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 98000
> 2019-11-11,22:06:30,240 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 99000
> 2019-11-11,22:06:30,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 100000
> 2019-11-11,22:06:30,250 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 101000
> 2019-11-11,22:07:40,962 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 102000
> 2019-11-11,22:09:50,259 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 91000
> 2019-11-11,22:09:50,269 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 92000
> 2019-11-11,22:09:50,278 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 93000
> 2019-11-11,22:09:50,287 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 94000
> 2019-11-11,22:09:50,295 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 95000
> 2019-11-11,22:09:50,302 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 96000
> 2019-11-11,22:09:50,310 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 97000
> 2019-11-11,22:13:03,082 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 53000
> 2019-11-11,22:13:10,318 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 54000
> 2019-11-11,22:13:10,324 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 55000
> 2019-11-11,22:13:10,330 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 56000
> 2019-11-11,22:13:10,338 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 57000
> 2019-11-11,22:13:10,347 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 58000
> 2019-11-11,22:13:10,354 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of 
> scheduler event-queue is 59000
> {code}
> Number of containers that expired within a given minute:
> {code:java}
> [work@xxx zhoukang-yarn]$ grep "21:39:" expired.1 | wc -l
> 11377
> [work@xxx zhoukang-yarn]$ grep "21:43:" expired.1 | wc -l
> 10508
> [work@xxx zhoukang-yarn]$ grep "21:49:" expired.1 | wc -l
> 29269
> {code}
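> The expirations come from the PingChecker loop in AbstractLivelinessMonitor,
> which expires every timed-out entry in a single pass:
> {code:java}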
> private class PingChecker implements Runnable {
>     @Override
>     public void run() {
>       while (!stopped && !Thread.currentThread().isInterrupted()) {
>         synchronized (AbstractLivelinessMonitor.this) {
>           Iterator<Map.Entry<O, Long>> iterator =
>               running.entrySet().iterator();
>           // avoid calculating current time everytime in loop
>           long currentTime = clock.getTime();
>           while (iterator.hasNext()) {
>             Map.Entry<O, Long> entry = iterator.next();
>             O key = entry.getKey();
>             long interval = getExpireInterval(key);
>             if (currentTime > entry.getValue() + interval) {
>               iterator.remove();
>               expire(key);
>               LOG.info("Expired:" + entry.getKey().toString()
>                   + " Timed out after " + interval / 1000 + " secs");
>             }
>           }
>         }
>         try {
>           Thread.sleep(monitorInterval);
>         } catch (InterruptedException e) {
>           LOG.info(getName() + " thread interrupted");
>           break;
>         }
>       }
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
