[ https://issues.apache.org/jira/browse/YARN-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063229#comment-14063229 ]
Sunil G commented on YARN-2297:
-------------------------------

Hi [~wangda]

A couple of points:

1.
{code}
if (toBePreempt > 0) and (container.resource < toBePreempt * 2):
{code}
Here you are trying to identify a candidate container that is not too large relative to what you need to preempt. Say 1 GB is to be preempted and you have a 3 GB container: the 3 GB container should not be preempted if it is the only remaining container.

But assume a scenario in which you need to preempt 3 GB, and the candidate queue has 2 apps with the distribution below.
*app1*: 1 * 2GB, 1 * 3GB containers
*app2*: 1 * 3GB container.

Here, the 2 GB container will be preempted and your preemption demand will drop to 1 GB. With the above check, you will then not preempt any more containers, because each remaining 3 GB container fails the test 3 < 1 * 2. Yet a more suitable container, one that satisfies the whole demand, was available later in the list.
I feel we can try to tackle these corner cases as well.

2.
bq. We can change it to allocate container in queue from most lacking resource if there're multiple queues under a parent queue.

+1 for this idea. Currently *getUsedCapacity()* alone is used for *queueComparator*. I feel we should now use a percentage here, based on *used* vs *guaranteed_capacity*, to find which queue is more underutilized?

> Preemption can hang in corner case by not allowing any task container to proceed.
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-2297
>                 URL: https://issues.apache.org/jira/browse/YARN-2297
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>    Affects Versions: 2.5.0
>            Reporter: Tassapol Athiapinya
>            Assignee: Wangda Tan
>            Priority: Critical
>
> Preemption can cause a hang in a single-node cluster. Only AMs run. No task
> container can run.
> h3. queue configuration
> Queues A and B have 1% and 99% capacity, respectively.
> No max capacity.
> h3. scenario
> Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps. Use
> 1 user.
> Submit app 1 to queue A. AM needs 2 GB.
> There is 1 task that needs 2 GB.
> App 1 occupies the entire cluster.
> Submit app 2 to queue B. AM needs 2 GB. There are 3 tasks that need 2 GB each.
> Instead of the entire app 1 being preempted, the app 1 AM stays. The app 2 AM launches.
> No task of either app can proceed.
> h3. commands
> /usr/lib/hadoop/bin/hadoop jar
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter
> "-Dmapreduce.map.memory.mb=2000"
> "-Dyarn.app.mapreduce.am.command-opts=-Xmx1800M"
> "-Dmapreduce.randomtextwriter.bytespermap=2147483648"
> "-Dmapreduce.job.queuename=A" "-Dmapreduce.map.maxattempts=100"
> "-Dmapreduce.am.max-attempts=1" "-Dyarn.app.mapreduce.am.resource.mb=2000"
> "-Dmapreduce.map.java.opts=-Xmx1800M"
> "-Dmapreduce.randomtextwriter.mapsperhost=1"
> "-Dmapreduce.randomtextwriter.totalbytes=2147483648" dir1
>
> /usr/lib/hadoop/bin/hadoop jar
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep
> "-Dmapreduce.map.memory.mb=2000"
> "-Dyarn.app.mapreduce.am.command-opts=-Xmx1800M"
> "-Dmapreduce.job.queuename=B" "-Dmapreduce.map.maxattempts=100"
> "-Dmapreduce.am.max-attempts=1" "-Dyarn.app.mapreduce.am.resource.mb=2000"
> "-Dmapreduce.map.java.opts=-Xmx1800M" -m 1 -r 0 -mt 4000 -rt 0

--
This message was sent by Atlassian JIRA
(v6.2#6252)
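
The corner case raised in point 1 of the comment can be traced with a small standalone sketch. This is not CapacityScheduler code: the class and method names are hypothetical, container sizes are plain integers in GB, and containers are assumed to be visited in the order app1 2 GB, app1 3 GB, app2 3 GB. The smallestCovering helper only illustrates the "more suitable container" observation and is not a proposed fix.

```java
// Hypothetical, simplified model of the check quoted in the comment.
class PreemptionCornerCase {

    // Walks the containers in order and applies
    //   toBePreempt > 0 && container < toBePreempt * 2
    // Returns the demand (GB) that is still unmet afterwards.
    static int remainingAfterCheck(int[] containersGb, int demandGb) {
        int toBePreempt = demandGb;
        for (int c : containersGb) {
            if (toBePreempt > 0 && c < toBePreempt * 2) {
                toBePreempt -= c; // this container is selected for preemption
            }
        }
        return toBePreempt;
    }

    // Illustration only: size of the smallest single container that would
    // cover the whole demand on its own, or -1 if there is none.
    static int smallestCovering(int[] containersGb, int demandGb) {
        int best = -1;
        for (int c : containersGb) {
            if (c >= demandGb && (best == -1 || c < best)) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] containers = {2, 3, 3}; // app1: 2 GB, 3 GB; app2: 3 GB
        System.out.println("unmet demand: "
                + remainingAfterCheck(containers, 3) + " GB");
        System.out.println("single container covering demand: "
                + smallestCovering(containers, 3) + " GB");
    }
}
```

With a demand of 3 GB, the 2 GB container passes the check (2 < 6), the demand drops to 1 GB, and both 3 GB containers are then rejected (3 < 2 is false), so 1 GB stays unmet even though a single 3 GB container would have covered the request.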