[ https://issues.apache.org/jira/browse/YARN-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149745#comment-16149745 ]
Junping Du commented on YARN-5731:
----------------------------------
Sounds like we forgot to commit this to branch-2.8.2. Just committed it.
> Preemption calculation is not accurate when reserved containers are present
> in queue.
> -------------------------------------------------------------------------------------
>
> Key: YARN-5731
> URL: https://issues.apache.org/jira/browse/YARN-5731
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 2.8.0
> Reporter: Sunil G
> Assignee: Wangda Tan
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.2
>
> Attachments: YARN-5731.001.patch, YARN-5731.002.patch,
> YARN-5731.addendum.003.patch, YARN-5731.addendum.004.patch,
> YARN-5731.branch-2.002.patch, YARN-5731-branch-2.8.001.patch,
> YARN-5731-branch-2.8.001.patch, YARN-5731.branch-2.8.004.patch
>
>
> The YARN Capacity Scheduler does not trigger preemption in the following scenario.
> Two queues, A and B, each have 50% capacity, 100% maximum capacity, and a user
> limit factor of 2. The minimum container size is 1536MB and the total cluster
> resource is 40GB. Submit the first job, which needs 1536MB for the AM and 9 task
> containers of 4.5GB each, to queue A. The job gets 8 containers in total (AM 1536MB
> + 7 * 4.5GB = 33GB), the cluster usage is 93.8%, and the job has reserved a
> container of 4.5GB.
> When the next job (1536MB for the AM and 9 task containers of 4.5GB each) is
> submitted to queue B, it hangs in the ACCEPTED state forever and the RM
> scheduler never triggers preemption. (RM UI Image 2 attached)
> Test Case:
> ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client
> --queue A --executor-memory 4G --executor-cores 4 --num-executors 9
> ../lib/spark-examples*.jar 1000000
> After about a minute, submit the second job:
> ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client
> --queue B --executor-memory 4G --executor-cores 4 --num-executors 9
> ../lib/spark-examples*.jar 1000000
> Credit to: [~Prabhu Joseph] for bug investigation and troubleshooting.
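
The figures in the quoted description can be checked with a little arithmetic. The sketch below is not taken from the patch; the variable names are illustrative, and reading the 93.8% usage as "allocated plus reserved" is an inference from the numbers, not something the issue states.

```python
# Figures from the quoted description; names are hypothetical.
CLUSTER_MB = 40 * 1024          # total cluster resource: 40GB
AM_MB = 1536                    # application master container
TASK_MB = int(4.5 * 1024)       # each task container: 4.5GB = 4608MB

allocated_mb = AM_MB + 7 * TASK_MB   # AM + 7 task containers
reserved_mb = TASK_MB                # one reserved 4.5GB container
guaranteed_a_mb = CLUSTER_MB // 2    # queue A is configured at 50% capacity


def pct(part_mb, total_mb=CLUSTER_MB):
    """Return part_mb as a percentage of total_mb."""
    return 100.0 * part_mb / total_mb


print(f"allocated: {allocated_mb} MB ({pct(allocated_mb):.2f}%)")
print(f"allocated + reserved: {allocated_mb + reserved_mb} MB "
      f"({pct(allocated_mb + reserved_mb):.2f}%)")
print(f"queue A over its guarantee by: {allocated_mb - guaranteed_a_mb} MB")
```

Under these assumptions, the 33GB of allocated containers alone comes to 82.5% of the cluster, and adding the 4.5GB reservation gives 93.75%, which would round to the 93.8% quoted in the description.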