[ https://issues.apache.org/jira/browse/YARN-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17906028#comment-17906028 ]
Syed Shameerur Rahman commented on YARN-11728:
----------------------------------------------

[~zuston] - Shouldn't https://issues.apache.org/jira/browse/YARN-10259 solve this issue?

> Scheduling hang when multiple nodes placement is enabled
> --------------------------------------------------------
>
>                 Key: YARN-11728
>                 URL: https://issues.apache.org/jira/browse/YARN-11728
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, multi-node-placement
>            Reporter: Junfan Zhang
>            Priority: Major
>         Attachments: screenshot-2.png
>
>
> When trying out multi-node placement with a customized multi-node lookup policy, I found a problem that hangs scheduling once a container is reserved on one node, even though other candidate nodes have enough resources. Let me describe how to reproduce it.
> h2. Preconditions
> 1. Use the capacity scheduler with asynchronous scheduling enabled
> 2. Start a Hadoop YARN cluster with at least 2 NodeManagers
> h2. How to reproduce
> 1. First, enable the default node lookup policy {{ResourceUsageMultiNodeLookupPolicy}} with the following options in capacity-scheduler.xml:
> {code:xml}
> <property>
>   <name>yarn.scheduler.capacity.multi-node-placement-enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.multi-node-sorting.policy.names</name>
>   <value>default</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.multi-node-sorting.policy</name>
>   <value>default</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.multi-node-sorting.policy.default.class</name>
>   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.ResourceUsageMultiNodeLookupPolicy</value>
> </property>
> {code}
> 2. Use Spark to submit an app whose container requests exceed a single NodeManager's total vcores. For example, if the 2 NodeManagers each have 96 vcores and the Spark app requests 100 executor instances with 1 vcore each, the allocation hangs at the 97th container, and the RM log shows entries like this:
> !screenshot-2.png!
> At this point, if you submit another app to the cluster, its AM will not be allocated any resources.
> h2. Why
> After digging into YARN's async scheduling logic, I found something strange about multi-node placement. Simply put, the scheduling hang is caused by a single reserved container.
> When multi-node placement is enabled, a container selected by the configured policy is not matched against a single candidate NodeManager but against multiple nodes. The order of those nodes is determined by the configured lookup policy, by default {{ResourceUsageMultiNodeLookupPolicy}}. The policy is managed by {{MultiNodeSortingManager}}, which uses it to re-sort all healthy nodes in the cluster at a 1-second interval.
> 1. Suppose that in the first second the node sequence is (node1, node2), and the 97th container (the 1st container is the AM) is reserved on node1.
> 2. On the next pass, the async scheduling thread finds this reserved container and tries to re-reserve/re-start it. Unfortunately, no existing container will be released.
> 3. After 1 second, the sorting policy takes effect and re-sorts the node sequence to (node2, node1). Intuitively, if node1 is full with no free resources, the reserved container could be picked up by another node (like node2). But YARN does not allow this, and so the hang happens.
> h2. How to fix this
> 1. When there are multiple candidate nodes, look through all of them until one has enough resources to start the container, instead of reserving (a rough sketch of this idea is included below)
> 2. Allow other nodes to pick up the reserved container (also sketched below)
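> As a rough illustration of idea 1, here is a small, self-contained Java sketch. It is not the actual CapacityScheduler code: {{CandidateNode}}, {{ContainerAsk}}, {{Decision}} and {{schedule()}} are made-up placeholder names. The point is only that the scheduler walks the whole candidate list produced by the multi-node policy and falls back to a reservation only when no node currently fits the request.
> {code:java}
> import java.util.List;
> import java.util.Optional;
>
> public class AllocateBeforeReserveSketch {
>
>   /** Minimal stand-in for a scheduler node's free capacity (placeholder, not a YARN class). */
>   interface CandidateNode {
>     long unallocatedMemoryMb();
>     int unallocatedVcores();
>     String nodeId();
>   }
>
>   /** Minimal stand-in for one pending container request. */
>   record ContainerAsk(long memoryMb, int vcores) {}
>
>   /** Outcome of one scheduling attempt: the chosen node and whether it is only a reservation. */
>   record Decision(String nodeId, boolean reserved) {}
>
>   /**
>    * Walk every candidate (already ordered by the multi-node lookup policy) and
>    * allocate on the first one with enough headroom; reserve on the head of the
>    * list only when no candidate fits right now.
>    */
>   static Optional<Decision> schedule(List<? extends CandidateNode> sortedCandidates,
>                                      ContainerAsk ask) {
>     for (CandidateNode node : sortedCandidates) {
>       boolean fits = node.unallocatedMemoryMb() >= ask.memoryMb()
>           && node.unallocatedVcores() >= ask.vcores();
>       if (fits) {
>         return Optional.of(new Decision(node.nodeId(), false)); // start immediately on this node
>       }
>     }
>     return sortedCandidates.isEmpty()
>         ? Optional.empty()                                       // nothing to try in this round
>         : Optional.of(new Decision(sortedCandidates.get(0).nodeId(), true)); // reserve and retry later
>   }
> }
> {code}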
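> And a similarly hypothetical sketch for idea 2 (again, every type and method name here is a placeholder rather than a real YARN API): when the node holding the reservation is still full on a later scheduling pass, the reservation is handed to another candidate that now has headroom instead of being re-reserved on the same node indefinitely.
> {code:java}
> import java.util.List;
> import java.util.Optional;
>
> public class RelocateReservationSketch {
>
>   /** Minimal stand-in for a scheduler node (placeholder, not a YARN class). */
>   interface CandidateNode {
>     boolean hasHeadroomFor(long memoryMb, int vcores);
>     String nodeId();
>   }
>
>   /** A reservation currently pinned to one node for a pending container. */
>   record Reservation(String reservedNodeId, long memoryMb, int vcores) {}
>
>   /**
>    * If the reserved node still cannot start the container, look through the
>    * freshly re-sorted candidates and return the first other node that can run
>    * it now; the caller would then unreserve on the old node and allocate
>    * (or re-reserve) on the returned one.
>    */
>   static Optional<String> tryRelocate(Reservation reservation,
>                                       List<? extends CandidateNode> sortedCandidates) {
>     for (CandidateNode node : sortedCandidates) {
>       if (node.nodeId().equals(reservation.reservedNodeId())) {
>         continue; // skip the node the container is already stuck on
>       }
>       if (node.hasHeadroomFor(reservation.memoryMb(), reservation.vcores())) {
>         return Optional.of(node.nodeId()); // move the reservation here
>       }
>     }
>     return Optional.empty(); // no better option yet; keep the existing reservation
>   }
> }
> {code}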
> If you want to know more about this, I wrote a dedicated blog post (in Chinese) describing the problem: https://zuston.vercel.app/publish/hadoop-yarn/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org