The answer is a combination of what Robert and Bikas mentioned above.
 - Priorities are used to order the scheduling requests.
 - At a given priority, if you have requests of different sizes, it could be looking at the last request.

We can clarify this in docs.

Thanks,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Jan 3, 2013, at 3:15 AM, Tsuyoshi OZAWA wrote:

> Sandy, it also depends on the timing. For instance, in MapReduce's case,
> MRAppMaster requests the containers for each task separately. Could you
> explain the timing when you issue each request?
>
> On Thu, Jan 3, 2013 at 5:52 AM, Robert Evans <[email protected]> wrote:
>
>> Mappers and reducers are requested at different priorities. Reducers have
>> a higher priority. But the AM does not request all of the reducers at
>> once. It waits and will request some at a time until all of the mappers
>> have been satisfied, at which point it then requests the rest of the
>> reducers.
>>
>> --Bobby
>>
>> On 1/2/13 2:47 PM, "Sandy Ryza" <[email protected]> wrote:
>>
>>> Thanks for looking into it Bikas. What you wrote makes sense to me.
>>> You're right that it's the last request, not the largest. Otherwise, you
>>> summarize my confusion well - why doesn't AppSchedulingInfo hold a list
>>> of ResourceRequests for each node/priority?
>>>
>>> I also don't understand why this hasn't caused a problem already for
>>> mapreduce when mappers and reducers request different amounts of memory.
>>> It must be either because reduces are requested after all map containers
>>> are completed? Or because they're requested at non-overlapping locations?
>>>
>>> On Wed, Jan 2, 2013 at 11:04 AM, Bikas Saha <[email protected]> wrote:
>>>
>>>> Reading the code seems to suggest that AppSchedulingInfo is not
>>>> preferring the larger request. It's simply returning the last request
>>>> for that priority and hostname. So it could be that in your case, the
>>>> larger request is the second request. You could try and make it the
>>>> first request and check if you get the same results.
>>>>
>>>> Wrt your ResourceRequest question, having a single Resource capability
>>>> simplifies ResourceRequest operations. Having heterogeneous resources
>>>> is allowed by the API by submitting multiple ResourceRequests having
>>>> different Resource capabilities. See the RMContainerRequestor code in
>>>> the MR YARN app. Given the above, it looks like the Resource
>>>> heterogeneity is lost inside the AppSchedulingInfo, and that may be a
>>>> bug or a conscious decision. Looking to folks experienced in that code
>>>> for an answer. How is everything working despite this? Perhaps because
>>>> the applications are not issuing heterogeneous requests for a given
>>>> priority and location. Secondly, the * catch-all is always around to
>>>> save the day.
>>>>
>>>> Let me know if this makes sense. I may have missed stuff.
>>>>
>>>> -----Original Message-----
>>>> From: Sandy Ryza [mailto:[email protected]]
>>>> Sent: Friday, December 28, 2012 4:46 PM
>>>> To: [email protected]
>>>> Subject: scheduler satisfying heterogeneous resource requests at same
>>>> priority
>>>>
>>>> I am trying to understand how YARN schedulers are able to satisfy
>>>> smaller requests while larger requests are outstanding (per YARN-289).
>>>>
>>>> Consider the following situation:
>>>> An application submits two requests - one for a container with 1024 MB
>>>> and one for a container with 2048 MB. 1024 MB frees up on a node. The
>>>> scheduler should (or might wish to) place the smaller container on the
>>>> node, instead of placing a reservation for the larger one.
>>>>
>>>> However, currently, if I understand correctly, the larger request is
>>>> always serviced first. AppSchedulingInfo, which is used by all the
>>>> schedulers to find a container request when space becomes available,
>>>> stores a map of priorities to maps of node/rack/* to ResourceRequests.
>>>> A ResourceRequest contains a single Resource (capability), and the
>>>> number of containers. Why does a ResourceRequest not allow for
>>>> heterogeneous containers? Is this just not supported yet because it
>>>> hasn't been needed yet? Or is there a more fundamental reason I'm
>>>> missing about why it doesn't make sense?
>>>>
>>>> many thanks for any guidance,
>>>> Sandy
>
> --
> OZAWA Tsuyoshi
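
For anyone reading this thread later, here is a minimal sketch of the behavior discussed above: a map of priorities to maps of node/rack/* to a single request, where a second request at the same priority and location replaces the first. This is not the actual AppSchedulingInfo code. The class `LastRequestWinsSketch`, its nested `Request` type, and the `update`/`lookup` methods are hypothetical names made up for illustration, and the real class may well differ in how it merges or replaces entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the structure described in the thread; NOT YARN code.
public class LastRequestWinsSketch {

    // Stand-in for a ResourceRequest: one capability (memory only) and a count.
    public static final class Request {
        public final int memoryMb;
        public final int numContainers;

        public Request(int memoryMb, int numContainers) {
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    // priority -> (location -> request), where location is a node, a rack, or "*".
    private final Map<Integer, Map<String, Request>> requests = new HashMap<>();

    // Storing a request overwrites whatever was held for that priority and location.
    public void update(int priority, String location, Request request) {
        requests.computeIfAbsent(priority, p -> new HashMap<>()).put(location, request);
    }

    // Returns the single request held for that priority and location, or null.
    public Request lookup(int priority, String location) {
        Map<String, Request> byLocation = requests.get(priority);
        return byLocation == null ? null : byLocation.get(location);
    }

    public static void main(String[] args) {
        LastRequestWinsSketch info = new LastRequestWinsSketch();
        info.update(1, "*", new Request(1024, 1)); // smaller request first
        info.update(1, "*", new Request(2048, 1)); // larger request second, same key
        // Only the second request survives, so the 1024 MB one is no longer visible.
        System.out.println(info.lookup(1, "*").memoryMb); // prints 2048
    }
}
```

In this model, keeping both sizes visible would require either distinct priorities for the two requests or a list of requests per priority and location, which is the alternative Sandy asks about.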
