As far as I know, I have never seen the RM assign containers entirely to one NM
until that NM is full... I might be wrong...
But I usually configure the following three properties, which will restrict the
number of containers on one NM.
Hope the following configuration helps you solve your disk saturation; the
formula below was arrived at from some tests.
# of containers = min (2*CORES, 1.8*DISKS, (Total available RAM) /
MIN_CONTAINER_SIZE)
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>40960</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>8192</value>
</property>
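For example (hypothetical hardware numbers, not taken from your cluster): with
16 cores, 8 data disks, 40 GB of RAM reserved for YARN and a 2 GB minimum
container size, the formula gives min(2*16, 1.8*8, 40960/2048) =
min(32, 14.4, 20) = 14 containers per NM, i.e. roughly 40960/14 ≈ 2.9 GB per
container.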
Thanks & Regards
Brahma Reddy Battula
________________________________________
From: Sandy Ryza [[email protected]]
Sent: Saturday, March 22, 2014 12:18 PM
To: [email protected]
Subject: Re: Capacity scheduler puts all containers on one box
For a work-around, if you turn on DRF / multi-resource scheduling, you
could use vcore capacities to limit the number of containers per node?
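For instance (just a sketch, with example vcore numbers), in
capacity-scheduler.xml:
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
and in yarn-site.xml on each NM:
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>10</value>
</property>
If each container request asks for 1 vcore (the default), at most 10 containers
would then fit on a node.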
On Fri, Mar 21, 2014 at 11:35 PM, Chris Riccomini
<[email protected]> wrote:
> Hey Guys,
>
> @Vinod: We aren't overriding the default, so we must be using -1 as the
> setting.
>
> @Sandy: We aren't specifying any racks/hosts when sending the resource
> requests. +1 regarding introducing a similar limit in capacity scheduler.
>
> Any recommended work-arounds in the mean time? Our utilization of the grid
> is very low because we're having to force high memory requests for the
> containers in order to guarantee a maximum number of containers on a
> single node (e.g. setting the container memory to 17GB to disallow more
> than 2 containers from being assigned to any one 48GB node).
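> (With yarn.nodemanager.resource.memory-mb at 40G, a 17G request means at
> most floor(40/17) = 2 containers fit on a node, at the cost of wasting the
> rest of the node's memory.)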
>
> Cheers,
> Chris
>
> On 3/21/14 11:30 PM, "Sandy Ryza" <[email protected]> wrote:
>
> >yarn.scheduler.capacity.node-locality-delay will help if the app is
> >requesting containers at particular locations, but won't help spread
> >things out evenly otherwise.
> >
> >The Fair Scheduler attempts an even spread. By default, it only schedules
> >a single container each time it considers a node. Decoupling scheduling
> >from node heartbeats (YARN-1010) makes it so that a high node heartbeat
> >interval doesn't result in this being slow. Now that the Capacity
> >Scheduler has similar capabilities (YARN-1512), it might make sense to
> >introduce a similar limit?
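> >
> >(For reference, the Fair Scheduler knob behind the one-container-per-visit
> >behavior is yarn.scheduler.fair.assignmultiple, e.g. in yarn-site.xml:
> ><property>
> >  <name>yarn.scheduler.fair.assignmultiple</name>
> >  <value>false</value>
> ></property>
> >false is the default; yarn.scheduler.fair.max.assign caps how many
> >containers are assigned per heartbeat when it is turned on.)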
> >
> >-Sandy
> >
> >
> >On Fri, Mar 21, 2014 at 4:42 PM, Vinod Kumar Vavilapalli
> ><[email protected]> wrote:
> >
> >> What's the value for yarn.scheduler.capacity.node-locality-delay? It is
> >> -1 by default in 2.2.
> >>
> >> We fixed the default to be a reasonable 40 (nodes in a rack) in 2.3.0 that
> >> should spread containers a bit.
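> >>
> >> E.g. in capacity-scheduler.xml (a sketch, using the new 2.3.0 default):
> >> <property>
> >>   <name>yarn.scheduler.capacity.node-locality-delay</name>
> >>   <value>40</value>
> >> </property>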
> >>
> >> Thanks,
> >> +Vinod
> >>
> >> On Mar 21, 2014, at 12:48 PM, Chris Riccomini <[email protected]>
> >> wrote:
> >>
> >> > Hey Guys,
> >> >
> >> > We're running YARN 2.2 with the capacity scheduler. Each NM is running
> >> > with 40G of memory capacity. When we request a series of containers
> >> > with 2G of memory from a single AM, we see the RM assigning them
> >> > entirely to one NM until that NM is full, and then moving on to the
> >> > next, and so on. Essentially, we have a grid with 20 nodes, and two are
> >> > completely full, and the rest are completely empty. This is problematic
> >> > because our containers use disk heavily, and are completely saturating
> >> > the disks on the two nodes, which slows all of the containers down on
> >> > these NMs.
> >> >
> >> > 1. Is this expected behavior of the capacity scheduler? What about the
> >> > FIFO scheduler?
> >> > 2. Is the recommended work-around just to increase memory allocation
> >> > per-container as a proxy for the disk capacity that's required? Given
> >> > that there's no disk-level isolation, and no disk-level resource, I
> >> > don't see another way around this.
> >> >
> >> > Cheers,
> >> > Chris
> >>
> >>
> >> --
> >> CONFIDENTIALITY NOTICE
> >> NOTICE: This message is intended for the use of the individual or
> >>entity to
> >> which it is addressed and may contain information that is confidential,
> >> privileged and exempt from disclosure under applicable law. If the
> >>reader
> >> of this message is not the intended recipient, you are hereby notified
> >>that
> >> any printing, copying, dissemination, distribution, disclosure or
> >> forwarding of this communication is strictly prohibited. If you have
> >> received this communication in error, please contact the sender
> >>immediately
> >> and delete it from your system. Thank You.
> >>
>
>