Hey Guys,

We're running YARN 2.2 with the capacity scheduler. Each NM is running with 40G 
of memory capacity. When we request a series of containers with 2G of memory 
each from a single AM, we see the RM assigning them entirely to one NM until 
that NM is full, then moving on to the next, and so on. Essentially, we have a 
grid of 20 nodes where two are completely full and the rest are completely 
empty. This is problematic because our containers use disk heavily and are 
completely saturating the disks on those two nodes, which slows down every 
container running on them.
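
For concreteness, here's a minimal sketch of the kind of AM-side request we're 
making (the loop count and identifiers are illustrative, not our actual code):

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class RequestContainers {
      public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> amrmClient = AMRMClient.createAMRMClient();
        amrmClient.init(new YarnConfiguration());
        amrmClient.start();
        amrmClient.registerApplicationMaster("", 0, "");

        Resource capability = Resource.newInstance(2048, 1); // 2G, 1 vcore
        Priority priority = Priority.newInstance(0);

        // No node/rack constraints, so nothing stops the RM from packing
        // every container onto whichever NM has headroom as it heartbeats.
        for (int i = 0; i < 40; i++) {
          amrmClient.addContainerRequest(
              new ContainerRequest(capability, null, null, priority));
        }

        // ... then poll amrmClient.allocate(progress) for assignments ...
      }
    }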

  1.  Is this expected behavior of the capacity scheduler? What about the FIFO 
scheduler?
  2.  Is the recommended workaround just to increase the memory allocation 
per container as a proxy for the disk capacity that's actually required 
(sketched below)? Given that there's no disk-level isolation and no disk-level 
resource, I don't see another way around this.
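
To make option 2 concrete, the only knob I can see is inflating the memory 
ask; continuing the sketch above (numbers illustrative):

    // Workaround sketch: over-request memory as a stand-in for disk.
    // With 40G NMs, an 8G ask caps each node at ~5 of our containers,
    // even though the process itself only needs ~2G.
    Resource padded = Resource.newInstance(8192, 1);
    amrmClient.addContainerRequest(
        new ContainerRequest(padded, null, null, priority));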

Cheers,
Chris
