[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-1197:
-----------------------------

    Attachment: yarn-server-resourcemanager.patch.ver.1
                yarn-server-nodemanager.patch.ver.1
                yarn-server-common.patch.ver.1
                yarn-pb-impl.patch.ver.1
                yarn-api-protocol.patch.ver.1
                tools-project.patch.ver.1
                mapreduce-project.patch.ver.1

I have just finished a first implementation of container resource increase 
support. It includes the PB/API changes, increase support in the capacity 
scheduler, and NM support for changing the monitored size of a running 
container.

*I split it into several patches for easier review:*
* API/pb file changes in hadoop-yarn-api
* PB implementations in hadoop-yarn-common
* yarn-server-common changes 
* yarn-server-resourcemanager changes, including capacity scheduler and AMS 
master changes, etc.
* yarn-server-nodemanager changes, including ContainerManagerImpl and 
ContainersMonitor changes
* changes to other related projects to match the updated APIs 
(map-reduce/tools)
The above are preview patches and still very rough. [~bikassaha], [~sandyr], 
[~tucu00], [~vinodkv], could you please review them? I'm eager to hear your 
ideas!

*Some short notes on the current RM/NM implementation that are not covered in 
the design doc:*
*1) Implementation in capacity scheduler for increasing a container size*
It is very close to allocating a new container. Some details:
* An increase request is valid only when the requested size is larger than the 
container's existing resource and the container state is either RUNNING or 
ACQUIRED
* The entry point of increase request allocation is still in 
CapacityScheduler:nodeUpdate()
* When an increase request cannot be allocated, it is reserved, just as a new 
container request would be. Each node can reserve at most one request (either 
an increase request or a new container request). I added a new method 
isReserved() to FiCaSchedulerNode so that the scheduler/queue can tell whether 
a node is reserved
* The major logic for increase request allocation is also placed in 
LeafQueue:assignContainers; increase requests are processed before new 
container requests
* Queue (leaf/parent) capacity and user capacity are also checked before an 
increase request is reserved or allocated
* Queue (leaf/parent) used resource is also deducted when an increase request 
is reserved
* Users may submit increase requests several times for the same container with 
different sizes.
** If the requested size is equal to the previously requested size, the 
request is ignored
** If the requested size is smaller than or equal to the existing size, the 
pending increase request on this container is cancelled
** If the requested size differs from the previously requested size and is 
greater than the existing size, it replaces the previous ask and cancels the 
previous reservation (if one exists).
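
To make the rules above easier to review, here is a minimal sketch of how 
repeated increase requests on one container could be resolved. It is not code 
from the attached patches: the class IncreaseRequestSketch, its enums and the 
handle() method are hypothetical names, and a resource is reduced to a single 
memory value in MB (an assumption made only to keep the example short).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the increase-request rules; not taken from the patches. */
final class IncreaseRequestSketch {
  /** Simplified container states (assumption: only these matter here). */
  enum State { NEW, ACQUIRED, RUNNING, COMPLETED }

  /** What happened to an incoming increase request. */
  enum Outcome { REJECTED, IGNORED, CANCELLED, ACCEPTED, REPLACED }

  // Pending increase asks: container id -> requested memory in MB.
  private final Map<String, Integer> pendingAsks = new HashMap<>();

  Outcome handle(String containerId, State state, int existingMb, int askedMb) {
    // Only RUNNING or ACQUIRED containers may be increased.
    if (state != State.RUNNING && state != State.ACQUIRED) {
      return Outcome.REJECTED;
    }
    Integer previous = pendingAsks.get(containerId);
    // Same size as the previous ask: nothing to do.
    if (previous != null && previous == askedMb) {
      return Outcome.IGNORED;
    }
    // Not larger than the existing size: cancel a pending ask if there is one,
    // otherwise the request is simply invalid.
    if (askedMb <= existingMb) {
      if (previous != null) {
        pendingAsks.remove(containerId);
        return Outcome.CANCELLED;
      }
      return Outcome.REJECTED;
    }
    // Larger than the existing size: record it, replacing any previous ask.
    // A real scheduler would also drop the previous reservation at this point.
    pendingAsks.put(containerId, askedMb);
    return previous == null ? Outcome.ACCEPTED : Outcome.REPLACED;
  }
}
```

In this sketch, asking 2048 MB for a RUNNING 1024 MB container is accepted, 
asking 2048 MB again is ignored, asking 3072 MB replaces the pending ask, and 
asking 1024 MB cancels it.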

*2) Implementation in node manager for increasing a container size*
* It performs checks similar to those done when starting a container (token 
verification, etc.)
* The increase logic is valid only when the ContainerState (the internal 
ContainerState: 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerState)
 is RUNNING, to avoid a race condition.
* ContainersMonitorImpl puts change requests into containersToBeChanged when 
it receives a CHANGE_MONITORING_CONTAINER event. They are then processed in 
MonitoringThread:run()
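
The hand-off between the event handler and the monitoring thread can be 
pictured roughly as below. This is only a sketch under assumptions, not the 
code in the nodemanager patch: the class MonitorChangeSketch and its methods 
are hypothetical, limits are plain memory values in MB, and runOnce() stands 
in for a single pass of MonitoringThread:run(). Only the name 
containersToBeChanged comes from the notes above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of queuing monitor-size changes; not taken from the patches. */
final class MonitorChangeSketch {
  // Filled by the event handler: container id -> new memory limit in MB.
  private final Map<String, Long> containersToBeChanged = new ConcurrentHashMap<>();
  // Limits currently enforced by the monitoring loop.
  private final Map<String, Long> trackedLimits = new ConcurrentHashMap<>();

  /** Stands in for the existing "start monitoring this container" path. */
  void startMonitoring(String containerId, long limitMb) {
    trackedLimits.put(containerId, limitMb);
  }

  /** Called when a CHANGE_MONITORING_CONTAINER event arrives; only queues the change. */
  void onChangeMonitoringContainer(String containerId, long newLimitMb) {
    containersToBeChanged.put(containerId, newLimitMb);
  }

  /** One pass of the monitoring loop: apply queued changes to tracked containers. */
  void runOnce() {
    for (String containerId : containersToBeChanged.keySet()) {
      Long newLimit = containersToBeChanged.remove(containerId);
      // Changes for containers that are no longer tracked are dropped.
      if (newLimit != null && trackedLimits.containsKey(containerId)) {
        trackedLimits.put(containerId, newLimit);
      }
    }
  }

  /** Returns the enforced limit, or null if the container is not tracked. */
  Long limitOf(String containerId) {
    return trackedLimits.get(containerId);
  }
}
```

Queuing the change and applying it inside the loop keeps all updates to the 
enforced limits on the monitoring side, so a new limit only takes effect on 
the next monitoring pass.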

> Support changing resources of an allocated container
> ----------------------------------------------------
>
>                 Key: YARN-1197
>                 URL: https://issues.apache.org/jira/browse/YARN-1197
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: api, nodemanager, resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: mapreduce-project.patch.ver.1, 
> tools-project.patch.ver.1, yarn-1197-v2.pdf, yarn-1197-v3.pdf, 
> yarn-1197-v4.pdf, yarn-1197.pdf, yarn-api-protocol.patch.ver.1, 
> yarn-pb-impl.patch.ver.1, yarn-server-common.patch.ver.1, 
> yarn-server-nodemanager.patch.ver.1, yarn-server-resourcemanager.patch.ver.1
>
>
> Currently, YARN does not support merging several containers on one node into 
> one big container, which would let us incrementally ask for resources, merge 
> them into a bigger one, and launch our processes. The user scenario is 
> described in the comments.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
