[ 
https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MENG DING updated YARN-1449:
----------------------------
    Attachment: YARN-1449.1.patch

Attaching patch for review.

The patch has passed the {{test-patch}} script, and includes the following 
changes:
* Added *ChangeContainersResourceRequest*/*ChangeContainersResourceResponse* 
protocol
* Added *changeContainersResource* method in *ContainerManagementProtocol*
* Updated *ContainerManagerImpl* to implement the container resource change 
actions
* Updated unit tests

The patch does *NOT* yet include the changes to *NodeStatus*. I would like to 
discuss the changes to NodeStatusProto further, especially now that we want to 
update the node heartbeat response so that RM can confirm the final resource 
changes with NM. [~leftnoteasy], do you think it would be a good idea to reopen 
YARN-1644 so that I can initiate the discussion and post patches for the 
NodeStatus changes in that thread? If you think that is not necessary, I will 
continue the discussion in this thread. 

I was able to reuse a lot of the code from the original patch :-). The major 
differences are listed as follows:

* The *ChangeContainersResourceResponse* now returns a map from containerID to 
exception for failed requests, instead of a list of failed containerIDs. This 
is consistent with other APIs.
* In {{ContainerManagerImpl.java}}
** Stricter checking of the resource change request, including checking 
token expiration and the RM identifier.
** Reject resource change requests with both resource increase and decrease 
specified for the same container in the same request.
** Check the validity of the target resource. For a decrease request, the 
target resource must fit in the current resource; otherwise, the request is 
rejected right away.
** Added a {{CHANGE_CONTAINER}} event so that container resource change and 
nodemanager metrics updates will be routed to {{ContainerImpl}}. I believe this 
is more consistent with the current event model (e.g., from 
{{CONTAINER_LAUNCHED}} event to {{START_MONITORING_CONTAINER}}).
** Synchronize the calls to change/stop/getstatus of containers.
* In {{ContainerImpl}}
** The {{Resource}} field must now be updated after each successful resource 
change. It is the baseline against which later resource change requests from 
the AM are checked, so that invalid ones can be rejected.
** The nodemanager metrics need to be updated as well.
** Fire {{CHANGE_MONITORING_CONTAINER}} event in 
{{ContainerResourceChangeTransition}}.
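
To make two of the request checks above concrete (rejecting a container that appears in both the increase and the decrease list, and requiring a decrease target to fit in the current resource), here is a minimal standalone sketch. It is not the code in the patch: {{ResourceChangeSketch}}, {{Res}}, {{fitsIn}} and {{validate}} are hypothetical stand-ins, containers are keyed by plain strings, and rejecting unknown containers is an assumption of the sketch. Token expiration and RM identifier checks are left out.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are not the YARN classes.
final class ResourceChangeSketch {

    /** Simplified resource: memory in MB plus virtual cores. */
    static final class Res {
        final int memoryMb;
        final int vcores;

        Res(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        /** True when this resource is no larger than other in every dimension. */
        boolean fitsIn(Res other) {
            return memoryMb <= other.memoryMb && vcores <= other.vcores;
        }
    }

    /**
     * Validates one request. All maps are keyed by container ID; the values of
     * increases/decreases are the requested target resources.
     * Returns container ID -> exception for every rejected container.
     */
    static Map<String, Exception> validate(Map<String, Res> current,
                                           Map<String, Res> increases,
                                           Map<String, Res> decreases) {
        Map<String, Exception> failed = new LinkedHashMap<>();
        Set<String> requested = new LinkedHashSet<>(increases.keySet());
        requested.addAll(decreases.keySet());

        for (String id : requested) {
            Res now = current.get(id);
            if (now == null) {
                // Assumption of this sketch: unknown containers are rejected.
                failed.put(id, new IllegalArgumentException("unknown container " + id));
            } else if (increases.containsKey(id) && decreases.containsKey(id)) {
                failed.put(id, new IllegalArgumentException(
                        "increase and decrease both requested for " + id));
            } else if (decreases.containsKey(id) && !decreases.get(id).fitsIn(now)) {
                failed.put(id, new IllegalArgumentException(
                        "decrease target does not fit in current resource of " + id));
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, Res> current = new LinkedHashMap<>();
        current.put("c1", new Res(2048, 2));
        current.put("c2", new Res(1024, 1));

        Map<String, Res> inc = new LinkedHashMap<>();
        inc.put("c1", new Res(4096, 2));
        Map<String, Res> dec = new LinkedHashMap<>();
        dec.put("c1", new Res(1024, 1)); // conflicts with the increase above
        dec.put("c2", new Res(2048, 1)); // larger than current: not a valid decrease

        validate(current, inc, dec).forEach(
                (id, e) -> System.out.println(id + ": " + e.getMessage()));
    }
}
```

Returning a map rather than throwing on the first failure mirrors the response shape described above, where each failed containerID is paired with its own exception and the remaining containers in the request can still be processed.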

Thanks a lot.

> Protocol changes in NM side to support change container resource
> ----------------------------------------------------------------
>
>                 Key: YARN-1449
>                 URL: https://issues.apache.org/jira/browse/YARN-1449
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Wangda Tan (No longer used)
>         Attachments: YARN-1449.1.patch, yarn-1449.1.patch, yarn-1449.3.patch, 
> yarn-1449.4.patch, yarn-1449.5.patch
>
>
> As described in YARN-1197, we need to add API/implementation changes,
> 1) Add a "changeContainersResources" method in ContainerManagementProtocol
> 2) Can get succeeded/failed increased/decreased containers in the response of 
> "changeContainersResources"
> 3) Add a "new decreased containers" field in NodeStatus which can help NM 
> notify RM such changes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
