[ 
https://issues.apache.org/jira/browse/YARN-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945988#comment-14945988
 ] 

MENG DING commented on YARN-4175:
---------------------------------

Update on my test results.

I tested this feature on a 4-node cluster using the modified distributed 
shell app. The only critical issue I found is an NPE in the ResourceManager 
when there is not enough headroom; it has been logged as YARN-4230. The only 
other issue I can think of is minor: some log messages could be improved, 
and I will file a separate (low-priority) issue for that.

The tests I performed so far include:
* Verify container resource increase/decrease when there are resources 
available, and no limits are exceeded. Verify container sizes are reported 
correctly on the Web UI.
* Verify container resource increase reservation when the host doesn't have 
enough resources for the additional allocation. Verify resource reservation 
information on the Web UI (Memory Reserved, Lasts Reservation, etc.).
* Verify that while an increase reservation is in place on a host, regular and 
increase allocation requests from other applications are skipped on this 
host.
* Verify that an increase reservation will be fulfilled when enough resource is 
freed up on the host.
* Verify that while an increase reservation is in place for a container, a 
decrease request to the same container (with target resource <= original 
resource) will cancel the reservation.
* Verify that a pending resource increase request will not be processed when 
there is no headroom left (after applying the patch from YARN-4230).
* Verify that an invalid resource increase/decrease request throws an 
exception in AMRMClient and that the distributed shell application master's 
onError callback handler is called.
* Verify that resource monitoring is changed on NM after container 
increase/decrease is completed.
* Verify that killing and restarting the NM will recover increased/decreased 
containers if NM work-preserving restart is enabled.
* All tests were run with both DefaultResourceCalculator and 
DominantResourceCalculator.
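
As an illustration only, the reservation behaviour exercised by the tests 
above can be modelled with a small Python sketch. This is a toy model, not 
YARN code: the Host class, its method names, and the memory-only accounting 
are all assumptions made for the example, and the real scheduler also 
accounts for queues, headroom, and vcores.

```python
# Toy model only: Host and its methods are hypothetical names, not YARN APIs.
class Host:
    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.containers = {}     # container id -> allocated memory in MB
        self.reservation = None  # (container id, target MB), or None

    def free_mb(self):
        return self.capacity_mb - sum(self.containers.values())

    def allocate(self, cid, mb):
        """Regular allocation; skipped while an increase reservation is held."""
        if self.reservation is not None or mb > self.free_mb():
            return False
        self.containers[cid] = mb
        return True

    def request_increase(self, cid, target_mb):
        """Grow a container now if it fits, otherwise reserve the increase."""
        if target_mb <= self.containers[cid]:
            raise ValueError("increase target must exceed current size")
        if self.reservation is not None and self.reservation[0] != cid:
            return "skipped"  # another container's reservation blocks this host
        if target_mb - self.containers[cid] <= self.free_mb():
            self.containers[cid] = target_mb
            self.reservation = None
            return "allocated"
        self.reservation = (cid, target_mb)
        return "reserved"

    def request_decrease(self, cid, target_mb):
        """Shrink a container; this cancels its own pending reservation."""
        if target_mb > self.containers[cid]:
            raise ValueError("decrease target must not exceed current size")
        self.containers[cid] = target_mb
        if self.reservation is not None and self.reservation[0] == cid:
            self.reservation = None
        self._try_fulfil()

    def release(self, cid):
        """Remove a container, then retry any reservation with the freed space."""
        del self.containers[cid]
        if self.reservation is not None and self.reservation[0] == cid:
            self.reservation = None
        self._try_fulfil()

    def _try_fulfil(self):
        """Apply the pending increase if the host now has room for it."""
        if self.reservation is None:
            return
        cid, target_mb = self.reservation
        if target_mb - self.containers[cid] <= self.free_mb():
            self.containers[cid] = target_mb
            self.reservation = None


host = Host(8192)
host.allocate("c1", 4096)
host.allocate("c2", 3072)
print(host.request_increase("c1", 6144))  # too little free memory, so reserved
print(host.allocate("c3", 512))           # skipped while the reservation is held
host.release("c2")                        # frees memory; reservation is retried
print(host.containers["c1"], host.reservation)
```

The sketch keeps at most one reservation per host, which is enough to show 
the reserve, skip, fulfil, and cancel cases from the list; it makes no claim 
about how the ResourceManager tracks reservations internally.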

Let me know if you have any comments or suggestions.

> Example of use YARN-1197
> ------------------------
>
>                 Key: YARN-4175
>                 URL: https://issues.apache.org/jira/browse/YARN-4175
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, nodemanager, resourcemanager
>            Reporter: Wangda Tan
>            Assignee: MENG DING
>         Attachments: YARN-4175.1.patch
>
>
> Like YARN-2609, we need an example program to demonstrate how to use 
> YARN-1197 end to end.



