MENG DING commented on YARN-4175:
Update on my testing results.
Based on my tests of this feature on a 4-node cluster using the modified
distributed shell app, the only critical issue I found is an NPE in the
ResourceManager when there is not enough headroom. That issue has been logged as
YARN-4230. The only other, minor, issue I can think of is that some logging
information could be improved, for which I will log a separate (low priority)
ticket.
The tests I performed so far include:
* Verify container resource increase/decrease when there are resources
available and no limits are exceeded. Verify container sizes are reported
correctly on the Web UI.
* Verify container resource increase reservation when the host doesn't have
enough resources for the additional allocation. Verify resource reservation
information on the Web UI (Memory Reserved, Lasts Reservation, etc.).
* Verify that while an increase reservation is in place on a host, regular and
increase allocation requests from other applications will be skipped on this
host.
* Verify that an increase reservation will be fulfilled when enough resource is
freed up on the host.
* Verify that while an increase reservation is in place for a container, a
decrease request to the same container (with target resource <= original
resource) will cancel the reservation.
* Verify that a pending resource increase request will not be processed when
there is no headroom left (after applying the patch from YARN-4230).
* Verify that an invalid resource increase/decrease request will throw an
exception in AMRMClient, and that the distributed shell application master's
onError callback handler will be called.
* Verify that resource monitoring is updated on the NM after a container
increase/decrease is completed.
* Verify that killing and restarting the NM will recover increased/decreased
containers if NM work-preserving restart is enabled.
* All tests are verified using both DefaultResourceCalculator and
Let me know if you have any comments or suggestions.
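For anyone following the reservation cases in the list above, here is a toy
Java model of the behavior those tests check: an increase is allocated when the
host has room, reserved when it does not, left pending when it exceeds headroom,
and a reservation is either fulfilled once resources are freed or cancelled by a
decrease request. This is only a sketch under simplifying assumptions (a single
host, a single container, memory only). It is not the YARN scheduler code, and
every class, method, and field name in it is hypothetical.

```java
// Toy model only. This is NOT the YARN scheduler or any Hadoop API;
// every class, method, and field name below is hypothetical.
class IncreaseReservationSketch {

    enum Outcome { ALLOCATED, RESERVED, SKIPPED_NO_HEADROOM, REJECTED_INVALID }

    /** One host with one resizable container; memory is tracked in MB. */
    static final class Host {
        final int capacityMb;
        int usedMb;           // memory used by all containers on the host
        int containerMb;      // current size of the container being resized
        int reservedTargetMb; // 0 means no increase reservation is pending

        Host(int capacityMb, int otherUsedMb, int containerMb) {
            this.capacityMb = capacityMb;
            this.usedMb = otherUsedMb + containerMb;
            this.containerMb = containerMb;
        }

        int freeMb() { return capacityMb - usedMb; }
    }

    /** Handles an increase request given the application's remaining headroom. */
    static Outcome requestIncrease(Host host, int targetMb, int headroomMb) {
        int delta = targetMb - host.containerMb;
        if (delta <= 0) return Outcome.REJECTED_INVALID;            // not an increase
        if (delta > headroomMb) return Outcome.SKIPPED_NO_HEADROOM; // stays pending
        if (delta > host.freeMb()) {
            host.reservedTargetMb = targetMb;  // host is full: reserve instead
            return Outcome.RESERVED;
        }
        grow(host, targetMb);
        return Outcome.ALLOCATED;
    }

    /** A decrease with target <= current size also cancels any reservation. */
    static void requestDecrease(Host host, int targetMb) {
        if (targetMb <= 0 || targetMb > host.containerMb) {
            throw new IllegalArgumentException("invalid decrease target: " + targetMb);
        }
        host.usedMb -= host.containerMb - targetMb;
        host.containerMb = targetMb;
        host.reservedTargetMb = 0;
    }

    /** Frees memory held by other containers; returns true if a reservation was fulfilled. */
    static boolean releaseOther(Host host, int freedMb) {
        host.usedMb -= freedMb;
        boolean pending = host.reservedTargetMb > 0;
        if (pending && host.reservedTargetMb - host.containerMb <= host.freeMb()) {
            grow(host, host.reservedTargetMb);
            return true;
        }
        return false;
    }

    private static void grow(Host host, int targetMb) {
        host.usedMb += targetMb - host.containerMb;
        host.containerMb = targetMb;
        host.reservedTargetMb = 0;
    }

    public static void main(String[] args) {
        Host host = new Host(8192, 6144, 1024);                // 1024 MB free
        System.out.println(requestIncrease(host, 3072, 4096)); // RESERVED
        System.out.println(releaseOther(host, 1024));          // true
        System.out.println(host.containerMb);                  // 3072
    }
}
```

The model keeps the headroom check ahead of the host capacity check, so a
request that exceeds headroom never creates a reservation. That ordering is my
reading of the test cases, not a statement about the actual scheduler internals.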
> Example of use YARN-1197
> Key: YARN-4175
> URL: https://issues.apache.org/jira/browse/YARN-4175
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, nodemanager, resourcemanager
> Reporter: Wangda Tan
> Assignee: MENG DING
> Attachments: YARN-4175.1.patch
> Like YARN-2609, we need an example program to demonstrate how to use YARN-1197
> from end-to-end.