[
https://issues.apache.org/jira/browse/YARN-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309129#comment-16309129
]
Jiandan Yang commented on YARN-7693:
-------------------------------------
[[email protected]] Thanks for your attention. This JIRA does not
conflict with YARN-7064. I filed it because ContainersMonitorImpl currently
has some problems:
1. Online services may crash due to high system resource utilization.
ContainersMonitorImpl only checks the pmem and vmem of each container and does
not check overall system utilization. This can hurt online services when
offline tasks and online services run on YARN at the same time. For example,
no container may exceed its own memory limit, yet the system's total memory
utilization can still reach 100% because of oversubscription. If the RM's
decision to kill containers is not timely enough, the online service is
affected.
2. Directly killing Opportunistic containers is too drastic. Dynamically
adjusting Opportunistic container resources may be a better choice.
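To make problem 1 concrete, here is a minimal Java sketch of how a per-container check can pass while the node as a whole is nearly out of memory. The class and method names and all numbers are hypothetical and are not taken from ContainersMonitorImpl.

```java
import java.util.List;

// Hypothetical illustration only; not NodeManager code.
final class OversubscriptionExample {

    /** Memory used by one container and its configured limit, in MB. */
    static final class Usage {
        final long usedMb;
        final long limitMb;

        Usage(long usedMb, long limitMb) {
            this.usedMb = usedMb;
            this.limitMb = limitMb;
        }
    }

    /** The per-container check: true if no container exceeds its own limit. */
    static boolean allWithinLimit(List<Usage> containers) {
        for (Usage c : containers) {
            if (c.usedMb > c.limitMb) {
                return false;
            }
        }
        return true;
    }

    /** Node-wide utilization: container usage plus other usage, over physical memory. */
    static double nodeUtilization(List<Usage> containers, long otherUsedMb, long nodeTotalMb) {
        long used = otherUsedMb;
        for (Usage c : containers) {
            used += c.usedMb;
        }
        return (double) used / nodeTotalMb;
    }
}
```

With four containers each using 3600 MB of a 4096 MB limit, plus 1900 MB used by an online service on a 16384 MB node, allWithinLimit returns true while nodeUtilization is above 99%.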
So I propose to:
1) Separate containers into two groups, Opportunistic_Group and
Guaranteed_Group, under *hadoop-yarn*
2) Monitor system resource utilization and dynamically adjust the resources of
Opportunistic_Group
3) Kill a container only when adjusting resources has failed a given number of
times
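As a rough sketch of steps 2) and 3), the decision logic could look like the Java below. Everything here is an assumption made for illustration: the class name, the watermark threshold, and the rule that an adjustment counts as failed when the next utilization sample is still above the watermark. The attached patches may do this differently.

```java
// Hypothetical policy sketch; thresholds and names are assumptions.
final class OpportunisticGroupPolicy {

    enum Action { NONE, SHRINK_GROUP, KILL_CONTAINER }

    private final double highWatermark;  // node utilization (0..1) above which we react
    private final int maxFailedAdjusts;  // failed shrinks tolerated before killing
    private int attemptedShrinks = 0;    // shrinks issued since utilization was last healthy

    OpportunisticGroupPolicy(double highWatermark, int maxFailedAdjusts) {
        this.highWatermark = highWatermark;
        this.maxFailedAdjusts = maxFailedAdjusts;
    }

    /** Decide what to do for one utilization sample of the whole node. */
    Action onSample(double nodeUtilization) {
        if (nodeUtilization <= highWatermark) {
            // Healthy again: any earlier shrink worked, so forget past attempts.
            attemptedShrinks = 0;
            return Action.NONE;
        }
        if (attemptedShrinks >= maxFailedAdjusts) {
            // Still overloaded after the allowed number of shrinks: fall back to killing.
            attemptedShrinks = 0;
            return Action.KILL_CONTAINER;
        }
        // Overloaded: first try to reduce the Opportunistic group's resources.
        attemptedShrinks++;
        return Action.SHRINK_GROUP;
    }
}
```

With a watermark of 0.9 and two allowed failures, three consecutive samples at 0.95 produce SHRINK_GROUP, SHRINK_GROUP, KILL_CONTAINER, so a container is killed only after adjustment has repeatedly failed to relieve the node.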
> ContainersMonitor support configurable
> --------------------------------------
>
> Key: YARN-7693
> URL: https://issues.apache.org/jira/browse/YARN-7693
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Reporter: Jiandan Yang
> Assignee: Jiandan Yang
> Priority: Minor
> Attachments: YARN-7693.001.patch, YARN-7693.002.patch
>
>
> Currently ContainersMonitor has only one default implementation,
> ContainersMonitorImpl.
> After the introduction of Opportunistic containers, ContainersMonitor needs
> to monitor system metrics and even dynamically adjust Opportunistic and
> Guaranteed resources in the cgroup, so another ContainersMonitor may need to
> be implemented.
> The current ContainerManagerImpl directly instantiates ContainersMonitorImpl
> with new, so ContainersMonitor needs to be configurable.
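One way to read "configurable" here is that the monitor class is looked up from configuration instead of being hard-coded. The sketch below shows that pattern in plain Java using java.util.Properties and reflection. The Monitor interface, both implementations, and the key name are hypothetical stand-ins, not the YARN types or a real YARN property. In Hadoop itself the same idea is usually expressed with Configuration.getClass and ReflectionUtils.newInstance.

```java
import java.util.Properties;

// Hypothetical stand-in for the ContainersMonitor interface.
interface Monitor {
    String name();
}

// Hypothetical default implementation, used when nothing is configured.
final class DefaultMonitor implements Monitor {
    public String name() {
        return "default";
    }
}

// Hypothetical alternative that would also watch node-level utilization.
final class UtilizationAwareMonitor implements Monitor {
    public String name() {
        return "utilization-aware";
    }
}

final class MonitorFactory {
    // Assumed key name for illustration; not a real YARN property.
    static final String MONITOR_CLASS_KEY = "example.containers-monitor.class";

    /** Instantiate the configured Monitor, falling back to DefaultMonitor. */
    static Monitor create(Properties conf) {
        String className = conf.getProperty(MONITOR_CLASS_KEY, DefaultMonitor.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(Monitor.class)   // reject classes that are not Monitors
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create monitor " + className, e);
        }
    }
}
```

With this shape, the caller asks the factory for a Monitor and no longer needs to know which implementation it gets, so a new monitor can be swapped in by changing one configuration value.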