[ https://issues.apache.org/jira/browse/YARN-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493401#comment-14493401 ]
Junping Du commented on YARN-3443:
----------------------------------

+1. Latest patch LGTM. Double-checked all javadoc warnings under the
hadoop-yarn-server-nodemanager component; none of them is related to this
patch. Will commit it shortly.

> Create a 'ResourceHandler' subsystem to ease addition of support for new
> resource types on the NM
> -------------------------------------------------------------------------
>
>                 Key: YARN-3443
>                 URL: https://issues.apache.org/jira/browse/YARN-3443
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Sidharta Seethana
>            Assignee: Sidharta Seethana
>         Attachments: YARN-3443.001.patch, YARN-3443.002.patch,
> YARN-3443.003.patch, YARN-3443.004.patch, YARN-3443.005.patch
>
>
> Today, support for CPU and memory as resources (on Linux) is implemented in
> a way that cannot be easily extended to other new resource types (e.g.
> network/disk). For example, some cgroups functionality is implemented in
> LCE (mountCgroups) and the rest in CgroupsLCEResourcesHandler.
> CPU-specific functionality is also implemented in
> CgroupsLCEResourcesHandler, so using this handler automatically enables CPU
> as a resource. Some cgroups functionality requires elevated/super-user
> privileges and needs to be implemented via the container-executor binary.
> Implementing support for a new resource type on Linux using the existing
> classes/mechanisms would be messy (for example, we might have to
> significantly modify/bloat CgroupsLCEResourcesHandler). As an alternative,
> we have implemented a new ‘ResourceHandler’ mechanism that makes things
> cleaner and enables easier addition of new resource types.
> When adding support for a new resource type in the NM (from an
> isolation/enforcement perspective), there are three different pieces
> required:
> 1) Generic cgroups utilities that can be re-used across multiple resource
> handlers (e.g. for CPU, Network, Disk). For example, for net_cls we want to
> be able to create new cgroups, update cgroup params, read cgroup params,
> etc.
> 2) A mechanism to execute ‘PrivilegedOperation’s whose functionality
> requires super-user privileges and is implemented by the container-executor
> binary.
> 3) An implementation that is specific to a resource type (i.e. network and
> disk would each have an implementation that provides isolation/enforcement
> for that resource type).
> Corresponding to the three pieces listed above, the patch for YARN-3443
> provides the following:
> 1) cgroups functionality that can be used across different resource types.
> CGroupsHandler.java specifies the interface and the implementation is in
> CGroupsHandlerImpl.java. New cgroups controller types can be easily added
> to CGroupsHandler.java as and when necessary.
> 2) PrivilegedOperation.java and PrivilegedOperationExecutor.java wrap the
> container-executor binary and provide a way of executing operations that
> require elevated privileges. There are also utility functions that help
> ‘batching’ of certain kinds of operations in order to avoid multiple
> invocations of the container-executor binary.
> 3) ResourceHandler.java specifies an interface that custom resource
> handlers are expected to implement. This interface provides hooks for
> various operations during a container lifecycle: bootstrap, preStart,
> postComplete, reAcquire, teardown. Each of these hooks returns a list of
> privileged operations, so that the resulting set of privileged operations
> can be batched for performance reasons, if necessary.
> ResourceHandlerChain.java provides a simple chaining mechanism across
> multiple resource handlers.
> This is useful when multiple resource handlers are in place. They can be
> chained in sequence, e.g. cpu, network, disk. A resource handler chain
> would hook directly into LCE at various points in the container life
> cycle.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
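
For readers skimming the thread, the hook-and-chain design described in the quoted issue can be sketched roughly as below. This is a simplified illustration only, not the code from the YARN-3443 patch: the names `ResourceHandlerSketch`, `Handler`, `NamedHandler`, `Chain` and `PrivilegedOp` are hypothetical, container IDs are plain strings, and the signatures are assumptions made to keep the example self-contained. Only the five hook names (bootstrap, preStart, postComplete, reAcquire, teardown) and the idea that each hook returns a list of privileged operations come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified sketch; NOT the interfaces from the actual patch.
class ResourceHandlerSketch {

  /** Stand-in for a privileged operation that container-executor would run. */
  static final class PrivilegedOp {
    final String description;
    PrivilegedOp(String description) { this.description = description; }
  }

  /** Lifecycle hooks; each returns operations that a caller could batch. */
  interface Handler {
    List<PrivilegedOp> bootstrap();
    List<PrivilegedOp> preStart(String containerId);
    List<PrivilegedOp> reAcquire(String containerId);
    List<PrivilegedOp> postComplete(String containerId);
    List<PrivilegedOp> teardown();
  }

  /** Toy handler for one resource type; emits one labelled op for some hooks. */
  static final class NamedHandler implements Handler {
    private final String resource;
    NamedHandler(String resource) { this.resource = resource; }

    private List<PrivilegedOp> op(String what) {
      return List.of(new PrivilegedOp(resource + ": " + what));
    }
    public List<PrivilegedOp> bootstrap() { return op("bootstrap"); }
    public List<PrivilegedOp> preStart(String id) { return op("preStart " + id); }
    public List<PrivilegedOp> reAcquire(String id) { return List.of(); }
    public List<PrivilegedOp> postComplete(String id) { return op("postComplete " + id); }
    public List<PrivilegedOp> teardown() { return List.of(); }
  }

  /** Calls each handler in order and concatenates the returned operations. */
  static final class Chain implements Handler {
    private final List<Handler> handlers;
    Chain(List<Handler> handlers) { this.handlers = List.copyOf(handlers); }

    private List<PrivilegedOp> collect(Function<Handler, List<PrivilegedOp>> hook) {
      List<PrivilegedOp> ops = new ArrayList<>();
      for (Handler h : handlers) {
        ops.addAll(hook.apply(h));
      }
      return ops;
    }
    public List<PrivilegedOp> bootstrap() { return collect(Handler::bootstrap); }
    public List<PrivilegedOp> preStart(String id) { return collect(h -> h.preStart(id)); }
    public List<PrivilegedOp> reAcquire(String id) { return collect(h -> h.reAcquire(id)); }
    public List<PrivilegedOp> postComplete(String id) { return collect(h -> h.postComplete(id)); }
    public List<PrivilegedOp> teardown() { return collect(Handler::teardown); }
  }

  public static void main(String[] args) {
    // Chain handlers in sequence: cpu, network, disk.
    Handler chain = new Chain(List.of(
        new NamedHandler("cpu"), new NamedHandler("network"), new NamedHandler("disk")));
    // The combined list is what a caller could hand over in a single batch.
    for (PrivilegedOp op : chain.preStart("container_01")) {
      System.out.println(op.description);
    }
  }
}
```

Because the chain itself implements the same interface as a single handler, the caller (LCE in the description above) would only need to know about one object, and the order of the returned operations follows the order in which the handlers were chained.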