[ 
https://issues.apache.org/jira/browse/YARN-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179194#comment-15179194
 ] 

Sidharta Seethana commented on YARN-4762:
-----------------------------------------

/cc [~vvasudev]

When the new resource handler mechanism was introduced, a CGroupsHandlerImpl 
instance was only created/initialized if at least one of the resource handlers 
was enabled. Initialization does one of the following:

#  If mounting of cgroups is enabled, nothing is mounted at initialization, 
because mounting is done on demand for individual resource handlers.
#  If mounting of cgroups is disabled, ‘initializeControllerPathsFromMtab’ is 
called, which checks that each of the cgroup mounts is writable.
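
To illustrate the two branches above, here is a minimal sketch. It is not the 
actual CGroupsHandlerImpl code: the class name, method name, and the 
"check-writable:" action strings are hypothetical and only model the 
behavior described.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the initialization branches; not Hadoop source code.
final class CGroupsInitSketch {

    /** Returns the actions that initialization would perform, in order. */
    static List<String> initActions(boolean mountEnabled, List<String> controllers) {
        List<String> actions = new ArrayList<>();
        if (mountEnabled) {
            // Branch 1: nothing is mounted here; each resource handler
            // mounts its own controller on demand later.
            return actions;
        }
        // Branch 2: mounting is disabled, so controller paths are taken from
        // the mtab and every controller mount is checked for writability.
        for (String controller : controllers) {
            actions.add("check-writable:" + controller);
        }
        return actions;
    }

    public static void main(String[] args) {
        List<String> controllers = Arrays.asList("cpu", "blkio");
        System.out.println(initActions(true, controllers));
        System.out.println(initActions(false, controllers));
    }
}
```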

(2) was correct behavior because the cgroups handler wasn’t created unless at 
least one of the (cgroups-based) resource handlers was in use. However, with 
YARN-4553, a CGroupsHandler is always created, even if there are no 
cgroups-based handlers in use. This (incorrectly) leads to an attempt to check 
whether the cgroups mount paths are writable, and NM initialization fails when 
they are not. 
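
One possible shape for a guard is sketched below. This is only an assumption 
about how the check could be gated, not the actual patch: the class, the 
method, and the handler names in the map are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical guard; not Hadoop source code and not the committed fix.
final class CGroupsGuardSketch {

    /**
     * The writability check is only meaningful when mounting is disabled
     * AND at least one cgroups-based resource handler is actually in use.
     */
    static boolean needsWritableCheck(boolean mountEnabled,
                                      Map<String, Boolean> handlersEnabled) {
        boolean anyCgroupsHandler = handlersEnabled.containsValue(Boolean.TRUE);
        return anyCgroupsHandler && !mountEnabled;
    }

    public static void main(String[] args) {
        Map<String, Boolean> handlers = new LinkedHashMap<>();
        handlers.put("cpu", false);   // illustrative handler names
        handlers.put("blkio", false);
        // No cgroups-based handler in use: the check should be skipped.
        System.out.println(needsWritableCheck(false, handlers));
    }
}
```

With a guard like this, an NM that uses no cgroups-based handlers would never 
look at the cgroup mount paths, so a non-writable mount could not fail startup.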

I'll take a look at fixing this.

> NMs failing on DelegatingLinuxContainerRuntime init with LCE on
> ---------------------------------------------------------------
>
>                 Key: YARN-4762
>                 URL: https://issues.apache.org/jira/browse/YARN-4762
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>
> Seeing this exception and the NMs crash.
> {code}
> 2016-03-03 16:47:57,807 DEBUG org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService 
> is started
> 2016-03-03 16:47:58,027 DEBUG 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: 
> checkLinuxExecutorSetup: 
> [/hadoop/hadoop-yarn-nodemanager/bin/container-executor, --checksetup]
> 2016-03-03 16:47:58,043 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
>  Mount point Based on mtab file: /proc/mounts. Controller mount point not 
> writable for: cpu
> 2016-03-03 16:47:58,043 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime:
>  Unable to get cgroups handle.
> 2016-03-03 16:47:58,044 DEBUG org.apache.hadoop.service.AbstractService: 
> noteFailure org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to 
> initialize container executor
> 2016-03-03 16:47:58,044 INFO org.apache.hadoop.service.AbstractService: 
> Service NodeManager failed in state INITED; cause: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize 
> container executor
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize 
> container executor
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:240)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:539)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:587)
> Caused by: java.io.IOException: Failed to initialize linux container 
> runtime(s)!
>         at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:207)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
>         ... 3 more
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.AbstractService: 
> Service: NodeManager entered state STOPPED
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.CompositeService: 
> NodeManager: stopping services, size=0
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.AbstractService: 
> Service: 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService 
> entered state STOPPED
> 2016-03-03 16:47:58,047 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize 
> container executor
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:240)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:539)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:587)
> Caused by: java.io.IOException: Failed to initialize linux container 
> runtime(s)!
>         at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:207)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
>         ... 3 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
