[
https://issues.apache.org/jira/browse/YARN-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181322#comment-15181322
]
Vinod Kumar Vavilapalli commented on YARN-4762:
-----------------------------------------------
Overall, acknowledging that the new layering is a WIP, it takes a bit of effort
to wrap one's head around the dependencies and component ordering.
Coming to the patch, it looks good overall to me.
One minor comment: it would be good if DelegatingLinuxContainerRuntime did not
need to know about cGroupsHandler at all.
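To illustrate the kind of decoupling meant here, below is a minimal sketch and
not the actual YARN code. All type names (SketchCGroupsHandler, SketchRuntime,
SketchDefaultRuntime, SketchDelegatingRuntime) are hypothetical stand-ins, and
tolerating a missing handler is an assumption of the sketch rather than
something the patch is known to do. The point is only that the delegating
runtime receives ready-made delegates and never touches the cgroups handler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the cgroups handler; not the real YARN interface.
interface SketchCGroupsHandler {
  String controllerPath(String controller);
}

// Hypothetical stand-in for a container runtime.
interface SketchRuntime {
  void initialize();
}

// The concrete runtime owns the (optional) cgroups dependency.
class SketchDefaultRuntime implements SketchRuntime {
  private final SketchCGroupsHandler cGroupsHandler; // may be null
  boolean initialized;

  SketchDefaultRuntime(SketchCGroupsHandler cGroupsHandler) {
    this.cGroupsHandler = cGroupsHandler;
  }

  @Override
  public void initialize() {
    // Assumption of this sketch: a missing handler does not fail init.
    initialized = true;
  }

  boolean usesCGroups() {
    return cGroupsHandler != null;
  }
}

// The delegating runtime only sees SketchRuntime instances; it has no
// reference to, or knowledge of, any cgroups handler.
class SketchDelegatingRuntime implements SketchRuntime {
  private final List<SketchRuntime> delegates;

  SketchDelegatingRuntime(List<SketchRuntime> delegates) {
    this.delegates = new ArrayList<>(delegates);
  }

  @Override
  public void initialize() {
    for (SketchRuntime delegate : delegates) {
      delegate.initialize();
    }
  }
}
```

With this shape, whichever component builds the concrete runtimes decides
whether a cgroups handler is available, and the delegating layer stays
unchanged either way.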
Trivial issues, not related to the patch:
- ResourceHandlerModule
-- getcGroupsCpuResourceHandler() has inconsistent capitalization: cGroups
instead of CGroups, as in the other methods.
-- Some modified lines in the patch overflowed 80 chars. Jenkins is likely
to complain.
- Unused import of ResourceHandlerException in DelegatingLinuxContainerRuntime
> NMs failing on DelegatingLinuxContainerRuntime init with LCE on
> ---------------------------------------------------------------
>
> Key: YARN-4762
> URL: https://issues.apache.org/jira/browse/YARN-4762
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Sidharta Seethana
> Priority: Blocker
> Attachments: YARN-4762.001.patch
>
>
> Seeing this exception and the NMs crash.
> {code}
> 2016-03-03 16:47:57,807 DEBUG org.apache.hadoop.service.AbstractService:
> Service
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService
> is started
> 2016-03-03 16:47:58,027 DEBUG
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
> checkLinuxExecutorSetup:
> [/hadoop/hadoop-yarn-nodemanager/bin/container-executor, --checksetup]
> 2016-03-03 16:47:58,043 ERROR
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
> Mount point Based on mtab file: /proc/mounts. Controller mount point not
> writable for: cpu
> 2016-03-03 16:47:58,043 ERROR
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime:
> Unable to get cgroups handle.
> 2016-03-03 16:47:58,044 DEBUG org.apache.hadoop.service.AbstractService:
> noteFailure org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to
> initialize container executor
> 2016-03-03 16:47:58,044 INFO org.apache.hadoop.service.AbstractService:
> Service NodeManager failed in state INITED; cause:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize
> container executor
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize
> container executor
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:240)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:539)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:587)
> Caused by: java.io.IOException: Failed to initialize linux container
> runtime(s)!
> at
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:207)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
> ... 3 more
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.AbstractService:
> Service: NodeManager entered state STOPPED
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.CompositeService:
> NodeManager: stopping services, size=0
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.AbstractService:
> Service:
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService
> entered state STOPPED
> 2016-03-03 16:47:58,047 FATAL
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting
> NodeManager
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize
> container executor
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:240)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:539)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:587)
> Caused by: java.io.IOException: Failed to initialize linux container
> runtime(s)!
> at
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:207)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
> ... 3 more
> {code}