Yesha Vora created YARN-1777:
--------------------------------

             Summary: Nodemanager fails to detect Full disk and try to launch container
                 Key: YARN-1777
                 URL: https://issues.apache.org/jira/browse/YARN-1777
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Yesha Vora

Nodemanager is not able to recognize that the disk is full. It keeps retrying to launch a container on the full disk.

------------------------------------------
2013-06-06 17:45:25,319 INFO container.Container (ContainerImpl.java:handle(852)) - Container container_1370473246485_0136_01_000018 transitioned from LOCALIZING to LOCALIZED
2013-06-06 17:45:25,328 INFO container.Container (ContainerImpl.java:handle(852)) - Container container_1370473246485_0136_01_000019 transitioned from LOCALIZED to RUNNING
2013-06-06 17:45:25,329 WARN launcher.ContainerLaunch (ContainerLaunch.java:call(255)) - Failed to launch container.
java.io.IOException: mkdir of /tmp/1/hdp/yarn/local/usercache/hrt_qa/appcache/application_1370473246485_0136/container_1370473246485_0136_01_000019 failed
        at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1044)
        at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:150)
        at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:187)
        at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:730)
        at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:726)
        at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2379)
        at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:726)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:412)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:130)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:250)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:73)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
2013-06-06 17:45:25,330 INFO container.Container (ContainerImpl.java:handle(852)) - Container container_1370473246485_0136_01_000019 transitioned from RUNNING to EXITED_WITH_FAILURE
2013-06-06 17:45:25,330 INFO launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(307)) - Cleaning up container container_1370473246485_0136_01_000019
2013-06-06 17:45:25,333 WARN launcher.ContainerLaunch (ContainerLaunch.java:call(255)) - Failed to launch container.
java.io.IOException: mkdir of /tmp/1/hdp/yarn/local/usercache/hrt_qa/appcache/application_1370473246485_0136/container_1370473246485_0136_01_000018 failed
        at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1044)
        at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:150)
        at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:187)
        at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:730)
        at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:726)
        at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2379)
        at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:726)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:412)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:130)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:250)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:73)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
------------------------------------------
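
For illustration only: the mkdir failures in the log above occur on a local directory that, per the report, has no space left. A minimal sketch of the kind of free-space check that could filter out such directories before a container launch is attempted might look like the following. It uses only java.io.File; the class name LocalDirSpaceCheck, its methods, and the byte threshold are hypothetical and are not taken from the NodeManager code.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop YARN.
public class LocalDirSpaceCheck {

    // True if dir is an existing directory whose partition reports
    // at least minFreeBytes of space usable by this JVM.
    public static boolean hasUsableSpace(File dir, long minFreeBytes) {
        return dir.isDirectory() && dir.getUsableSpace() >= minFreeBytes;
    }

    // Keeps only the directories that pass the free-space check.
    public static List<File> filterUsable(List<File> dirs, long minFreeBytes) {
        List<File> usable = new ArrayList<File>();
        for (File dir : dirs) {
            if (hasUsableSpace(dir, minFreeBytes)) {
                usable.add(dir);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // Assumed threshold for the example: 1 MiB.
        long minFreeBytes = 1024L * 1024L;
        List<File> dirs = new ArrayList<File>();
        for (String path : args) {
            dirs.add(new File(path));
        }
        for (File dir : filterUsable(dirs, minFreeBytes)) {
            System.out.println("usable: " + dir.getPath());
        }
    }
}
```

Run with one or more local directory paths as arguments; directories that are missing or below the threshold are simply not printed. A real fix would also need to decide how often to re-check and how to report an unhealthy node, which this sketch does not attempt.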