Hi Folks,

I'm working on the Hadoop FileSystem validation workstream (https://wiki.apache.org/hadoop/HCFS/Progress) over at Hadoop Common. To do that, we're building a library of Hadoop FileSystem tests that will run against FileSystems configured within Hadoop 2.0. I have YARN working on HDFS and LocalFS; next I'm trying to get YARN running on top of GlusterFS using the GlusterFS Hadoop FileSystem plugin. The plugin works just fine on Hadoop 1.x.
When I start the JobHistoryServer it fails with an UnsupportedFileSystemException (full stack trace below). I did a bit of googling and ran into Karthik over at the QFS community (https://groups.google.com/forum/#!topic/qfs-devel/KF3AAFheNq8) who had the same issue and has also been unsuccessful at getting this working. I've provided my core-site file below. The glusterfs plugin jar is copied into share/hadoop/common/lib/, share/hadoop/mapreduce/lib and share/hadoop/yarn/lib so I don't think this is a classpath issue. Perhaps the exception is a result of misconfiguration somewhere?

-- Core Site --

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>glusterfs://amb-1:9000</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>glusterfs://amb-1:9000</value>
  </property>
  <property>
    <name>fs.glusterfs.server</name>
    <value>amb-1</value>
  </property>
  <property>
    <name>fs.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
  </property>
</configuration>

-- Stack Trace --

STARTUP_MSG: build = git://pico-2-centos-6-3--01.hortonworks.com/home/jenkins/workspace/BIGTOP-BigWheelAplha-2-HDP-RPM-SYNC-REPO/label/centos6-3/build/hadoop/rpm/BUILD/hadoop-2.0.3.22-alpha-src/hadoop-common-project/hadoop-common -r bdb84648f423eb2b7af5cb97c7192193a5a57956; compiled by 'jenkins' on Fri Mar 15 02:03:54 PDT 2013
STARTUP_MSG: java = 1.6.0_43
************************************************************/
2013-06-08 05:46:23,796 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: JobHistory Init
2013-06-08 05:46:24,015 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
2013-06-08 05:46:24,015 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting JobHistoryServer
org.apache.hadoop.yarn.YarnException: Error creating done directory: [null]
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:424)
    at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:87)
    at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
    at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:87)
    at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:145)
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
    at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:146)
    at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:234)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:342)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:339)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
    at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:339)
    at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:453)
    at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:475)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:417)
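One thing I'm planning to try, based on the FileContext calls in the stack trace above: as far as I can tell, the JobHistoryServer goes through the FileContext/AbstractFileSystem API, and that code path resolves the scheme via a separate key of the form fs.AbstractFileSystem.<scheme>.impl rather than fs.glusterfs.impl. If that's the missing piece, core-site.xml would need something along these lines (the GlusterFs class name below is hypothetical; I don't know yet whether the current plugin ships such a class):

  <property>
    <name>fs.AbstractFileSystem.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFs</value>
  </property>

If the plugin doesn't provide an AbstractFileSystem implementation, my understanding is that one can usually be written as a thin wrapper over the existing FileSystem class using org.apache.hadoop.fs.DelegateToFileSystem. A rough, untested sketch (assuming GlusterFileSystem has the usual public no-arg constructor that FileSystem plugins need):

package org.apache.hadoop.fs.glusterfs;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.DelegateToFileSystem;

/*
 * Hypothetical AbstractFileSystem binding for the "glusterfs" scheme.
 * It delegates every call to the existing GlusterFileSystem plugin so
 * that FileContext users (like the JobHistoryServer) can resolve the scheme.
 */
public class GlusterFs extends DelegateToFileSystem {
  public GlusterFs(final URI theUri, final Configuration conf)
      throws IOException, URISyntaxException {
    // The final "true" means the URI must carry an authority (host:port),
    // which matches glusterfs://amb-1:9000 in my core-site above.
    super(theUri, new GlusterFileSystem(), conf, "glusterfs", true);
  }
}

Can anyone confirm whether that's the right direction, or whether this is supposed to work with only fs.glusterfs.impl set?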
----- Original Message -----
From: "Vinod Kumar Vavilapalli" <[email protected]>
To: [email protected]
Sent: Thursday, June 20, 2013 6:32:42 PM
Subject: Re: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem

Please let us know your final results. Interesting to see YARN+MR directly working on local-file-system.

Thanks,
+Vinod

On Jun 20, 2013, at 2:27 PM, Stephen Watt wrote:

> I resolved this. The issue is that I was using relative paths (i.e. "teragen 1000 data/in-dir") as the params for TeraGen and TeraSort. When I changed it to use absolute paths (i.e. "teragen 1000 /data/in-dir"), it works.
>
> ----- Original Message -----
> From: "Stephen Watt" <[email protected]>
> To: [email protected]
> Sent: Thursday, June 20, 2013 12:25:17 PM
> Subject: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>
> Hi Folks
>
> I'm running into FileNotFoundExceptions when using Pseudo Distributed Single Node YARN with the Local FileSystem. I'd greatly appreciate any insights/solutions.
>
> To level set, I'm using RHEL 6.2 and I've successfully set up a single node pseudo-distributed YARN on HDFS 2.0 using the HDP 2.0.2 Alpha Release (tarball extract to /opt). All the processes were started and the jobs submitted as root. I ran some smoke tests with TeraGen and TeraSort and it works great.
>
> The next step was to leave YARN in pseudo-distributed mode, stop HDFS, and change the Hadoop FileSystem from HDFS to the Local FileSystem. I stopped all the daemons, changed the core-site.xml to use the Local FileSystem as demonstrated below, and then restarted the resourcemanager, nodemanager and historyserver. Still running as root, everything started just fine. I ran TeraGen (params: 1000 data/in-dir) and it worked fine. I then ran TeraSort (params: data/in-dir data/out-dir) and the job failed with a FileNotFoundException. I've provided my core-site and mapred-site below.
>
> -- core-site.xml --
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>file:///</value>
>   </property>
> </configuration>
>
> -- mapred-site.xml --
>
> <configuration>
>   <property>
>     <name>mapreduce.framework.name</name>
>     <value>yarn</value>
>   </property>
> </configuration>
>
> -- Stack Trace Exception --
>
> 2013-06-18 23:06:40,876 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000002 to attempt_1371596024885_0003_m_000000_0
> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000003 to attempt_1371596024885_0003_m_000001_0
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=4096
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:2
> 2013-06-18 23:06:40,896 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is file:///tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.jar
> 2013-06-18 23:06:40,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.xml
> 2013-06-18 23:06:40,902 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> org.apache.hadoop.yarn.YarnException: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:723)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:771)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1352)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1310)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1018)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:142)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1116)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1108)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>     at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:697)
>     at org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:144)
>     at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:417)
>     at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:365)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:686)
>     ... 14 more
> 2013-06-18 23:06:40,906 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
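P.S. In case it helps anyone who finds the quoted local-FS thread above: the fix there really was just switching the TeraGen/TeraSort arguments from relative to absolute paths, i.e. something like

  teragen 1000 /data/in-dir              (instead of: teragen 1000 data/in-dir)
  terasort /data/in-dir /data/out-dir    (instead of: terasort data/in-dir data/out-dir)

From the stack trace it looks like the relative out-dir caused _partition.lst to be resolved against the container's working directory under nm-local-dir, which is presumably why the distributed-cache resolution failed.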
