YARN uses the FileContext APIs in its code, which requires your FS implementation to also provide one (a class inheriting AbstractFileSystem), registered for your scheme.
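The usual fix has two parts. First, register an AbstractFileSystem implementation for the scheme in core-site.xml; Hadoop resolves it through the fs.AbstractFileSystem.<scheme>.impl key:

  <property>
    <name>fs.AbstractFileSystem.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFs</value>
  </property>

Second, the plugin has to ship that class. Here is a minimal sketch of what it could look like, modeled on Hadoop's own RawLocalFs/Hdfs shims; the name GlusterFs, its package, and the assumption that GlusterFileSystem has a no-arg constructor are illustrative, not taken from the actual plugin:

  // Hypothetical shim exposing the existing FileSystem plugin through
  // the AbstractFileSystem/FileContext API.
  package org.apache.hadoop.fs.glusterfs;

  import java.io.IOException;
  import java.net.URI;
  import java.net.URISyntaxException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.DelegateToFileSystem;

  public class GlusterFs extends DelegateToFileSystem {
    // AbstractFileSystem subclasses are instantiated reflectively via a
    // (URI, Configuration) constructor, so this exact signature matters.
    public GlusterFs(URI theUri, Configuration conf)
        throws IOException, URISyntaxException {
      // Delegate every FileContext call to the FileSystem implementation
      // that already works; "glusterfs" is the supported scheme, and the
      // final flag controls whether a URI authority (host:port) is
      // mandatory -- RawLocalFs passes false here.
      super(theUri, new GlusterFileSystem(), conf, "glusterfs", false);
    }
  }

With that class on the classpath and the property above set, FileContext should be able to resolve glusterfs:// URIs, and the JobHistoryServer should get past "No AbstractFileSystem for scheme: glusterfs".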
On Fri, Jun 21, 2013 at 6:38 AM, Stephen Watt <[email protected]> wrote:

> Hi Folks
>
> I'm working on the Hadoop FileSystem validation workstream
> (https://wiki.apache.org/hadoop/HCFS/Progress) over at Hadoop Common. To do
> that we're building a library of Hadoop FileSystem tests that will run
> against FileSystems configured within Hadoop 2.0. I have YARN working on
> HDFS and LocalFS; next I'm trying to get YARN running on top of GlusterFS
> using the GlusterFS Hadoop FileSystem plugin. The plugin works just fine on
> Hadoop 1.x.
>
> When I start the JobHistoryServer it fails with an
> UnsupportedFileSystemException (full stack trace below). I did a bit of
> googling and ran into Karthik over at the QFS community
> (https://groups.google.com/forum/#!topic/qfs-devel/KF3AAFheNq8) who had the
> same issue and has also been unsuccessful at getting this working. I've
> provided my core-site file below. The glusterfs plugin jar is copied into
> share/hadoop/common/lib/, share/hadoop/mapreduce/lib and
> share/hadoop/yarn/lib, so I don't think this is a classpath issue. Perhaps
> the exception is a result of misconfiguration somewhere?
>
> -- Core Site --
>
> <configuration>
>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>glusterfs://amb-1:9000</value>
>   </property>
>
>   <property>
>     <name>fs.default.name</name>
>     <value>glusterfs://amb-1:9000</value>
>   </property>
>
>   <property>
>     <name>fs.glusterfs.server</name>
>     <value>amb-1</value>
>   </property>
>
>   <property>
>     <name>fs.glusterfs.impl</name>
>     <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
>   </property>
>
> </configuration>
>
> -- Stack Trace --
>
> STARTUP_MSG: build = git://pico-2-centos-6-3--01.hortonworks.com/home/jenkins/workspace/BIGTOP-BigWheelAplha-2-HDP-RPM-SYNC-REPO/label/centos6-3/build/hadoop/rpm/BUILD/hadoop-2.0.3.22-alpha-src/hadoop-common-project/hadoop-common -r bdb84648f423eb2b7af5cb97c7192193a5a57956; compiled by 'jenkins' on Fri Mar 15 02:03:54 PDT 2013
> STARTUP_MSG: java = 1.6.0_43
> ************************************************************/
> 2013-06-08 05:46:23,796 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: JobHistory Init
> 2013-06-08 05:46:24,015 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
> 2013-06-08 05:46:24,015 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting JobHistoryServer
> org.apache.hadoop.yarn.YarnException: Error creating done directory: [null]
>     at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:424)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:87)
>     at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:87)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:145)
> Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
>     at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:146)
>     at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:234)
>     at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:342)
>     at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:339)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>     at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:339)
>     at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:453)
>     at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:475)
>     at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:417)
>
> ----- Original Message -----
> From: "Vinod Kumar Vavilapalli" <[email protected]>
> To: [email protected]
> Sent: Thursday, June 20, 2013 6:32:42 PM
> Subject: Re: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>
> Please let us know your final results. Interesting to see YARN+MR directly
> working on local-file-system.
>
> Thanks,
> +Vinod
>
> On Jun 20, 2013, at 2:27 PM, Stephen Watt wrote:
>
>> I resolved this. The issue is that I was using relative paths (i.e.
>> "teragen 1000 data/in-dir") as the params for TeraGen and TeraSort. When I
>> changed it to use absolute paths (i.e. "teragen 1000 /data/in-dir"), it
>> works.
>>
>> ----- Original Message -----
>> From: "Stephen Watt" <[email protected]>
>> To: [email protected]
>> Sent: Thursday, June 20, 2013 12:25:17 PM
>> Subject: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>>
>> Hi Folks
>>
>> I'm running into FileNotFoundExceptions when using Pseudo Distributed
>> Single Node YARN with the Local FileSystem. I'd greatly appreciate any
>> insights/solutions.
>>
>> To level set, I'm using RHEL 6.2 and I've successfully set up a single
>> node pseudo-distributed YARN on HDFS 2.0 using the HDP 2.0.2 Alpha Release
>> (tarball extracted to /opt). All the processes were started and the jobs
>> submitted as root. I ran some smoke tests with TeraGen and TeraSort and it
>> works great.
>>
>> The next step was to leave YARN in pseudo-distributed mode, stop HDFS, and
>> change the Hadoop FileSystem from HDFS to the Local FileSystem. I stopped
>> all the daemons, changed the core-site.xml to use the Local FileSystem as
>> demonstrated below, and then restarted the resourcemanager, nodemanager
>> and historyserver. Still running as root, everything started just fine. I
>> ran TeraGen (params: 1000 data/in-dir) and it worked fine. I then ran
>> TeraSort (params: data/in-dir data/out-dir) and the job failed with a
>> FileNotFoundException. I've provided my core-site and mapred-site below.
>>
>> -- core-site.xml --
>>
>> <configuration>
>>
>>   <property>
>>     <name>fs.default.name</name>
>>     <value>file:///</value>
>>   </property>
>>
>> </configuration>
>>
>> -- mapred-site.xml --
>>
>> <configuration>
>>
>>   <property>
>>     <name>mapreduce.framework.name</name>
>>     <value>yarn</value>
>>   </property>
>>
>> </configuration>
>>
>> -- Stack Trace Exception --
>>
>> 2013-06-18 23:06:40,876 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
>> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000002 to attempt_1371596024885_0003_m_000000_0
>> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000003 to attempt_1371596024885_0003_m_000001_0
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=4096
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:2
>> 2013-06-18 23:06:40,896 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is file:///tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.jar
>> 2013-06-18 23:06:40,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.xml
>> 2013-06-18 23:06:40,902 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
>> org.apache.hadoop.yarn.YarnException: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:723)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:771)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1352)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1310)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1018)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:142)
>>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1116)
>>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1108)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>>     at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:697)
>>     at org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:144)
>>     at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:417)
>>     at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:365)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:686)
>>     ... 14 more
>> 2013-06-18 23:06:40,906 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..

--
Harsh J
