Hi Folks

I'm working on the Hadoop FileSystem validation workstream 
(https://wiki.apache.org/hadoop/HCFS/Progress) over at Hadoop Common. As part of 
that, we're building a library of Hadoop FileSystem tests that will run against 
FileSystems configured within Hadoop 2.0. I have YARN working on HDFS and 
LocalFS; next I'm trying to get YARN running on top of GlusterFS using the 
GlusterFS Hadoop FileSystem plugin. The plugin works just fine on Hadoop 1.x. 

When I start the JobHistoryServer it fails with an 
UnsupportedFileSystemException (full stack trace below). I did a bit of 
googling and ran into Karthik over at the QFS community 
(https://groups.google.com/forum/#!topic/qfs-devel/KF3AAFheNq8) who had the 
same issue and has also been unsuccessful at getting this working. I've 
provided my core-site file below. The glusterfs plugin jar is copied into 
share/hadoop/common/lib/, share/hadoop/mapreduce/lib/, and share/hadoop/yarn/lib/, 
so I don't think this is a classpath issue. Perhaps the exception is a result 
of misconfiguration somewhere? 

-- Core Site --

<configuration>

 <property>
  <name>fs.defaultFS</name>
  <value>glusterfs://amb-1:9000</value>
 </property>

 <property>
  <name>fs.default.name</name>
  <value>glusterfs://amb-1:9000</value>
 </property>

 <property>
  <name>fs.glusterfs.server</name>
  <value>amb-1</value>
 </property>

 <property>
  <name>fs.glusterfs.impl</name>
  <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
 </property>

</configuration>
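One thing I noticed while digging: the JobHistoryServer goes through the newer 
FileContext API, which looks up schemes via fs.AbstractFileSystem.<scheme>.impl 
rather than fs.<scheme>.impl, so my fs.glusterfs.impl entry above wouldn't be 
consulted for that code path. If the plugin ships an AbstractFileSystem 
implementation, I'm guessing a binding along these lines is what's missing (the 
class name below is my guess, not something I've verified):

```xml
 <!-- Guess: bind the glusterfs scheme for the FileContext API. The value
      must name a class extending org.apache.hadoop.fs.AbstractFileSystem
      (e.g. a DelegateToFileSystem wrapper); the class name here is
      hypothetical. -->
 <property>
  <name>fs.AbstractFileSystem.glusterfs.impl</name>
  <value>org.apache.hadoop.fs.glusterfs.GlusterFs</value>
 </property>
```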


-- Stack Trace -- 

STARTUP_MSG:   build = 
git://pico-2-centos-6-3--01.hortonworks.com/home/jenkins/workspace/BIGTOP-BigWheelAplha-2-HDP-RPM-SYNC-REPO/label/centos6-3/build/hadoop/rpm/BUILD/hadoop-2.0.3.22-alpha-src/hadoop-common-project/hadoop-common
 -r bdb84648f423eb2b7af5cb97c7192193a5a57956; compiled by 'jenkins' on Fri Mar 
15 02:03:54 PDT 2013
STARTUP_MSG:   java = 1.6.0_43
************************************************************/
2013-06-08 05:46:23,796 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: 
JobHistory Init
2013-06-08 05:46:24,015 ERROR org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:root (auth:SIMPLE) 
cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No 
AbstractFileSystem for scheme: glusterfs
2013-06-08 05:46:24,015 FATAL 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting 
JobHistoryServer
org.apache.hadoop.yarn.YarnException: Error creating done directory: [null]
        at 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:424)
        at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:87)
        at 
org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
        at 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:87)
        at 
org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:145)
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No 
AbstractFileSystem for scheme: glusterfs
        at 
org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:146)
        at 
org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:234)
        at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:342)
        at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:339)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
        at 
org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:339)
        at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:453)
        at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:475)
        at 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:417)


----- Original Message -----
From: "Vinod Kumar Vavilapalli" <[email protected]>
To: [email protected]
Sent: Thursday, June 20, 2013 6:32:42 PM
Subject: Re: FileNotFoundExceptions with Pseudo Distributed YARN MR using the 
Local FileSystem


Please let us know your final results. It's interesting to see YARN+MR working 
directly on the local file-system.

Thanks,
+Vinod

On Jun 20, 2013, at 2:27 PM, Stephen Watt wrote:

> I resolved this. The issue is that I was using relative paths (i.e. "teragen 
> 1000 data/in-dir") as the params for TeraGen and TeraSort. When I changed them 
> to absolute paths (i.e. "teragen 1000 /data/in-dir"), it worked.
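The fix comes down to how paths resolve: on the local FileSystem a relative path 
is resolved against each container's current working directory, which differs 
per container, while an absolute path names the same location from anywhere. A 
minimal sketch of that difference (directory names here are illustrative, not 
the actual job paths):

```shell
# A relative path depends on the resolver's current working directory;
# an absolute path does not.
base=$(mktemp -d)
mkdir -p "$base/wd-a" "$base/wd-b"

rel_a=$(cd "$base/wd-a" && readlink -m data/out-dir)  # resolves under wd-a
rel_b=$(cd "$base/wd-b" && readlink -m data/out-dir)  # resolves under wd-b
abs=$(readlink -m /data/out-dir)                      # same from any cwd

echo "$rel_a"   # .../wd-a/data/out-dir
echo "$rel_b"   # .../wd-b/data/out-dir
echo "$abs"     # /data/out-dir
```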
> 
> ----- Original Message -----
> From: "Stephen Watt" <[email protected]>
> To: [email protected]
> Sent: Thursday, June 20, 2013 12:25:17 PM
> Subject: FileNotFoundExceptions with Pseudo Distributed YARN MR using the 
> Local FileSystem
> 
> Hi Folks
> 
> I'm running into FileNotFoundExceptions when using Pseudo Distributed 
> Single Node YARN on the Local FileSystem. I'd greatly appreciate any 
> insights/solutions.
> 
> To level set, I'm using RHEL 6.2 and I've successfully set up a single-node 
> pseudo-distributed YARN on HDFS 2.0 using the HDP 2.0.2 Alpha Release 
> (tarball extract to /opt). All the processes were started and the jobs 
> submitted as root. I ran some smoke tests with TeraGen and TeraSort and they 
> worked great.
> 
> The next step was to leave YARN in pseudo-distributed mode, stop HDFS, and 
> change the Hadoop FileSystem from HDFS to the Local FileSystem. I stopped all 
> the daemons, changed the core-site.xml to use the Local FileSystem as 
> demonstrated below, and then restarted the resourcemanager, nodemanager and 
> historyserver. Still running as root, everything started just fine. I ran 
> TeraGen (params: 1000 data/in-dir) and it worked fine. I then ran TeraSort 
> (params: data/in-dir data/out-dir) and the job failed with a 
> FileNotFoundException. I've provided my core-site and mapred-site below.
> 
> -- core-site.xml --
> 
> <configuration>
> 
> <property>
>   <name>fs.default.name</name>
>    <value>file:///</value>
> </property>
> 
> </configuration>
> 
> -- mapred-site.xml --
> 
> <configuration>
> 
>   <property>
>      <name>mapreduce.framework.name</name>
>      <value>yarn</value>
>   </property>
> 
> </configuration>
> 
> -- Stack Trace Exception -- 
> 
> 2013-06-18 23:06:40,876 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
> container container_1371596024885_0003_01_000002 to 
> attempt_1371596024885_0003_m_000000_0
> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
> container container_1371596024885_0003_01_000003 to 
> attempt_1371596024885_0003_m_000001_0
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating 
> schedule, headroom=4096
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start 
> threshold not met. completedMapsForReduceSlowstart 1
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: 
> PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 
> CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:2
> 2013-06-18 23:06:40,896 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file 
> on the remote FS is 
> file:///tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.jar
> 2013-06-18 23:06:40,901 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf 
> file on the remote FS is 
> /tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.xml
> 2013-06-18 23:06:40,902 FATAL [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> org.apache.hadoop.yarn.YarnException: java.io.FileNotFoundException: File 
> file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst
>  does not exist
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:723)
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:771)
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1352)
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1310)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1018)
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:142)
>       at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1116)
>       at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1108)
>       at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>       at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>       at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException: File 
> file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst
>  does not exist
>       at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>       at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:697)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:144)
>       at 
> org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:417)
>       at 
> org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:365)
>       at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:686)
>       ... 14 more
> 2013-06-18 23:06:40,906 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
