YARN uses the FileContext APIs in its code, which requires your FS implementation to also provide one (a class inheriting AbstractFileSystem), registered for your scheme.
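The usual fix has two parts. First, register an AbstractFileSystem implementation for the scheme in core-site.xml; Hadoop resolves it through the fs.AbstractFileSystem.<scheme>.impl key:

  <property>
    <name>fs.AbstractFileSystem.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFs</value>
  </property>

Second, the plugin has to ship that class. Here is a minimal sketch of what it could look like, modeled on Hadoop's own RawLocalFs/Hdfs shims; the name GlusterFs, its package, and the assumption that GlusterFileSystem has a no-arg constructor are illustrative, not taken from the actual plugin:

  // Hypothetical shim exposing the existing FileSystem plugin through
  // the AbstractFileSystem/FileContext API.
  package org.apache.hadoop.fs.glusterfs;

  import java.io.IOException;
  import java.net.URI;
  import java.net.URISyntaxException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.DelegateToFileSystem;

  public class GlusterFs extends DelegateToFileSystem {
    // AbstractFileSystem subclasses are instantiated reflectively via a
    // (URI, Configuration) constructor, so this exact signature matters.
    public GlusterFs(URI theUri, Configuration conf)
        throws IOException, URISyntaxException {
      // Delegate every FileContext call to the FileSystem implementation
      // that already works; "glusterfs" is the supported scheme, and the
      // final flag controls whether a URI authority (host:port) is
      // mandatory -- RawLocalFs passes false here.
      super(theUri, new GlusterFileSystem(), conf, "glusterfs", false);
    }
  }

With that class on the classpath and the property above set, FileContext should be able to resolve glusterfs:// URIs, and the JobHistoryServer should get past "No AbstractFileSystem for scheme: glusterfs".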
On Fri, Jun 21, 2013 at 6:38 AM, Stephen Watt <[email protected]> wrote:

> Hi Folks
>
> I'm working on the Hadoop FileSystem validation workstream
> (https://wiki.apache.org/hadoop/HCFS/Progress) over at Hadoop Common. To do
> that we're building a library of Hadoop FileSystem tests that will run
> against FileSystems configured within Hadoop 2.0. I have YARN working on
> HDFS and LocalFS; next I'm trying to get YARN running on top of GlusterFS
> using the GlusterFS Hadoop FileSystem plugin. The plugin works just fine on
> Hadoop 1.x.
>
> When I start the JobHistoryServer it fails with an
> UnsupportedFileSystemException (full stack trace below). I did a bit of
> googling and ran into Karthik over at the QFS community
> (https://groups.google.com/forum/#!topic/qfs-devel/KF3AAFheNq8) who had the
> same issue and has also been unsuccessful at getting this working. I've
> provided my core-site file below. The glusterfs plugin jar is copied into
> share/hadoop/common/lib/, share/hadoop/mapreduce/lib and
> share/hadoop/yarn/lib, so I don't think this is a classpath issue. Perhaps
> the exception is a result of misconfiguration somewhere?
>
> -- Core Site --
>
> <configuration>
>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>glusterfs://amb-1:9000</value>
>   </property>
>
>   <property>
>     <name>fs.default.name</name>
>     <value>glusterfs://amb-1:9000</value>
>   </property>
>
>   <property>
>     <name>fs.glusterfs.server</name>
>     <value>amb-1</value>
>   </property>
>
>   <property>
>     <name>fs.glusterfs.impl</name>
>     <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
>   </property>
>
> </configuration>
>
> -- Stack Trace --
>
> STARTUP_MSG: build = git://pico-2-centos-6-3--01.hortonworks.com/home/jenkins/workspace/BIGTOP-BigWheelAplha-2-HDP-RPM-SYNC-REPO/label/centos6-3/build/hadoop/rpm/BUILD/hadoop-2.0.3.22-alpha-src/hadoop-common-project/hadoop-common -r bdb84648f423eb2b7af5cb97c7192193a5a57956; compiled by 'jenkins' on Fri Mar 15 02:03:54 PDT 2013
> STARTUP_MSG: java = 1.6.0_43
> ************************************************************/
> 2013-06-08 05:46:23,796 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: JobHistory Init
> 2013-06-08 05:46:24,015 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
> 2013-06-08 05:46:24,015 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting JobHistoryServer
> org.apache.hadoop.yarn.YarnException: Error creating done directory: [null]
>     at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:424)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:87)
>     at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:87)
>     at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:145)
> Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
>     at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:146)
>     at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:234)
>     at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:342)
>     at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:339)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>     at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:339)
>     at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:453)
>     at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:475)
>     at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:417)
>
> ----- Original Message -----
> From: "Vinod Kumar Vavilapalli" <[email protected]>
> To: [email protected]
> Sent: Thursday, June 20, 2013 6:32:42 PM
> Subject: Re: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>
> Please let us know your final results. Interesting to see YARN+MR directly
> working on local-file-system.
>
> Thanks,
> +Vinod
>
> On Jun 20, 2013, at 2:27 PM, Stephen Watt wrote:
>
>> I resolved this. The issue is that I was using relative paths (i.e.
>> "teragen 1000 data/in-dir") as the params for TeraGen and TeraSort. When I
>> changed it to use absolute paths (i.e. "teragen 1000 /data/in-dir"), it
>> works.
>>
>> ----- Original Message -----
>> From: "Stephen Watt" <[email protected]>
>> To: [email protected]
>> Sent: Thursday, June 20, 2013 12:25:17 PM
>> Subject: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>>
>> Hi Folks
>>
>> I'm running into FileNotFoundExceptions when using Pseudo Distributed
>> Single Node YARN with the Local FileSystem. I'd greatly appreciate any
>> insights/solutions.
>>
>> To level set, I'm using RHEL 6.2 and I've successfully set up a single
>> node pseudo-distributed YARN on HDFS 2.0 using the HDP 2.0.2 Alpha Release
>> (tarball extracted to /opt). All the processes were started and the jobs
>> submitted as root. I ran some smoke tests with TeraGen and TeraSort and it
>> works great.
>>
>> The next step was to leave YARN in pseudo-distributed mode, stop HDFS, and
>> change the Hadoop FileSystem from HDFS to the Local FileSystem. I stopped
>> all the daemons, changed the core-site.xml to use the Local FileSystem as
>> demonstrated below, and then restarted the resourcemanager, nodemanager
>> and historyserver. Still running as root, everything started just fine. I
>> ran TeraGen (params: 1000 data/in-dir) and it worked fine. I then ran
>> TeraSort (params: data/in-dir data/out-dir) and the job failed with a
>> FileNotFoundException. I've provided my core-site and mapred-site below.
>>
>> -- core-site.xml --
>>
>> <configuration>
>>
>>   <property>
>>     <name>fs.default.name</name>
>>     <value>file:///</value>
>>   </property>
>>
>> </configuration>
>>
>> -- mapred-site.xml --
>>
>> <configuration>
>>
>>   <property>
>>     <name>mapreduce.framework.name</name>
>>     <value>yarn</value>
>>   </property>
>>
>> </configuration>
>>
>> -- Stack Trace Exception --
>>
>> 2013-06-18 23:06:40,876 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
>> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000002 to attempt_1371596024885_0003_m_000000_0
>> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000003 to attempt_1371596024885_0003_m_000001_0
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=4096
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:2
>> 2013-06-18 23:06:40,896 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is file:///tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.jar
>> 2013-06-18 23:06:40,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.xml
>> 2013-06-18 23:06:40,902 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
>> org.apache.hadoop.yarn.YarnException: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:723)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:771)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1352)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1310)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1018)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:142)
>>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1116)
>>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1108)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>>     at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:697)
>>     at org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:144)
>>     at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:417)
>>     at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:365)
>>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:686)
>>     ... 14 more
>> 2013-06-18 23:06:40,906 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..

--
Harsh J
