Hi,

A MapReduce TeraSort job on S3 fails with a PathExistsException on the output path.
Is this a known issue?

Thanks,
Prabhu Joseph


[hrt_qa@hostname root]$ yarn jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-3.1.1.7.0.0.0-115.jar terasort s3a://bucket/INPUT s3a://bucket/OUTPUT

WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of
YARN_OPTS.

19/06/07 14:13:11 INFO terasort.TeraSort: starting

19/06/07 14:13:12 WARN impl.MetricsConfig: Cannot locate configuration:
tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties

19/06/07 14:13:12 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot
period at 10 second(s).

19/06/07 14:13:12 INFO impl.MetricsSystemImpl: s3a-file-system metrics
system started

19/06/07 14:13:14 INFO input.FileInputFormat: Total input files to process
: 2

Spent 396ms computing base-splits.

Spent 3ms computing TeraScheduler splits.

Computing input splits took 400ms

Sampling 2 splits of 2

Making 80 from 10000 sampled records

Computing parititions took 685ms

Spent 1088ms computing partitions.

19/06/07 14:13:15 INFO client.RMProxy: Connecting to ResourceManager at
hostname:8032

19/06/07 14:13:17 INFO mapreduce.JobResourceUploader: Disabling Erasure
Coding for path: /user/hrt_qa/.staging/job_1559891760159_0011

19/06/07 14:13:17 INFO mapreduce.JobSubmitter: number of splits:2

19/06/07 14:13:17 INFO mapreduce.JobSubmitter: Submitting tokens for job:
job_1559891760159_0011

19/06/07 14:13:17 INFO mapreduce.JobSubmitter: Executing with tokens: []

19/06/07 14:13:18 INFO conf.Configuration: resource-types.xml not found

19/06/07 14:13:18 INFO resource.ResourceUtils: Unable to find
'resource-types.xml'.

19/06/07 14:13:18 INFO impl.YarnClientImpl: Submitted application
application_1559891760159_0011

19/06/07 14:13:18 INFO mapreduce.Job: The url to track the job:
http://hostname:8088/proxy/application_1559891760159_0011/

19/06/07 14:13:18 INFO mapreduce.Job: Running job: job_1559891760159_0011

19/06/07 14:13:33 INFO mapreduce.Job: Job job_1559891760159_0011 running in
uber mode : false

19/06/07 14:13:33 INFO mapreduce.Job:  map 0% reduce 0%

19/06/07 14:13:34 INFO mapreduce.Job: Job job_1559891760159_0011 failed
with state FAILED due to: Job setup failed :
org.apache.hadoop.fs.PathExistsException: `s3a://bucket/OUTPUT': Setting
job as Task committer attempt_1559891760159_0011_m_000000_0: Destination
path exists and committer conflict resolution mode is "fail"

    at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.failDestinationExists(StagingCommitter.java:878)
    at org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter.setupJob(DirectoryStagingCommitter.java:71)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:255)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:235)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)



19/06/07 14:13:34 INFO mapreduce.Job: Counters: 2

Job Counters

Total time spent by all maps in occupied slots (ms)=0

Total time spent by all reduces in occupied slots (ms)=0

19/06/07 14:13:34 INFO terasort.TeraSort: done

19/06/07 14:13:34 INFO impl.MetricsSystemImpl: Stopping s3a-file-system
metrics system...

19/06/07 14:13:34 INFO impl.MetricsSystemImpl: s3a-file-system metrics
system stopped.

19/06/07 14:13:34 INFO impl.MetricsSystemImpl: s3a-file-system metrics
system shutdown complete.
