https://bugzilla.wikimedia.org/show_bug.cgi?id=65420

--- Comment #9 from Oliver Keyes <[email protected]> ---
Two new queries that are blowing up, each with a slightly different error report:

hive (wmf)> SELECT uri_host FROM webrequest WHERE uri_path = '/wiki/Education' AND year = 2014 AND month = 05;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.String.substring(String.java:1913)
    at java.net.URI$Parser.substring(URI.java:2850)
    at java.net.URI$Parser.parse(URI.java:3046)
    at java.net.URI.<init>(URI.java:753)
    at org.apache.hadoop.fs.Path.<init>(Path.java:73)
    at org.apache.hadoop.fs.Path.<init>(Path.java:58)
    at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:209)
    at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:372)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:416)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:206)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.MapRedTask

and:

hive (wmf)> SELECT DISTINCT(uri_host) FROM webrequest WHERE uri_path = '/wiki/Education/' AND year = 2014 AND month = 05;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 999
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
java.io.IOException: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:448)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:405)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:206)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
Caused by: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:238)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
    ... 32 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$HdfsFileStatusProto$Builder.buildPartial(HdfsProtos.java:9398)
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$DirectoryListingProto$Builder.mergeFrom(HdfsProtos.java:11422)
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$DirectoryListingProto$Builder.mergeFrom(HdfsProtos.java:11241)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetListingResponseProto$Builder.mergeFrom(ClientNamenodeProtocolProtos.java:18775)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetListingResponseProto$Builder.mergeFrom(ClientNamenodeProtocolProtos.java:18629)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:300)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:162)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:716)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:153)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:709)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:405)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:206)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
Job Submission failed with exception 'java.io.IOException(com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
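
For what it's worth, both traces die in the Hive client JVM while enumerating input splits (FileInputFormat.listStatus -> getSplits), before any map task launches, so the month-wide predicate presumably forces the client to list every webrequest file for the whole month. Two untested workarounds that might get these through (the day predicate assumes the table is also partitioned by day, and the heap sizes are guesses):

hive (wmf)> SELECT uri_host FROM webrequest WHERE uri_path = '/wiki/Education' AND year = 2014 AND month = 05 AND day = 01;

or raise the client heap before starting the CLI, e.g.:

$ export HADOOP_HEAPSIZE=2048          # MB; heap for the hive CLI process
$ export HADOOP_CLIENT_OPTS="-Xmx2g"   # alternative: pass -Xmx directly
$ hive

Neither fixes the underlying problem of split computation scaling with the number of partition files, but they might unblock the queries.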
