https://bugzilla.wikimedia.org/show_bug.cgi?id=65420
--- Comment #9 from Oliver Keyes <[email protected]> ---
Two new queries that are exploding, both with slightly different error reports:

hive (wmf)> SELECT uri_host FROM webrequest WHERE uri_path = '/wiki/Education' AND year = 2014 AND month = 05;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.String.substring(String.java:1913)
    at java.net.URI$Parser.substring(URI.java:2850)
    at java.net.URI$Parser.parse(URI.java:3046)
    at java.net.URI.<init>(URI.java:753)
    at org.apache.hadoop.fs.Path.<init>(Path.java:73)
    at org.apache.hadoop.fs.Path.<init>(Path.java:58)
    at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:209)
    at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:372)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:416)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:206)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.MapRedTask

and:

hive (wmf)> SELECT DISTINCT(uri_host) FROM webrequest WHERE uri_path = '/wiki/Education/' AND year = 2014 AND month = 05;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 999
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
java.io.IOException: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:448)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:405)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:206)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
Caused by: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:238)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
    ... 32 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$HdfsFileStatusProto$Builder.buildPartial(HdfsProtos.java:9398)
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$DirectoryListingProto$Builder.mergeFrom(HdfsProtos.java:11422)
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$DirectoryListingProto$Builder.mergeFrom(HdfsProtos.java:11241)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetListingResponseProto$Builder.mergeFrom(ClientNamenodeProtocolProtos.java:18775)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetListingResponseProto$Builder.mergeFrom(ClientNamenodeProtocolProtos.java:18629)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:300)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:162)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:716)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:153)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:709)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:405)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1427)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1467)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:206)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
Job Submission failed with exception 'java.io.IOException(com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

--
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
