https://bugzilla.wikimedia.org/show_bug.cgi?id=72812
Bug ID: 72812 Summary: Kafka logs drowning in errors processing fetch requests since 2014-10-28 ~20:00 Product: Analytics Version: unspecified Hardware: All OS: All Status: NEW Severity: normal Priority: Unprioritized Component: Refinery Assignee: wikibugs-l@lists.wikimedia.org Reporter: christ...@quelltextlich.at CC: bugwatc...@sb-mail.wmflabs.org, christ...@quelltextlich.at, dandree...@wikimedia.org, kle...@wikimedia.org, oke...@wikimedia.org, o...@wikimedia.org, tneg...@wikimedia.org Web browser: --- Mobile Platform: --- Since 2014-10-28 ~20:00, the kafka logs on each four brokers are drowning [1] in exceptions around processing fetch requests, like the following one: [2014-10-31 13:20:01,425] 7574685637 [kafka-request-handler-1] ERROR kafka.server.KafkaApis - [KafkaApi-22] Error when processing fetch request for partition [webrequest_text,8] offset 25962495797 from consumer with correlation id 2 kafka.common.OffsetOutOfRangeException: Request for offset 25962495797 but we only have log segments in the range 24383057344 to 25960520433. at kafka.log.Log.read(Log.scala:380) at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530) at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476) at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194) at scala.collection.immutable.Map$Map3.foreach(Map.scala:163) at scala.collection.TraversableLike$class.map(TraversableLike.scala:194) at scala.collection.immutable.Map$Map3.map(Map.scala:143) at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.scala:471) at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:437) at kafka.server.KafkaApis.handle(KafkaApis.scala:186) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42) at java.lang.Thread.run(Thread.java:744) The exceptions are really strongly biased towards the first second of a minute. [2] This happens every minute, with some 60 extra exceptions every 5th minute. [3] [1] qchris@analytics1012:~$ grep 'ERROR kafka.server.KafkaApis' /var/log/kafka/kafka.log | cut -f 1 -d ' ' | uniq -c 6 [2014-08-21 18 [2014-08-22 9 [2014-08-28 12 [2014-09-09 3 [2014-09-16 47 [2014-09-17 28 [2014-09-18 6 [2014-10-08 30 [2014-10-16 12 [2014-10-23 1563 [2014-10-28 8640 [2014-10-29 8610 [2014-10-30 834 [2014-10-31 qchris@analytics1018:~$ grep 'ERROR kafka.server.KafkaApis' /var/log/kafka/kafka.log | cut -f 1 -d ' ' | uniq -c 6 [2014-08-12 6 [2014-08-21 18 [2014-08-22 9 [2014-08-28 12 [2014-09-09 3 [2014-09-16 45 [2014-09-17 31 [2014-09-18 6 [2014-10-08 30 [2014-10-16 12 [2014-10-23 1527 [2014-10-28 8640 [2014-10-29 8610 [2014-10-30 834 [2014-10-31 chris@analytics1021:~$ cat /var/log/kafka/kafka.log.1 /var/log/kafka/kafka.log | grep 'ERROR kafka.server.KafkaApis' | cut -f 1 -d ' ' | uniq -c 4 [2014-06-24 2 [2014-07-02 1 [2014-08-04 4 [2014-08-12 4 [2014-08-21 10 [2014-08-22 6 [2014-08-28 10 [2014-09-09 2 [2014-09-16 32 [2014-09-17 21 [2014-09-18 6 [2014-10-08 30 [2014-10-16 12 [2014-10-23 1380 [2014-10-28 8640 [2014-10-29 8610 [2014-10-30 834 [2014-10-31 qchris@analytics1022:~$ grep 'ERROR kafka.server.KafkaApis' /var/log/kafka/kafka.log | cut -f 1 -d ' ' | uniq -c 9 [2014-09-09 3 [2014-09-16 34 [2014-09-17 20 [2014-09-18 6 [2014-10-08 38 [2014-10-16 10 [2014-10-23 508 [2014-10-28 2890 [2014-10-29 2877 [2014-10-30 280 [2014-10-31 [2] qchris@analytics1021:~$ cat /var/log/kafka/kafka.log.1 /var/log/kafka/kafka.log | grep 'ERROR kafka.server.KafkaApis' | cut -c 1-60 | grep 2014-10 | cut -f 3 -d : | cut -f 1 -d , | sort | uniq -c 12 00 17466 01 1916 02 82 03 6 04 6 06 6 07 6 23 6 42 6 46 6 52 6 56 [3] qchris@analytics1021:~$ cat /var/log/kafka/kafka.log.1 /var/log/kafka/kafka.log | grep 'ERROR kafka.server.KafkaApis' | cut -c 1-60 | grep 2014-10 | cut -f 2 -d : | sort | uniq -c 380 00 314 01 314 02 314 03 314 04 368 05 314 06 312 07 312 08 312 09 366 10 306 11 306 12 318 13 318 14 378 15 318 16 318 17 318 18 318 19 372 20 312 21 318 22 318 23 318 24 378 25 312 26 312 27 312 28 312 29 372 30 312 31 312 32 312 33 312 34 372 35 312 36 312 37 312 38 312 39 366 40 312 41 312 42 312 43 312 44 372 45 318 46 312 47 312 48 312 49 366 50 312 51 318 52 312 53 312 54 366 55 312 56 312 57 320 58 332 59 -- You are receiving this mail because: You are the assignee for the bug. You are on the CC list for the bug. _______________________________________________ Wikibugs-l mailing list Wikibugs-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikibugs-l