https://bugzilla.wikimedia.org/show_bug.cgi?id=72812

            Bug ID: 72812
           Summary: Kafka logs drowning in errors processing fetch
                    requests since 2014-10-28 ~20:00
           Product: Analytics
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: Refinery
          Assignee: wikibugs-l@lists.wikimedia.org
          Reporter: christ...@quelltextlich.at
                CC: bugwatc...@sb-mail.wmflabs.org,
                    christ...@quelltextlich.at, dandree...@wikimedia.org,
                    kle...@wikimedia.org, oke...@wikimedia.org,
                    o...@wikimedia.org, tneg...@wikimedia.org
       Web browser: ---
   Mobile Platform: ---

Since 2014-10-28 ~20:00, the Kafka logs on each of the four brokers are
drowning [1] in exceptions raised while processing fetch requests, like the
following one:

[2014-10-31 13:20:01,425] 7574685637 [kafka-request-handler-1] ERROR kafka.server.KafkaApis  - [KafkaApi-22] Error when processing fetch request for partition [webrequest_text,8] offset 25962495797 from consumer with correlation id 2
kafka.common.OffsetOutOfRangeException: Request for offset 25962495797 but we only have log segments in the range 24383057344 to 25960520433.
        at kafka.log.Log.read(Log.scala:380)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530)
        at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476)
        at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
        at scala.collection.immutable.Map$Map3.foreach(Map.scala:163)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
        at scala.collection.immutable.Map$Map3.map(Map.scala:143)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.scala:471)
        at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:437)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:186)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42)
        at java.lang.Thread.run(Thread.java:744)
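
To put the two offsets in the exception above in relation, here is a small
sketch that computes how far the requested offset lies past the end of the
range the broker reports. The helper name offset_gap is made up for
illustration and is not part of Kafka; the two numbers are copied from the
exception message, and 64-bit shell arithmetic is assumed.

```sh
# offset_gap is a hypothetical helper, not part of Kafka or its tooling.
# Prints how far a requested offset lies past the newest available offset.
offset_gap() {
    requested=$1
    newest=$2
    echo $((requested - newest))
}

# Offsets taken from the OffsetOutOfRangeException quoted above.
offset_gap 25962495797 25960520433   # prints 1975364
```

So in this sample the consumer asks for an offset roughly 1.98 million past
the upper end of the range the broker says it has.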


The exceptions are heavily concentrated in the first few seconds of each
minute, mostly in second 01. [2]
They occur every minute, with some 60 extra exceptions every 5th minute. [3]


[1]

qchris@analytics1012:~$ grep 'ERROR kafka.server.KafkaApis' /var/log/kafka/kafka.log | cut -f 1 -d ' ' | uniq -c
      6 [2014-08-21
     18 [2014-08-22
      9 [2014-08-28
     12 [2014-09-09
      3 [2014-09-16
     47 [2014-09-17
     28 [2014-09-18
      6 [2014-10-08
     30 [2014-10-16
     12 [2014-10-23
   1563 [2014-10-28
   8640 [2014-10-29
   8610 [2014-10-30
    834 [2014-10-31

qchris@analytics1018:~$ grep 'ERROR kafka.server.KafkaApis' /var/log/kafka/kafka.log | cut -f 1 -d ' ' | uniq -c
      6 [2014-08-12
      6 [2014-08-21
     18 [2014-08-22
      9 [2014-08-28
     12 [2014-09-09
      3 [2014-09-16
     45 [2014-09-17
     31 [2014-09-18
      6 [2014-10-08
     30 [2014-10-16
     12 [2014-10-23
   1527 [2014-10-28
   8640 [2014-10-29
   8610 [2014-10-30
    834 [2014-10-31

qchris@analytics1021:~$ cat /var/log/kafka/kafka.log.1 /var/log/kafka/kafka.log | grep 'ERROR kafka.server.KafkaApis' | cut -f 1 -d ' ' | uniq -c
      4 [2014-06-24
      2 [2014-07-02
      1 [2014-08-04
      4 [2014-08-12
      4 [2014-08-21
     10 [2014-08-22
      6 [2014-08-28
     10 [2014-09-09
      2 [2014-09-16
     32 [2014-09-17
     21 [2014-09-18
      6 [2014-10-08
     30 [2014-10-16
     12 [2014-10-23
   1380 [2014-10-28
   8640 [2014-10-29
   8610 [2014-10-30
    834 [2014-10-31

qchris@analytics1022:~$ grep 'ERROR kafka.server.KafkaApis' /var/log/kafka/kafka.log | cut -f 1 -d ' ' | uniq -c
      9 [2014-09-09
      3 [2014-09-16
     34 [2014-09-17
     20 [2014-09-18
      6 [2014-10-08
     38 [2014-10-16
     10 [2014-10-23
    508 [2014-10-28
   2890 [2014-10-29
   2877 [2014-10-30
    280 [2014-10-31
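
As a quick consistency check on the full-day counts in the listings above,
the following sketch converts a per-day error count into an average
per-minute rate (a day has 1440 minutes). The helper name per_minute is made
up for illustration; the counts are the 2014-10-29 and 2014-10-30 values from
the listings.

```sh
# per_minute is a hypothetical helper: average errors per minute for a
# count that covers one full day (24 * 60 = 1440 minutes).
per_minute() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", n / 1440 }'
}

# Full-day counts from analytics1012/1018/1021 (8640, 8610)
# and from analytics1022 (2890, 2877).
for n in 8640 8610 2890 2877; do
    per_minute "$n"
done
```

8640 is exactly 6 * 1440, so three of the brokers log on average six such
errors per minute on a full day, and analytics1022 about two per minute.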



[2]
qchris@analytics1021:~$ cat /var/log/kafka/kafka.log.1 /var/log/kafka/kafka.log | grep 'ERROR kafka.server.KafkaApis' | cut -c 1-60 | grep 2014-10 | cut -f 3 -d : | cut -f 1 -d , | sort | uniq -c
     12 00
  17466 01
   1916 02
     82 03
      6 04
      6 06
      6 07
      6 23
      6 42
      6 46
      6 52
      6 56



[3]
qchris@analytics1021:~$ cat /var/log/kafka/kafka.log.1 /var/log/kafka/kafka.log | grep 'ERROR kafka.server.KafkaApis' | cut -c 1-60 | grep 2014-10 | cut -f 2 -d : | sort | uniq -c
    380 00
    314 01
    314 02
    314 03
    314 04
    368 05
    314 06
    312 07
    312 08
    312 09
    366 10
    306 11
    306 12
    318 13
    318 14
    378 15
    318 16
    318 17
    318 18
    318 19
    372 20
    312 21
    318 22
    318 23
    318 24
    378 25
    312 26
    312 27
    312 28
    312 29
    372 30
    312 31
    312 32
    312 33
    312 34
    372 35
    312 36
    312 37
    312 38
    312 39
    366 40
    312 41
    312 42
    312 43
    312 44
    372 45
    318 46
    312 47
    312 48
    312 49
    366 50
    312 51
    318 52
    312 53
    312 54
    366 55
    312 56
    312 57
    320 58
    332 59
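
The "every 5th minute" pattern in the listing above can be summarized with a
small awk sketch that averages the buckets for minutes divisible by 5
separately from all other minutes. The helper name five_minute_means is made
up for illustration; it expects "count minute" lines in the format that
uniq -c prints above.

```sh
# five_minute_means is a hypothetical helper. It reads "count minute" lines
# on stdin and prints the mean count for minutes divisible by 5 and the mean
# count for all other minutes.
five_minute_means() {
    awk '
        { if ($2 % 5 == 0) { s5 += $1; n5++ } else { so += $1; no++ } }
        END {
            printf "fifth=%.1f other=%.1f\n",
                (n5 ? s5 / n5 : 0), (no ? so / no : 0)
        }'
}

# Sample input: the first six rows of listing [3].
five_minute_means <<'EOF'
    380 00
    314 01
    314 02
    314 03
    314 04
    368 05
EOF
# prints: fifth=374.0 other=314.0
```

Fed with all 60 rows of listing [3], this gives about 371 for the 5th-minute
buckets and about 314 for the others, a difference of roughly 57. Note that
the buckets are totals over all October entries in the two log files, not
counts for a single minute.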
