https://bugzilla.wikimedia.org/show_bug.cgi?id=67157
Bug ID: 67157
Summary: CirrusSearch: Failing to reindexMeta
Product: MediaWiki extensions
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: CirrusSearch
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected],
[email protected]
Web browser: ---
Mobile Platform: ---
We're having trouble reindexing metawiki: the reindex hits a page with an
external link that contains invalid UTF-8, and the bulk index request for that
page fails:
[2014-06-26 18:43:29,960][DEBUG][action.bulk ] [elastic1018]
[metawiki_general_1403807864][5] failed to execute bulk item (index) index
{[metawiki_general_1403807864][page][661035],
source[{"namespace":2,"namespace_text":"User","title":"COIBot/Local/selftrans.narod.ru","timestamp":"2011-10-11T04:19:40Z","category":["Pages
where template include size is exceeded","Noindexed pages","COIBot Local
Reports"],"external_link":["//wikipediatools.appspot.com/linksearch.jsp?set=top20&link=selftrans.narod.ru","//wikipediatools.appspot.com/linksearch.jsp?set=top40&link=selftrans.narod.ru","//wikipediatools.appspot.com/linksearch.jsp?set=major&link=selftrans.narod.ru","http://www.google.com/search?num=10&hl=en&rls=en&q=selftrans.narod.ru","//www.google.com/search?num=100?h1=en&rls=en&q=selftrans.narod.ru+site:en.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=selftrans.narod.ru+site:fr.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=selftrans.narod.ru+site:de.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=selftrans.narod.ru+site:meta.wikimedia.org","http://siteexplorer.search.yahoo.com/advsearch?p=selftrans.narod.ru&bwm=i&bwmf=d&bwms=p","//toolserver.org/~erwin85/xwiki.php?report=User:COIBot/LinkReports/selftrans.narod.ru&forcelive=1","//toolserver.org/~erwin85/xwiki.php?report=User:COIBot/Local/selftrans.narod.ru&forcelive=1","//tools.wmflabs.org/searchsbl/?url=selftrans.narod.ru","http://whois.domaintools.com/selftrans.narod.ru","http://www.aboutus.org/selftrans.narod.ru","http://www.malwaredomainlist.com/mdl.php?search=selftrans.narod.ru&colsearch=Domain&quantity=50","http://www.alexa.com/data/details/main?url=selftrans.narod.ru","http://213.180.199.13","//wikipediatools.appspot.com/linksearch.jsp?set=top20&link=213.180.199.13","//wikipediatools.appspot.com/linksearch.jsp?set=top40&link=213.180.199.13","//wikipediatools.appspot.com/linksearch.jsp?set=major&link=213.180.199.13","http://www.google.com/search?num=10&hl=en&rls=en&q=213.180.199.13","//www.google.com/search?num=100?h1=en&rls=en&q=213.180.199.13+site:en.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=213.180.199.13+site:fr.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=213.180.199.13+site:de.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=213.180.199.13+site:meta.wikimedia.org","http://siteexplorer.search.yahoo.com
/advsearch?p=213.180.199.13&bwm=i&bwmf=d&bwms=p","//tools.wmflabs.org/searchsbl/?url=213.180.199.13","http://whois.domaintools.com/213.180.199.13","http://www.aboutus.org/213.180.199.13","http://www.malwaredomainlist.com/mdl.php?search=213.180.199.13&colsearch=Domain&quantity=50","http://www.alexa.com/data/details/main?url=213.180.199.13","http://uk.wikipedia.org/wiki/Mediawiki:Spam-whitelist","http://www.google.com/search?q=%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3
...
java.lang.IllegalArgumentException: Document contains at least one immense term
in field="external_link" (whose UTF8 encoding is longer than the max length
32766), all of which were skipped. Please correct the analyzer to not produce
such terms. The prefix of the first immense term is: '[68 74 74 70 3a 2f 2f 77
77 77 2e 67 6f 6f 67 6c 65 2e 63 6f 6d 2f 73 65 61 72 63 68 3f 71]...'
I'm not sure whether this is new behavior in 1.2.1 or something else.
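
For triage, here is a minimal Python sketch (not CirrusSearch code) that does two things with the exception above. It decodes the hex prefix of the immense term to confirm which link is involved, and it measures the UTF-8 byte length of each external_link value against the 32766-byte limit quoted in the message. The helper names decode_prefix and split_by_term_size are hypothetical, and the oversized sample URL is synthetic. Whether over-long links should be skipped, truncated, or handled in the analyzer is left open.

```python
# Term length limit in bytes, as quoted in the exception message.
MAX_TERM_BYTES = 32766

# Hex prefix of the first immense term, copied from the exception message.
PREFIX_HEX = ("68 74 74 70 3a 2f 2f 77 77 77 2e 67 6f 6f 67 6c 65 2e 63 6f 6d "
              "2f 73 65 61 72 63 68 3f 71")


def decode_prefix(hex_bytes: str) -> str:
    """Turn the space-separated hex dump from the log into readable text."""
    return bytes.fromhex(hex_bytes).decode("utf-8")


def split_by_term_size(values, limit=MAX_TERM_BYTES):
    """Hypothetical helper: separate values whose UTF-8 encoding fits
    within the limit from those that exceed it."""
    kept, oversized = [], []
    for value in values:
        size = len(value.encode("utf-8"))
        (kept if size <= limit else oversized).append(value)
    return kept, oversized


if __name__ == "__main__":
    # The prefix identifies the offending link as the Google search URL.
    print(decode_prefix(PREFIX_HEX))  # http://www.google.com/search?q

    # Synthetic stand-in for the page's links: one normal URL and one URL
    # padded with a repeated percent-encoded sequence like the one in the log.
    links = [
        "http://213.180.199.13",
        "http://www.google.com/search?q=" + "%C3%83%C2%83" * 4000,
    ]
    kept, oversized = split_by_term_size(links)
    print(len(kept), "within limit;", len(oversized), "over limit")
```

The length is measured on the encoded bytes rather than on characters because the limit in the exception applies to the term's UTF-8 encoding.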