https://bugzilla.wikimedia.org/show_bug.cgi?id=48856

dr0ptp4kt <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #6 from dr0ptp4kt <[email protected]> ---
We will be setting up a no-index rule for zero.wikipedia.org requests. The
business team confirmed that zero.wikipedia.org pages are not supposed to be in
the Google index. I would prefer a re-crawl of the site first, to help get
Google's existing canonical links updated. But eventually, the business team
wants zero.wikipedia.org out of the search index completely.

MZMcBride, as to why the pages were indexed, from what I can tell:

* At some point a code change resulted in article content other than the
"Sorry" warning being echoed into the <language>.zero.wikipedia.org pages below
the warning (making them on par, if I understand correctly, with
<language>.m.wikipedia.org pages sans the warning).
* With the fulltext content from each <language>.zero.wikipedia.org page,
Google's crawlers were able to discover more links.
* In the absence of a canonical link for each <language>.zero.wikipedia.org
page, Google's algorithms wouldn't have had a perfect, non-heuristic means of
identifying the pages as being the same. The heuristics seem to have correctly
classified a number of pages as dupes, but not all of them, based on a
site:en.zero.wikipedia.org Google search, for example.
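
To illustrate the missing canonical link, here is a minimal sketch in Python of
how such a tag could be derived from the request host. This is not the actual
MobileFrontend/ZeroRatedMobileAccess code. The function name, the choice of
<language>.wikipedia.org as the canonical target, and the https scheme are all
assumptions made for the example.

```python
from html import escape

# Host suffix for the zero-rated site, as discussed in this bug.
ZERO_SUFFIX = ".zero.wikipedia.org"


def canonical_link_tag(host, path):
    """Build a <link rel="canonical"> tag for a <language>.zero host.

    Hypothetical helper: it assumes the canonical copy of a page lives at
    https://<language>.wikipedia.org<path>. Returns None for other hosts.
    """
    if not host.endswith(ZERO_SUFFIX):
        return None
    language = host[: -len(ZERO_SUFFIX)]
    href = "https://%s.wikipedia.org%s" % (language, path)
    # Escape the URL so it is safe inside an HTML attribute.
    return '<link rel="canonical" href="%s">' % escape(href, quote=True)
```

With a tag like this in each page's <head>, a crawler has an explicit signal
that the zero page duplicates another URL, rather than having to infer it.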

My Gerrit change #64113 was introduced to stop content from being echoed below
the "Sorry" warning. This, in concert with Jon's Gerrit change #61809, will
allow the Google index to self-correct, although, as you note, my Gerrit change
#64629 provides the means to have no indexing whatsoever.
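
For reference, a no-index rule of the kind discussed above could look roughly
like the following Python sketch. It is not the content of any of the Gerrit
changes mentioned. The function name is hypothetical, and using the
X-Robots-Tag response header (rather than a robots meta tag) is an assumption
made for the example.

```python
def robots_headers(host):
    """Return extra response headers for a request to the given host.

    Hypothetical helper: requests to zero.wikipedia.org or any
    <language>.zero.wikipedia.org host get a noindex directive; all other
    hosts get no extra headers.
    """
    # Normalize case and a possible trailing dot before matching.
    host = host.lower().rstrip(".")
    if host == "zero.wikipedia.org" or host.endswith(".zero.wikipedia.org"):
        return {"X-Robots-Tag": "noindex"}
    return {}
```

A crawler only sees a directive like this when it fetches the page, so the
pages need to stay crawlable for the rule to take effect.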

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
