thiemowmde triaged this task as "Low" priority.
thiemowmde moved this task from incoming to in current sprint on the Wikidata board.
thiemowmde added a comment.
@hoo, this doesn't seem to be a problem at the moment: https://www.google.com/search?q=site:wikidata.org+inurl:title=Special:GoToLinkedPage&filter=0
I played around with different Google queries and found that these three are an actual problem:
- https://www.google.com/search?q=site:wikidata.org+inurl:ItemByTitle&filter=0 (~38,000 Google results)
- https://www.google.com/search?q=site:wikidata.org+inurl:GoToLinkedPage&filter=0 (~14,000)
- https://www.google.com/search?q=site:wikidata.org+inurl:EntityData&filter=0 (~1,300)
My conclusions:
- ItemByTitle is pure search and pure redirect, similar to GoToLinkedPage.
- EntityData is either an RDF file (not sure if we want to disallow these) or a redirect when no file extension is given.
- All exclusions should end with a slash, so that they exclude only the results a special page produces, not the special page itself.
- All URLs with %3A instead of : are duplicates anyway, so let's exclude them all.
@Mbch331, please edit https://www.wikidata.org/wiki/MediaWiki:Robots.txt as follows. Thanks.
Disallow: /wiki/Special%3A
Disallow: /wiki/Special:EntityData/
Allow: /wiki/Special:EntityData/*.
Disallow: /wiki/Special:GoToLinkedPage/
Disallow: /wiki/Special:ItemByTitle/
Disallow: /wiki/Special:SetAliases/
Disallow: /wiki/Special:SetDescription/
Disallow: /wiki/Special:SetLabel/
Disallow: /wiki/Special:SetLabelDescriptionAliases/
Disallow: /wiki/Special:SetSiteLink/
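
The rules above can be sanity-checked before they go live. What follows is only a rough sketch in Python, not how any particular crawler actually evaluates robots.txt. It assumes Google-style matching: `*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie. It also assumes paths are compared as raw strings, so `%3A` and `:` stay distinct. The names `RULES`, `pattern_to_regex` and `is_allowed` are made up for this illustration.

```python
import re

# The proposed rules, in the order given above.
RULES = [
    ("disallow", "/wiki/Special%3A"),
    ("disallow", "/wiki/Special:EntityData/"),
    ("allow", "/wiki/Special:EntityData/*."),
    ("disallow", "/wiki/Special:GoToLinkedPage/"),
    ("disallow", "/wiki/Special:ItemByTitle/"),
    ("disallow", "/wiki/Special:SetAliases/"),
    ("disallow", "/wiki/Special:SetDescription/"),
    ("disallow", "/wiki/Special:SetLabel/"),
    ("disallow", "/wiki/Special:SetLabelDescriptionAliases/"),
    ("disallow", "/wiki/Special:SetSiteLink/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' is a wildcard and a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if a crawler following the assumed semantics may fetch path."""
    best = None  # (pattern length, is_allow); longer wins, Allow wins ties
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is allowed by default.
    return True if best is None else best[1]


if __name__ == "__main__":
    for path in [
        "/wiki/Special:ItemByTitle",
        "/wiki/Special:ItemByTitle/enwiki/Berlin",
        "/wiki/Special:EntityData/Q42",
        "/wiki/Special:EntityData/Q42.rdf",
        "/wiki/Special%3AItemByTitle",
    ]:
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Under these assumptions the special pages themselves stay crawlable, their redirect results and all `%3A` duplicates are excluded, and EntityData URLs with a file extension remain allowed because the Allow pattern is longer than the Disallow prefix.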
To: Mbch331, thiemowmde
Cc: Mbch331, Jonas, hoo, aude, Lydia_Pintscher, Tobi_WMDE_SW, Aklapper, thiemowmde, TerraCodes, D3r1ck01, Izno, Wikidata-bugs, TheDJ
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
