* Aryeh Gregor <[email protected]> [Tue, 25 Aug 2009 13:13:56 -0400]:
> That's wrong. The canonical version of a page must be a page with
> substantially identical content. Edit pages serve totally different
> HTML; rel=canonical pointing to the article will just be ignored by
> search engines. See here for a discussion of how rel=canonical works:
>
> http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
>
Thanks for pointing that out.

> Note, e.g., "We allow slight differences, e.g., in the sort order of a
> table of products. We also recognize that we may crawl the canonical
> and the duplicate pages at different points in time, so we may
> occasionally see different versions of your content." Totally
> different content, no.
>
Well, semantically an edit page and an action=view page are certainly
not totally different. Both contain very similar information. But I
cannot go against the standards; that's impossible. It's something like
law: you don't always like it, but you have to obey it.

> > Anyway, it seems that Yandex crawler doesn't like the meta noindex
> > rules in the header of the page, giving an error (warning) message in
> > the stats of their webmaster tools.
>
> What does the warning say? Ideally, of course, you should ban them in
> robots.txt, so the search engine doesn't have to bother fetching the
> URL.
>
I've banned them in robots.txt. The warning is produced for non-existent
titles, which also carry meta noindex. There are some links from
external sites to non-existent titles, something like
"http://mywiki.org/wiki/nonexsitingtitle", which I obviously cannot
disable. Yandex gives the warning "Document contains meta-tag noindex"
(approximately translated from Russian). There are lots of such
warnings. It is a bit strange that this is a warning at all; Google
doesn't give such a warning.

> The purpose is to tell search engines which URL you'd prefer them to
> present to users, if the same content is being served under multiple
> URLs. It is not meant to artificially inflate rankings by counting
> unindexed pages as contributing to some entirely different page of
> your choosing, and using it that way won't actually work. Since
> search engines were already using heuristics to identify duplicate
> content, and might well continue to use those exact same heuristics to
> validate rel=canonical, it might not improve rankings at all.
>
I am not so sure that such inflation is artificial.
The artificial case would be when the article/revision is not the same,
or even when MediaWiki-generated HTML is mixed with other HTML. But
anyway, I cannot change how the search engines will interpret it.

Dmitriy

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
