* Aryeh Gregor <[email protected]> [Tue, 25 Aug 2009 13:13:56 -0400]:
> That's wrong.  The canonical version of a page must be a page with
> substantially identical content.  Edit pages serve totally different
> HTML; rel=canonical pointing to the article will just be ignored by
> search engines.  See here for a discussion of how rel=canonical works:
>
> http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
>
Thanks for pointing that out.

> Note, e.g., "We allow slight differences, e.g., in the sort order of a
> table of products. We also recognize that we may crawl the canonical
> and the duplicate pages at different points in time, so we may
> occasionally see different versions of your content."  Totally
> different content, no.
>
Well, semantically an edit page and an action=view page are certainly
not totally different: both contain very similar information. But I
cannot go against the standard; that's impossible. It is something
like a law: you don't always like it, but you have to obey it.

> > Anyway, it seems that Yandex crawler doesn't like the meta noindex
> > rules in the header of the page, giving an error (warning) message in
> > the stats of their webmaster tools.
>
> What does the warning say?  Ideally, of course, you should ban them in
> robots.txt, so the search engine doesn't have to bother fetching the
> URL.
>
I've banned them in robots.txt. The warning is produced by non-existing
titles, which also carry meta noindex. There are some links from foreign
sites to non-existing titles, something like
"http://mywiki.org/wiki/nonexsitingtitle", which I obviously cannot
disable. Yandex gives the warning "Document contains meta-tag noindex"
(approximately translated from Russian). There are lots of such
warnings. It's a bit strange that this is a warning at all; Google
doesn't give such a warning.
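
To illustrate why robots.txt alone does not cover this case, here is a
minimal sketch in Python. The robots.txt rule and the /w/index.php path
below are assumptions made for illustration (the real file is not shown
in this thread), and is_crawlable is a hypothetical helper name. It only
shows that a script-path rule bans edit URLs while a link to a
non-existing title under /wiki/ is still fetched, so the crawler gets
the page and sees its meta noindex:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, assumed for illustration only: ban the script
# path used for action=edit and similar URLs, leave /wiki/ crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/index.php
"""


def is_crawlable(url: str, agent: str = "Yandex") -> bool:
    """Return True if the assumed robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Edit pages live under the banned script path.
    print(is_crawlable("http://mywiki.org/w/index.php?title=Foo&action=edit"))
    # A foreign link to a non-existing title is not covered by the rule,
    # so the crawler fetches it and only then finds the meta noindex.
    print(is_crawlable("http://mywiki.org/wiki/nonexsitingtitle"))
```

Titles that do not exist cannot be listed in robots.txt in advance,
which is why those pages rely on meta noindex instead.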

> The purpose is to tell search engines which URL you'd prefer them to
> present to users, if the same content is being served under multiple
> URLs.  It is not meant to artificially inflate rankings by counting
> unindexed pages as contributing to some entirely different page of
> your choosing, and using it that way won't actually work.  Since
> search engines were already using heuristics to identify duplicate
> content, and might well continue to use those exact same heuristics to
> validate rel=canonical, it might not improve rankings at all.
>
I am not so sure that such inflation is artificial. It would be
artificial if the article/revision were not the same, or if
MediaWiki-generated HTML were mixed with other HTML. But anyway, I
cannot change how the search engines will interpret it.
Dmitriy

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
