https://bugzilla.wikimedia.org/show_bug.cgi?id=33406

--- Comment #34 from Jesús Martínez Novo (Ciencia Al Poder) 
<[email protected]> ---
A sitemap is not needed for search engines to index mediazilla, since all bugs
are sent to wikibugs-l and end up listed on various web pages. But they aren't
indexing mediazilla because of this entry in robots.txt:

 Disallow: /*.cgi

But without a sitemap, search engines don't know when a bug is updated, and end
up reindexing the entire site every time, producing a lot of overhead on the
servers and bringing the site down. With a sitemap, only bugs updated since the
last crawl would be fetched again (supposedly), reducing the overhead on the
site, although I'm not sure to what extent.
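To illustrate the point: each <url> entry in a sitemap can carry a <lastmod>
date, which is what lets crawlers skip bugs that haven't changed since their
last visit. A minimal sketch (the date here is made up, not from the real
tracker):

```
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://bugzilla.wikimedia.org/show_bug.cgi?id=33406</loc>
    <lastmod>2012-01-15</lastmod>
  </url>
</urlset>
```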

From comment 32, it just needs to generate a sitemap; there's no need to ping
search engines about its existence. They'll learn about it when they fetch
robots.txt again and find the sitemap's location there. I don't see why it's
pinging search engines.
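For reference, a single Sitemap line in robots.txt is all crawlers need to
discover it on their next fetch, even alongside the existing Disallow (the
sitemap URL below is illustrative, not an actual file on mediazilla):

```
User-agent: *
Disallow: /*.cgi

Sitemap: https://bugzilla.wikimedia.org/sitemap.xml
```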

The sitemap on bmo (bugzilla.mozilla.org) seems to be generated by a different
extension, or a modification of this one, according to this:
https://code.google.com/p/bugzilla-sitemap/issues/detail?id=1

From what I see, that patch doesn't ping search engines; it also saves the
sitemap on the server and serves that copy to search engines, instead of
regenerating the sitemap *every time* the URL is requested, for a period of
time defined in SITEMAP_AGE. This should be more convenient. Maybe we can get
the extension bmo is using from somewhere? Or at least consider using that
patch if it looks sane.
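The caching idea behind SITEMAP_AGE could look roughly like this; a minimal
sketch in Python, assuming a hypothetical get_sitemap() helper and an assumed
12-hour age limit (the real patch is Perl and its details may differ):

```python
import os
import time

# Assumed value mirroring the SITEMAP_AGE idea: regenerate the sitemap
# only when the cached copy is older than this many seconds.
SITEMAP_AGE = 12 * 60 * 60  # 12 hours (illustrative)

def get_sitemap(path, generate):
    """Return the sitemap contents, regenerating the cached file only
    when it is missing or older than SITEMAP_AGE seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
        if age < SITEMAP_AGE:
            # Cached copy is still fresh: serve it without regenerating.
            with open(path) as f:
                return f.read()
    except OSError:
        pass  # no cached copy yet
    # Cache miss or stale: regenerate and store for subsequent requests.
    data = generate()
    with open(path, "w") as f:
        f.write(data)
    return data
```

This way a burst of crawler requests hits the stored file rather than
re-querying the bug database on every fetch.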

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
