https://bugzilla.wikimedia.org/show_bug.cgi?id=58805

       Web browser: ---
            Bug ID: 58805
           Summary: Add robot policy for each namespace to the dumps
           Product: Datasets
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: General/Unknown
          Assignee: ar...@wikimedia.org
          Reporter: mflasc...@wikimedia.org
                CC: gsv...@gmail.com
    Classification: Unclassified
   Mobile Platform: ---

For individual pages, the __NOINDEX__ and __INDEX__ page properties (stored in
page_props.sql in lowercase and without the underscores, i.e. as noindex and
index) can be used to determine per-page overrides.

However, the baseline robot policy for each namespace should also be dumped.
For each namespace, this can be determined by starting with
[[mw:Manual:$wgDefaultRobotPolicy]] and then overriding it with
[[mw:Manual:$wgNamespaceRobotPolicies]].  For convenience, the dump should
state the policy explicitly for every namespace, even the ones that simply
inherit $wgDefaultRobotPolicy; the storage cost is insignificant since there
are generally not many namespaces.
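
A rough sketch of that computation, in Python purely for illustration
(MediaWiki itself is PHP).  The helper names, the namespace IDs and the sample
policy values are all made up for this example.  It also assumes that a partial
override such as "noindex" keeps the remaining directive from the default; if
overrides are meant to replace the whole string, the merge step would be
simpler.

```python
def parse_policy(policy):
    """Split e.g. "noindex,nofollow" into {"index": "noindex", "follow": "nofollow"}."""
    parts = {}
    for token in policy.split(","):
        token = token.strip().lower()
        if token in ("index", "noindex"):
            parts["index"] = token
        elif token in ("follow", "nofollow"):
            parts["follow"] = token
    return parts


def effective_policies(namespace_ids, default_policy, namespace_policies):
    """Return {namespace id: policy string}, listing every namespace explicitly."""
    result = {}
    for ns in namespace_ids:
        # Start from a permissive fallback, apply the site default, then the
        # per-namespace override (if any).
        merged = {"index": "index", "follow": "follow"}
        merged.update(parse_policy(default_policy))
        merged.update(parse_policy(namespace_policies.get(ns, "")))
        result[ns] = merged["index"] + "," + merged["follow"]
    return result


# Sample values standing in for $wgDefaultRobotPolicy and
# $wgNamespaceRobotPolicies (hypothetical, not taken from any real wiki).
default_policy = "index,follow"
namespace_policies = {2: "noindex", 3: "noindex,nofollow"}

policies = effective_policies([0, 1, 2, 3], default_policy, namespace_policies)
```

Namespaces 0 and 1 have no override here, so they still get an explicit entry
carrying the default.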

I think it would be simplest to just add a robotpolicy="noindex,nofollow"
attribute (or whatever the actual policy is) to each <namespace> element, since
that comma-separated string is the format used in the HTML output and the
configuration variables.
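
To make the proposal concrete, here is a small Python sketch that builds one
such element and reads the attribute back.  The robotpolicy attribute is only
what this bug proposes and does not exist in the dumps today; the key and case
attributes and the "Talk" name are sample values, and namespace_element is a
hypothetical helper.

```python
import xml.etree.ElementTree as ET


def namespace_element(key, name, policy):
    """Build a <namespace> element carrying the proposed robotpolicy attribute."""
    elem = ET.Element(
        "namespace",
        {"key": str(key), "case": "first-letter", "robotpolicy": policy},
    )
    elem.text = name
    return elem


# Serialize one element as it might appear in the dump's namespace list.
xml_text = ET.tostring(
    namespace_element(1, "Talk", "noindex,nofollow"), encoding="unicode"
)

# A consumer of the dump can read the policy back without any extra lookup.
parsed = ET.fromstring(xml_text)
policy = parsed.get("robotpolicy")
```

Keeping the value as a single comma-separated string means consumers can reuse
whatever parsing they already have for the robots meta tag.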

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l