"closure of the [[Category:Australia]]" is not going to work. In en.wiki subcategories are not subsets in any mathematical sense and the category tree has many, many loops and no roots.
cheers
stuart

--
...let us be heard from red core to black sky

On Tue, Jan 24, 2017 at 2:12 PM, Kerry Raymond <[email protected]> wrote:

> As previously came up in discussion about chapters, it would be very
> useful to have national data about Wikipedia activities, which can be
> determined (generally) from IP addresses. Now I understand the privacy
> argument in relation to logged-in users (not saying I agree with it though
> in relation to aggregate data). However, can we find a proxy that does not
> have the privacy considerations.
>
> My hypothesis is that national content is predominantly written by users
> resident in that nation. And that therefore activity on national content
> can be used as a proxy for national user editing activity.
>
> In the case of Australia, we could describe Australian national content in
> either of two ways: articles within the closure of the
> [[Category:Australia]] and/or those tagged as {{WikiProject Australia}}.
> There are arguments for/against either (neither is perfect, in my
> experience the category closure will tend to have false positives and the
> project will tend to have false negatives).
>
> I would like to know what correlation exists between national editor
> activity (as determined from IP addresses mapped to location) and national
> content edits and if/how it changes over time for various nations. This is
> research that only WMF can do because WMF has the IP addresses and the rest
> of us can’t have them for privacy reasons.
>
> If we could establish that a strong-enough correlation existed between
> them, we could use national content activity (for which there is no privacy
> consideration) as a proxy for national editing activity. And we might even
> be able to come up with a multiplier for each nation to provide comparable
> data for national editing activity.
>
> Now, it may be that we need to restrict the edits themselves in some way
> to maximise the correlations between national content and same-nation
> editor activity.
>
> My second hypothesis is “semantic” edits (e.g. edits that add large
> amounts of content or citation) to national content will be more highly
> correlated with same-nation editors than “syntactic” edits (e.g. fix
> spelling, punctuation or Manual of Style issues) will be. I suspect most
> bots and other automated/semi-automated edits are doing syntactic edits.
>
> Now, some of you will probably be aware of
> [https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2017-01-17/Recent_research
> Female Wikipedians aren't more likely to edit women biographies]. So it may well
> be that my patriotic-editing hypothesis is also untrue. But it would be
> nice to know one way or the other.
>
> Kerry
>
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
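
For reference, the comparison proposed in the quoted message could be sketched roughly as below. This is only an illustration under stated assumptions: the monthly counts are invented placeholders, not WMF data, and the `pearson` helper and the ratio used as a "multiplier" are hypothetical choices, since the quoted message does not say how either would be computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented monthly counts for one nation (placeholders, not real data).
national_content_edits = [100, 120, 90, 150]   # edits to national content
resident_editor_edits = [200, 240, 180, 300]   # edits by IP-located residents

r = pearson(national_content_edits, resident_editor_edits)

# One possible "multiplier" (an assumption): ratio of the two totals.
multiplier = sum(resident_editor_edits) / sum(national_content_edits)

print(f"correlation={r:.3f} multiplier={multiplier:.2f}")
```

Only the first series can be computed without IP-derived location, which is why the second series would have to come from WMF.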
