I'm not sure the analysis has to expose any private data at all. You
would only show the result of the analysis, aggregated over weeks or
months and perhaps after filtering out random noise. Would that be a
privacy problem?

One of the tricky things is that a visit to a disambiguation or search
page is a signal that the referrer, or some other earlier page in the
user's history, is difficult to connect to some later page. As the
number of steps between the pages increases, the problem of detecting
the relation grows exponentially. It is also worth noting that by using
only click events on the disambiguation page you will only discover
connections that are already present as links on that page.
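
To make the aggregation idea concrete, here is a rough sketch in Python
of how logged clicks could be reduced to counts before anyone looks at
them. Everything in it is an assumption for illustration: the event
format (referrer, disambiguation page, clicked target), the function
name and the two thresholds are made up, and nothing here is taken from
EventLogging itself.

```python
from collections import Counter, defaultdict


def suggest_targets(events, min_observations=50, min_share=0.6):
    """Aggregate click events into per-referrer link suggestions.

    events: iterable of (referrer, disambiguation_page, target) tuples
            (a hypothetical format, collected over weeks or months).
    Returns {(referrer, disambiguation_page): (target, hits, total)}.
    """
    # Count how often each target was chosen for a given
    # (referrer, disambiguation page) pair. Only counts are kept,
    # nothing that identifies an individual reader.
    counts = defaultdict(Counter)
    for referrer, disambig, target in events:
        counts[(referrer, disambig)][target] += 1

    suggestions = {}
    for key, targets in counts.items():
        total = sum(targets.values())
        # Too few observations: treat as noise and skip.
        if total < min_observations:
            continue
        target, hits = targets.most_common(1)[0]
        # Only suggest a target when it clearly dominates.
        if hits / total >= min_share:
            suggestions[key] = (target, hits, total)
    return suggestions
```

Called with a long list of such tuples, this returns one suggested
target per referrer and disambiguation page, and only where enough
readers agreed. Note that it can only ever suggest targets that were
clickable on the disambiguation page, which is the limitation mentioned
above.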

On Wed, Jul 17, 2013 at 6:49 PM, Jon Robson <[email protected]> wrote:
> Agreed. As a first step, if someone is interested in this and this
> doesn't go against our privacy policy it would be good to collect some
> link clicking data for various disambiguation pages to get an idea of
> whether the data created is meaningful and useful. Tyler's concerns
> are valid but we should clarify with some data rather than speculate
> to whether these are indeed concerns we need to worry about and
> whether this. EventLogging [1] could be used for this in my opinion
> using some simple javascript that hijacks links on the disambiguation
> page - looking at referrer and next page.
>
> In terms of analyzing the data you could then simply look at a sample
> of disambiguation pages and manually determine the accuracy of users
> picking the correct link.
>
> If the data does show promise it would then be an easy enough job to
> create a UI to use it and for editors to correct them.
>
> I don't currently have time to explore this but would like to in
> future but if anyone is interested please dive in...
>
> [1] https://mediawiki.org/wiki/Extension:EventLogging
>
> On Wed, Jul 17, 2013 at 5:14 AM, C. Scott Ananian
> <[email protected]> wrote:
>> Sounds like a disagreement that can be settled quantitatively. ;)
>>   --scott
>> On Jul 17, 2013 5:03 AM, "Tyler Romeo" <[email protected]> wrote:
>>
>>> On Wed, Jul 17, 2013 at 4:42 AM, John Erling Blad <[email protected]>
>>> wrote:
>>>
>>> > It doesn't matter because the correct behavior will accumulate over
>>> > time. You don't try to "fix" linkage just because you have one single
>>> > observed behavior, you collect and correlate behavior over time and
>>> > use several, perhaps hundreds of observations.
>>> >
>>>
>>> I strongly doubt that the correct behavior will be prevalent enough to
>>> warrant using such an automatic system over just manually fixing
>>> disambiguation links, which can be done quite easily using automatic wiki
>>> browsers and the like.
>>>
>>> *-- *
>>> *Tyler Romeo*
>>> Stevens Institute of Technology, Class of 2016
>>> Major in Computer Science
>>> www.whizkidztech.com | [email protected]
>>> _______________________________________________
>>> Wikitech-l mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
>
> --
> Jon Robson
> http://jonrobson.me.uk
> @rakugojon
>
