Reminder: this showcase is starting in 5 minutes. See the stream here:

Join us on Freenode at #wikimedia-research to ask Andrei questions.


On Tue, Mar 15, 2016 at 12:53 PM, Dario Taraborelli wrote:

> This month, our research showcase hosts
> Andrei Rizoiu (Australian National University) to talk about his work
> on *how private traits of
> Wikipedia editors can be exposed from public data* (such as edit
> histories) using off-the-shelf machine learning techniques. (abstract below)
> If you're interested in learning what the combination of machine learning
> and public data means for privacy and surveillance, come and join us this
> *Wednesday, March 16* at *1pm Pacific Time*.
> The event will be recorded and publicly streamed. As usual, we will be
> hosting the conversation with the speaker and Q&A on the
> #wikimedia-research channel on IRC.
> Looking forward to seeing you there,
> Dario
> Evolution of Privacy Loss in Wikipedia
>
> The cumulative effect of collective
> online participation has an important and adverse impact on individual
> privacy. As an online system evolves over time, new digital traces of
> individual behavior may uncover previously hidden statistical links between
> an individual’s past actions and her private traits. To quantify this
> effect, we analyze the evolution of individual privacy loss by studying
> the edit history of Wikipedia over 13 years, including more than 117,523
> different users performing 188,805,088 edits. We trace each Wikipedia
> contributor using apparently harmless features, such as the number of edits
> performed on predefined broad categories in a given time period (e.g.
> Mathematics, Culture or Nature). We show that even at this unspecific level
> of behavior description, it is possible to use off-the-shelf machine
> learning algorithms to uncover usually undisclosed personal traits, such as
> gender, religion or education. We provide empirical evidence that the
> prediction accuracy for almost all private traits consistently improves
> over time. Surprisingly, the prediction performance for users who stopped
> editing after a given time still improves. The activities performed by new
> users seem to have contributed more to this effect than additional
> activities from existing (but still active) users. Insights from this work
> should help users, system designers, and policy makers understand and make
> long-term design choices in online content creation systems.
> *Dario Taraborelli*, Head of Research, Wikimedia Foundation
> @readermeter