Hi,

You may have followed the discussion on Wikimedia-l (and enwiki-l).
Out of mere intellectual curiosity, I would like to know why hashing the IPs with a varying salt won't work. Wouldn't that provide a way to obfuscate IP addresses while maintaining uniqueness (i.e. a given IP always gets hashed to the same hash)? Tim said in a message on enwiki-l that he has looked into the matter but hasn't found any satisfying solution. So what's the problem with salted hashes?

Note: I have read a little about hashing, but I am far from being an expert; please assume I am the classic layman.

Thanks in advance to anyone who takes the time to explain.

C

---------- Forwarded message ----------
From: "Lila Tretikov" <[email protected]>
Date: 05/Apr/2015 11:30
Subject: Re: [Wikimedia-l] Announcing: The Wikipedia Prize!
To: "Wikimedia Mailing List" <[email protected]>
Cc:

All,

As Tim mentioned we are seriously looking at privacy/identity/security/anonymity issues, specifically as they pertain to IP address exposure -- both from a legal and a technical standpoint. This won't happen overnight as we need to get people to work on this and there are a lot of asks, but this is on our radar.

On a related note, let's skip the sarcasm and treat each other with straightforward honesty. And for non-English speakers -- who are also (if not more) in need of this -- sarcasm can be very confusing.

Thanks,
Lila

On Fri, Apr 3, 2015 at 4:02 PM, Cristian Consonni <[email protected]> wrote:

> Hi Brian,
>
> 2015-03-30 0:25 GMT+02:00 Brian <[email protected]>:
> > Although the initial goal of the Netflix Prize was to design a
> > collaborative filtering algorithm, it became notorious when the data was
> > used to de-anonymize Netflix users. Researchers proved that given just a
> > user's movie ratings on one site, you can plug those ratings into another
> > site, such as the IMDB.
> > You can then take that information, and with some
> > Google searches and optionally a bit of cash (for websites that sell user
> > information, including, in some cases, their SSN) figure out who they are.
> > You could even drive up to their house and take a selfie with them, or
> > follow them to work and meet their boss and tell them about their views on
> > the topics they were editing.
>
> Somewhat tangentially, and to bring this topic back to a more
> scientific setting, I would like to point out that there has already
> been research in the past on this topic.
>
> I highly recommend reading the following paper:
>
> Lieberman, Michael D., and Jimmy Lin. "You Are Where You Edit:
> Locating Wikipedia Contributors through Edit Histories." ICWSM. 2009.
> (PDF <http://www.pensivepuffin.com/dwmcphd/syllabi/infx598_wi12/papers/wikipedia/lieberman-lin.YouAreWhereYouEdit.ICWSM09.pdf>)
>
> For those of you that don't want to read the whole paper, you can find
> a recap of the most relevant findings in this presentation by Maurizio
> Napolitano:
> <http://www.slideshare.net/napo/social-geography-wikipedia-a-quick-overwiew>
>
> The main idea is to associate spatial coordinates with a Wikipedia
> article when possible; these articles are called "geopages". Then you
> extract from the history of the articles the users who have edited a
> geopage. If you plot the geopages edited by a given contributor you
> can see that they tend to cluster, so you can define an "edit area".
> The study finds that 30-35% of contributors concentrate their edits in
> an edit area smaller than 1 deg^2 (~12,362 km^2, approximately the
> area of Connecticut or Northern Ireland[1] (thanks, Wikipedia!)).
>
> For another free/libre project with a geographic focus like
> OpenStreetMap this is even more marked; check out, for example, this
> tool «“Your OSM Heat Map” (aka Where did you contribute?)»[2] by
> Pascal Neis.
>
> This, of course, is not a straightforward de-anonymization, but these
> methods work in principle for every contributor even if you obfuscate
> their IP or username (provided that you can still assign all the edits
> from a given user to a unique and unambiguous identifier).
>
> C
>
> [1] https://en.wikipedia.org/wiki/Square_degree
> [2a] http://yosmhm.neis-one.org/
> [2b] http://neis-one.org/2011/08/yosmhm/
>
> _______________________________________________
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> [email protected]
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> <mailto:[email protected]?subject=unsubscribe>

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
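
To make the salted-hash question at the top of this message concrete, here is a minimal Python sketch of the idea. It assumes SHA-256 with the salt simply prepended to the address; the helper name `obfuscate_ip` and the salt values are hypothetical, chosen only for illustration, and the addresses come from the documentation range 192.0.2.0/24. It is a sketch of the proposal as asked, not of any existing implementation.

```python
import hashlib


def obfuscate_ip(ip: str, salt: bytes) -> str:
    """Hypothetical helper: hex SHA-256 digest of salt + IP address."""
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()


# Illustrative salts (assumptions, not real configuration values).
fixed_salt = b"example-salt-A"
other_salt = b"example-salt-B"

# Same IP, same salt: the pseudonym is stable.
a1 = obfuscate_ip("192.0.2.1", fixed_salt)
a2 = obfuscate_ip("192.0.2.1", fixed_salt)

# Same IP, different salt: the pseudonym changes.
b = obfuscate_ip("192.0.2.1", other_salt)

# Different IP, same salt: a different pseudonym.
c = obfuscate_ip("192.0.2.2", fixed_salt)

print("stable under fixed salt:", a1 == a2)
print("changes when salt varies:", a1 != b)
print("distinct IPs stay distinct:", a1 != c)
```

With a fixed salt a given IP always maps to the same value, which is the uniqueness property asked about. As soon as the salt varies, the same IP maps to unrelated values, so "varying salt" and "a given IP always gets the same hash" cannot both hold for the same pair of edits.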
