SSL makes it more difficult; some private wikis are already restricted to SSL. We also have to consider that irc.wikimedia.org has a recent changes feed.
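To illustrate how little it takes to capture that feed, here is a minimal Python sketch of the parsing step a logger would need. It assumes the feed arrives as ordinary IRC PRIVMSG lines carrying mIRC colour codes. The function name, the sample line, its sender and its channel are made up for illustration, and the connection handling is left out.

```python
import re

# mIRC formatting: \x03 colour codes with optional fg[,bg] digits,
# plus bold, reset, reverse, italic and underline control characters.
_FORMATTING = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?|[\x02\x0f\x16\x1d\x1f]")


def parse_privmsg(raw_line):
    """Return (channel, text) for an IRC PRIVMSG line, or None for anything else."""
    line = raw_line.rstrip("\r\n")
    if line.startswith(":"):
        # Drop the ":nick!user@host" prefix.
        _, _, line = line.partition(" ")
    command, _, rest = line.partition(" ")
    if command != "PRIVMSG":
        return None
    channel, _, text = rest.partition(" :")
    return channel, _FORMATTING.sub("", text)


# Hypothetical sample line; the real feed's layout is an assumption here.
sample = (
    ":rc-bot!bot@example.invalid PRIVMSG #en.wikipedia "
    ":\x0314[[\x0307Example\x0314]]\x03 edit summary\r\n"
)

if __name__ == "__main__":
    print(parse_privmsg(sample))
```

Anyone who can join a channel can run something like this over each line received and append the result to a file, so edits that are hidden later would already have been recorded.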
At minimum, the transit links should be encrypted if feasible. A good reason not to encrypt is that it's extra performance overhead.

On Sun, Dec 29, 2013 at 11:10 PM, John Vandenberg <jay...@gmail.com> wrote:
> We know NSA wants Wikipedia data, as Wikipedia is listed in one of the
> NSA slides:
>
> https://commons.wikimedia.org/wiki/File:KS8-001.jpg
>
> That slide is about HTTP, and the tech staff are moving the
> user/reader base to HTTPS.
>
> As we learn more about the NSA programs, we need to consider vectors
> other than HTTP for the NSA to obtain the data they want. And the
> userbase needs to be aware of the current risks.
>
> One question from the "Dells are backdored"[sic] thread that is worth
> separate consideration is:
>
> Are the Wikimedia transit links encrypted, especially for database
> replication?
> MySQL has replication over SSL, so I assume the answer is Yes.
>
> If not, is this necessary or useful, and feasible ?
>
> However we also need to consider that SSL and other encryption may be
> useless against NSA/etc, which means replicating non-public data
> should be avoided wherever possible, as it becomes a single point of
> failure.
>
> Given how public our system is, we don't have a lot of non-public
> data, so we might be able to design the architecture so that
> information isnt replicated, and also ensure it isnt accessed over
> insecure links. I think the only parts of the dataset that are
> private & valuable are
> * passwords/login cookies,
> * checkuser info - IPs and useragents,
> * WMF analytics, which includes readers iirc, and
> * hidden/deleted edits
> * private wikis and mailing lists
>
> Have I missed any?
>
> Are passwords and/or checkuser info replicated?
>
> Is there a data policy on WMF analytics data which prevents it flowing
> over insecure links, and limits what is collected and ensures
> destruction of the data within reasonable timeframes? i.e. how about
> not using cookies to track analytics of readers who are on HTTP
> instead of HTTPS?
>
> The private wikis can be restricted to https, depending on the value
> of the data on those wikis in the wrong hands. The private mailing
> lists will be harder to secure, and at least the English Wikipedia
> arbcom list contain a lot of valuable data about contributors.
>
> Regarding hidden/deleted edits, the replication isnt the only source
> of this data. All edits are also exposed via Recent Changes
> (https/api/etc) as they occur, and the value of these edits is
> determined by the fact they are hidden afterwards (e.g. don't appear
> in dumps). Is there any way to control who is effectively capturing
> all edits via Recent Changes?
>
> --
> John Vandenberg
>
> _______________________________________________
> Wikimedia-l mailing list
> Wikimedia-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>