BBlack added a comment.
In T331356#8718619 <https://phabricator.wikimedia.org/T331356#8718619>, @MisterSynergy wrote:

> Some remarks:
>
> - We should consider these canonical HTTP URIs to be //names// in the first place, which are unique worldwide and issued by the Wikidata project as the "owner" [1] of the wikidata.org domain. The purpose of these //names// is to identify things.

If they're only names, that's relatively fine. However, there are user agents that end up following them as access URIs. If we could control every agent, we could require that they all upconvert to HTTPS for access, but we can't.

> - Following linked data principles, it is no coincidence that these names happen to be valid URIs. These are meant to be used to look up information about the named entity. It is okay to redirect a canonical URI to another location, including of course to a secure HTTPS location.

The problem with relying on redirects is that they're insecure. The initial request goes over the wire in the clear, as does the initial redirect response. Both can be hijacked, modified, censored, and surveilled before the redirect to HTTPS ever happens. An advanced agent on the wire (like a national telecom) can even persistently hijack a whole session this way, by proxying the traffic into our servers as HTTPS. We support redirects as a "better than breakage/nothing" solution, but ideally UAs shouldn't ever use insecure HTTP to begin with. This is why all of our Canonical URIs (in the HTTP/HTML sense) begin with `https`, as evidenced in all the normal pageviews' `<link rel="canonical" href="https://...` tags.

> - To my understanding, HSTS can be used to secure all but the first request of a client (that supports HSTS).

It can be, and we even participate in HSTS Preload for all of our canonical domains as well, which protects even the first request to a domain from browsers which use the preload list. However, there are many clients, especially bots and scripted tools, which rely on HTTP libraries or CLI tools which do not, by default, honor HSTS or load the preload list.

> - Canonical HTTP URIs are still widespread in many other linked data resources, since many projects started issuing these before everything transitioned to HTTPS. Some projects have transitioned to canonical HTTPS URIs, however, with GND doing this in 2019 being a prominent example [3].

This would be the ideal end outcome: that we're able to transition the URLs to be HTTPS everywhere. Barring that, we could also look at where and how they're being emitted. We may have HTML page outputs which are rendering these canonical URIs for access purposes, where it would make sense to convert them to HTTPS as part of the rendering process to cut down on the problem.
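
For bots and scripted tools that don't honor HSTS, the "upconvert to HTTPS for access" idea mentioned above could look roughly like the sketch below. This is only an illustration, not anything we ship: the `HTTPS_ONLY_HOSTS` allowlist and the `to_access_url` helper are hypothetical names made up for this example, and the entity URI in the usage note is just a sample. The idea is that the `http://` form stays the name, and only the URL actually fetched is rewritten.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical allowlist for this sketch: hosts the client should never
# contact over plain HTTP. A real tool would choose its own list.
HTTPS_ONLY_HOSTS = {"wikidata.org", "www.wikidata.org"}


def to_access_url(name_uri: str) -> str:
    """Return the URL to fetch for a canonical (possibly http://) URI.

    The input is treated as a name and left unchanged unless it uses the
    http scheme, targets an allowlisted host, and has no explicit port.
    """
    parts = urlsplit(name_uri)
    host = (parts.hostname or "").lower()
    if parts.scheme == "http" and host in HTTPS_ONLY_HOSTS and parts.port is None:
        # Rewrite only the scheme; path, query and fragment are kept as-is.
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

A client would call `to_access_url("http://www.wikidata.org/entity/Q42")` just before making the request and fetch the returned `https://` URL, so the first request never goes out in the clear. URIs for hosts outside the allowlist pass through untouched.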