BBlack added a comment.

  In T331356#8718619 <https://phabricator.wikimedia.org/T331356#8718619>, 
@MisterSynergy wrote:
  
  > Some remarks:
  >
  > - We should consider these canonical HTTP URIs to be //names// in the first 
place, which are unique worldwide and issued by the Wikidata project as the 
"owner" [1] of the wikidata.org domain. The purpose of these //names// is to 
identify things.
  
  If they're only names, that's relatively fine.  However, there are user 
agents that end up following them as access URIs.  If we could control every 
agent, we could require that they all upconvert to HTTPS for access, but we 
can't.
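  
  For agents we do control, the upconversion is simple enough.  A minimal 
sketch, assuming a hypothetical allow-list of hosts known to serve HTTPS 
(`access_uri` and `HTTPS_CAPABLE_HOSTS` are made-up names, and Q42 is only an 
example entity):

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical allow-list: hosts assumed to serve every HTTP name over HTTPS too.
HTTPS_CAPABLE_HOSTS = {"www.wikidata.org", "wikidata.org"}


def access_uri(name_uri: str) -> str:
    """Return the URI to dereference for a canonical name.

    The name itself is left alone; only the URI used for access is upgraded.
    """
    parts = urlsplit(name_uri)
    host = parts.hostname
    if parts.scheme != "http" or host not in HTTPS_CAPABLE_HOSTS:
        return name_uri
    # Leave unusual forms (explicit non-80 port, userinfo) untouched.
    if parts.port not in (None, 80) or "@" in parts.netloc:
        return name_uri
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(access_uri("http://www.wikidata.org/entity/Q42"))
```

  The stored or published name stays `http://...`; only the request target 
changes.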
  
  > - Following linked data principles, it is no coincidence that these names 
happen to be valid URIs. These are meant to be used to look up information 
about the named entity. It is okay to redirect a canonical URI to another 
location, including of course to a secure HTTPS location.
  
  The problem with relying on redirects is that they're insecure.  The 
initial request goes over the wire in the clear, as does the initial redirect 
response.  They can both be hijacked, modified, censored, and surveilled, 
before the redirect to HTTPS ever happens.  An advanced agent on the wire (like 
a national telecom) can even persistently hijack a whole session this way, by 
proxying the traffic into our servers as HTTPS.
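  
  To make the exposure concrete, here is a small sketch that picks out the 
hops of a redirect chain that crossed the wire in the clear.  The chain below 
is illustrative only (I'm assuming its shape, not quoting a real trace), and 
`cleartext_hops` is a made-up helper name:

```python
from urllib.parse import urlsplit


def cleartext_hops(chain):
    """Return the URLs in a redirect chain that were requested over plain HTTP.

    For each of these, both the request and the redirect response were
    visible to, and modifiable by, anyone on the network path.
    """
    return [url for url in chain if urlsplit(url).scheme == "http"]


if __name__ == "__main__":
    # Illustrative chain: an HTTP name followed by HTTPS redirect targets.
    chain = [
        "http://www.wikidata.org/entity/Q42",
        "https://www.wikidata.org/entity/Q42",
        "https://www.wikidata.org/wiki/Special:EntityData/Q42",
    ]
    print(cleartext_hops(chain))
```

  Even when only the first hop is listed, that one hop is enough for an 
attacker to replace the redirect target.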
  
  We support redirects as a "better than breakage/nothing" solution, but 
ideally UAs shouldn't ever utilize insecure HTTP to begin with.  This is why 
all of our Canonical URIs (in the HTTP/HTML sense) begin with `https`, as 
evidenced in all the normal pageviews' `<link rel="canonical" 
href="https://...` tags.
  
  > - To my understanding, HSTS can be used to secure all but the first request 
of a client (that supports HSTS).
  
  It can be, and we even participate in HSTS Preload for all of our canonical 
domains as well, which protects even the first request to a domain from 
browsers which use the preload list.  However, there are many clients, 
especially bots and scripted tools, which rely on HTTP libraries or CLI tools 
which do not, by default, honor HSTS or load the preload list.
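  
  For script authors who want similar protection, a rough sketch of what an 
HSTS-aware wrapper could look like follows.  This is a simplification, not a 
full RFC 6797 implementation; the `HstsStore` class is hypothetical, and the 
preloaded host in the usage example is an assumption for illustration, not a 
copy of the real preload list:

```python
import time
from urllib.parse import urlsplit, urlunsplit


class HstsStore:
    """Tiny in-memory HSTS cache (sketch only)."""

    def __init__(self, preloaded=()):
        # host -> (expiry timestamp, include_subdomains).
        # Preloaded hosts never expire here and cover their subdomains.
        self._hosts = {h.lower(): (float("inf"), True) for h in preloaded}

    def note_header(self, host, header_value, now=None):
        """Record a Strict-Transport-Security header seen on an HTTPS response."""
        now = time.time() if now is None else now
        max_age, include_sub = None, False
        for directive in header_value.split(";"):
            name, _, value = directive.strip().partition("=")
            name = name.strip().lower()
            if name == "max-age":
                try:
                    max_age = int(value.strip().strip('"'))
                except ValueError:
                    return  # malformed header: ignore it entirely
            elif name == "includesubdomains":
                include_sub = True
        if max_age is None:
            return  # max-age is required
        if max_age == 0:
            self._hosts.pop(host.lower(), None)  # max-age=0 clears the entry
        else:
            self._hosts[host.lower()] = (now + max_age, include_sub)

    def is_hsts(self, host, now=None):
        now = time.time() if now is None else now
        labels = host.lower().split(".")
        # Check the host itself, then each parent domain for includeSubDomains.
        for i in range(len(labels)):
            entry = self._hosts.get(".".join(labels[i:]))
            if entry and entry[0] > now and (i == 0 or entry[1]):
                return True
        return False

    def upgrade(self, url, now=None):
        """Rewrite http:// to https:// when the host is a known HSTS host."""
        parts = urlsplit(url)
        host = parts.hostname
        if parts.scheme != "http" or not host or not self.is_hsts(host, now):
            return url
        # Port 80 (or none) maps to the default HTTPS port; other ports are kept.
        netloc = host if parts.port in (None, 80) else f"{host}:{parts.port}"
        return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    store = HstsStore(preloaded=["wikidata.org"])  # assumed preload entry
    print(store.upgrade("http://www.wikidata.org/entity/Q42"))
```

  A tool would call `upgrade()` before every request and `note_header()` 
after every HTTPS response that carries the header.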
  
  > - Canonical HTTP URIs are still widespread in many other linked data 
resources, since many projects have started issuing these before everything 
transitioned to HTTPS. Some projects have transitioned to canonical HTTPS URIs, 
however, with GND doing this in 2019 being a prominent example [3].
  
  This would be the ideal end outcome: that we're able to transition the URLs 
to be HTTPS everywhere.  Barring that, we could also look at where and how 
they're being emitted.  We may have HTML page outputs which are rendering these 
canonical URIs for access purposes, where it would make sense to convert them 
to HTTPS as part of the rendering process to cut down on the problem.
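  
  As a sketch of what that rendering-time conversion might look like 
(`render_entity_link` is a made-up name, not an existing function in our 
code): the visible text keeps the canonical HTTP name, while the `href` that 
browsers actually follow is HTTPS.

```python
from html import escape


def render_entity_link(canonical_uri: str) -> str:
    """Render a link whose text is the canonical name and whose href is HTTPS."""
    href = canonical_uri
    if href[:7].lower() == "http://":
        # Only the access URI is upgraded; the displayed name is unchanged.
        href = "https://" + href[7:]
    return f'<a href="{escape(href, quote=True)}">{escape(canonical_uri)}</a>'


if __name__ == "__main__":
    print(render_entity_link("http://www.wikidata.org/entity/Q42"))
```

  Machine-readable outputs (RDF and the like) would keep emitting the name 
as-is; this only touches links intended for navigation.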

TASK DETAIL
  https://phabricator.wikimedia.org/T331356

To: BBlack
Cc: OlafJanssen, MisterSynergy, BCornwall, Bugreporter, Ennomeijers, Nikki, 
Volans, Aklapper, BBlack, Astuthiodit_1, KOfori, karapayneWMDE, joanna_borun, 
Invadibot, Devnull, maantietaja, Muchiri124, ItamarWMDE, Akuckartz, 
Legado_Shulgin, ReaperDawn, Nandana, Davinaclare77, Techguru.pc, Lahi, Gq86, 
GoranSMilovanovic, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, 
Scott_WUaS, Wong128hk, Wikidata-bugs, aude, faidon, Mbch331, Jay8g, fgiunchedi