Ryan,

Proposals like this might allow video-intensive sites to migrate to HTTPS sooner than otherwise and are thus very welcome. This one was originally suggested by Anne van Kesteren, I believe, or at least something very similar.

However, this particular proposal suffers (IIUC) from the disadvantage that users are likely to be presented with mixed content warnings. That's not an acceptable user experience for a professional commercial service.

I understand the reasons that mixed content warnings are presented: the properties of the HTTP media requests do not align with the user expectation of privacy and security which is set by the presence of the "green padlock" or other UI indications of secure transport. A viable interim solution - without such warnings - either needs to avoid setting this expectation or to include additional measures such that the warnings were not necessary. If the latter, we'd need to evaluate whether such measures were worthwhile as an interim step or whether the investment would be better spent on the move to HTTPS proper.

…Mark

On Thu, Feb 19, 2015 at 9:06 PM, Ryan Sleevi <sle...@google.com> wrote:

> Cross-posting, as this touches on the Fetch [1] spec, Media Source
> Extensions [2], and Mixed Content [3]. This does cross-post WHATWG and
> W3C, apologies if this is a mortal sin.
>
> TL;DR Proposal first:
> - Amend MIX in [4] to add "fetch" as an
>   optionally-blockable-request-context
>   * This means that fetch() can now return HTTP content from HTTPS
>     pages. The implications of this, however, are described below, if you
>     can handle reading it all.
> - Amend MSE in [5] to introduce a new method, appendResponse(Response
>   response), which accepts a Response [6] class
> - In MSE, define a Response Append Loop similar to the Stream Append
>   Loop [7], that calls the consume body algorithm [8] on the internal
>   response [9] of Response to yield an ArrayBuffer, then executes the
>   buffer append [10] algorithm on the SourceBuffer
>
>
> MUCH longer justification why:
> As it stands, <audio>/<video>/<source> tags today are optionally
> blockable content, as noted in [4]. Thus, an HTTPS page may set the
> source to HTTP content and load the content (although typically with
> user-agent indication).
> MSE poses itself as a spec to offer much
> greater control to site authors than <audio>/<video>, as noted in its
> use cases, and as a result, has seen a rapid adoption among a number
> of popular video streaming sites. Most notably, the ability to do
> adaptive streaming with MSE helps provide a better quality, better
> performing experience for users. Finally, in some user agents, MSE is
> a pre-requisite for the use of Encrypted Media Extensions [11].
>
> However, there are limitations to using MSE that don't exist with
> <video>/<audio>. The most notable of these is that in order to
> implement the adaptive streaming capabilities, most sites make use of
> XMLHttpRequest to request portions of media content, which can then be
> supplied to the SourceBuffer. Based on the feedback that MSE provides
> the script author, they can then adjust the XHRs they make to use a
> lower bitrate media source, to drop segments, etc. When using XHR, the
> site author loses the ability to mix HTTPS pages with HTTP media, as
> XHR is (rightfully so) treated as blocked content.
>
> The justification for why XHR does this is that it returns the full
> buffer to the page author. In practice, we saw many sites then taking
> that buffer and making security decisions on it - whether it be
> "clearly" bad things such as eval()ing the content to more subtle
> things like adjusting UI or links. All of these undermine all of the
> security guarantees that HTTPS tries to provide, and thus XHR is
> blocked.
>
> The result is that if an HTTPS site wants to use MSE with XHR, all of
> the content needs to be served via HTTPS. We've already seen some
> providers complain that this is prohibitively expensive in their
> current networks [12], although it may be solvable in time, as
> demonstrated by other video sharing sites [13].
>
> In a choice between using MSE - which offers a better user experience
> over <video>/<audio> by reducing bandwidth and improving quality - and
> using HTTPS - which offers better privacy and security controls -
> sites are likely to choose solutions that reduce their costs rather
> than protect their users, a reasonable but unfortunate business
> reality.
>
> I'm hoping to find a way to close that gap - to allow sites to use MSE
> (and potentially EME) via HTTPS documents, while still sourcing their
> media content via HTTP. This may seem counter-intuitive, and a step
> back from the efforts of the Chrome security team, but I think it is
> actually consistent with our goals and our past comments. In
> particular, this solution tries to provide a means and incentive for
> sites to adopt MSE (improving user experience) AND to begin migrating
> to HTTPS; first with their main document, and then, in time, all of
> their media content.
>
> This won't prevent adversaries from knowing what content the user is
> actively watching, for example, but will help protect other vital
> assets - such as their cookies, session identifiers, user information,
> friends list, past viewing history, etc.
>
> Allowing fetch() to return HTTP content sourced from HTTPS pages seems
> like it would re-open the XHR hole, but this isn't the case. As
> described in [14], all requests whose mode is CORS or
> CORS-with-forced-preflight are force-failed. This only leaves the
> request modes of "no-cors", "same-origin", "about" and "data". Because
> the origins are different between the document (https) and the request
> URL (http), the request mode will be "no-cors", and thus the returned
> Response object will be set to "opaque".
>
> The "opaque" response prevents direct access to the Response data.
> Similarly, the SourceBuffer object does not allow direct access to the
> data - this is only passed on to the audio/video decoders, same as the
> existing <audio>/<video>/<source> tags today.
> I realize this may
> prevent access to the full capabilities of MSE; indeed, some use cases
> require access to the content in order to do adaptive streaming.
> However, there still seem to be a number of use cases where it can
> work, or where existing solutions that do require direct access to
> content may be adjusted, slightly, so that they don't.
>
> In discussing this, internally and with other vendors, the primary
> security implication of this is that of privacy leakage. However, this
> problem exists regardless of fetch(), due to the fact that script can
> always inject any of the optionally-blockable content tags into the
> page and leak information. That is, I can always disclose content by
> using a <video> or <img> tag directly, and I can always smuggle back a
> few bits of information at a time (for example, using the width/height
> of the image to smuggle back 4-8 bytes at a time, or, even more
> primitively, using onload/onerror to smuggle a bit at a time back).
>
> Further, I'm not proposing that there be any special UI handling for
> these mixed-content fetch()'s - that is, they behave as the user agent
> already does when encountering passive mixed content (e.g. some form
> of UI warning/degradation). So performing these fetch()'s will NOT
> yield positive security indicators. Of course, as proposals like [15]
> mature, it may be far more desirable for sites to have HTTPS with
> mixed content compared to HTTP, thus making this proposal even more
> attractive than the HTTP counterpart.
>
> Overall, the hope is to provide incentives for media sharing sites to
> begin migrating to HTTPS, allowing them to keep the existing features
> they have over HTTP (in this case, MSE), and potentially allowing for
> a migration path that allows staged deprecation of more powerful,
> privacy-sensitive features like EME [16] over HTTP, while not taking
> any steps backwards in terms of privacy or security for fetch() or
> HTTPS pages.
>
> This is not meant to be a long-term solution for optionally-blockable
> content. I absolutely think we should be working to wean sites off
> this and move them away. However, in the trade-off between having
> major sites using HTTP or having to prolong optionally-blockable
> content for some additional, defined period of time, I absolutely
> believe the latter is in the greater interest of web security, and
> consistent with the findings of the W3C's TAG [17].
>
> So, beyond telling me I wrote way too much, what do people think?
>
> [1] https://fetch.spec.whatwg.org/
> [2] http://w3c.github.io/media-source/
> [3] https://w3c.github.io/webappsec/specs/mixedcontent/
> [4] https://w3c.github.io/webappsec/specs/mixedcontent/#category-optionally-blockable
> [5] http://w3c.github.io/media-source/#sourcebuffer
> [6] https://fetch.spec.whatwg.org/#response-class
> [7] http://w3c.github.io/media-source/#sourcebuffer-stream-append-loop
> [8] https://fetch.spec.whatwg.org/#body
> [9] https://fetch.spec.whatwg.org/#concept-internal-response
> [10] http://w3c.github.io/media-source/#sourcebuffer-buffer-append
> [11] https://w3c.github.io/encrypted-media/
> [12] https://lists.w3.org/Archives/Public/www-tag/2014Oct/0105.html
> [13] https://www.youtube.com/
> [14] https://w3c.github.io/webappsec/specs/mixedcontent/#should-block-fetch
> [15] https://lists.w3.org/Archives/Public/public-webappsec/2014Dec/0062.html
> [16] https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332
> [17] https://w3ctag.github.io/web-https/
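
As a rough illustration of the request-mode reasoning in the quoted proposal, here is a minimal TypeScript sketch. It is a toy model under stated assumptions, not a browser API: the names classifyMixedFetch, isMixed, RequestMode and Outcome are made up for this example, the media URL is a placeholder, and appendResponse() exists only as the method proposed in the quoted message. It assumes, as the proposal describes, that CORS-mode requests from an HTTPS page to an HTTP URL are force-failed and that a "no-cors" request yields an opaque response.

```typescript
// Toy model of the mixed-content fetch() behaviour described in the proposal.
// All names here are hypothetical; nothing below is a real browser API.

type RequestMode =
  | "same-origin"
  | "no-cors"
  | "cors"
  | "cors-with-forced-preflight";

type Outcome = "blocked" | "opaque";

// Extract the lowercase URL scheme, e.g. "https" from "https://a.example/".
function scheme(url: string): string {
  const colon = url.indexOf(":");
  return colon < 0 ? "" : url.slice(0, colon).toLowerCase();
}

// A request counts as "mixed" here when an HTTPS document requests an HTTP URL.
function isMixed(documentUrl: string, requestUrl: string): boolean {
  return scheme(documentUrl) === "https" && scheme(requestUrl) === "http";
}

// Outcome of a mixed fetch() under the proposal, keyed by request mode.
function classifyMixedFetch(mode: RequestMode): Outcome {
  switch (mode) {
    case "cors":
    case "cors-with-forced-preflight":
      // Force-failed, per the proposal's reading of the Mixed Content spec.
      return "blocked";
    case "same-origin":
      // The http: URL is cross-origin to the https: document, so a
      // same-origin request cannot succeed.
      return "blocked";
    case "no-cors":
      // Script receives an opaque Response: no access to the body bytes.
      return "opaque";
  }
}

// Page-side usage would look roughly like this (appendResponse is the
// proposed, hypothetical SourceBuffer method; the URL is a placeholder):
//
//   const response = await fetch("http://media.example/seg1.mp4",
//                                { mode: "no-cors" });
//   sourceBuffer.appendResponse(response);
```

In this model the only mode that returns anything to the page is "no-cors", and what it returns is opaque, which is the property the proposal relies on to avoid re-opening the XHR hole.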