On 2017-03-03 01:02, James Roper wrote:
How about you misunderstanding the fact that a hash can only
ever guarantee that two resources are different. A hash can not
guarantee that two resources are the same. A hash does imply a high
probability they are the same but can never guarantee it, such is
the nature of a hash. A carefully tailored jquery.js that
matches the hash of the "original jquery.js" could be crafted and
contain a hidden payload. Now the browser suddenly injects this
script into all websites that the user visits that use that
particular version of jquery.js, which I'd call an extremely serious
security hole. You can't rely on length either as that could also
be padded to match the length. Not to mention that this is also
crossing the CORS threshold (the first instance is from a
different domain than the current page is for example). Accidental
(natural) collision probabilities for sha256/sha384/sha512 are very
low, but intentional ones are higher than accidental ones.
This is completely wrong. No one has *ever* produced an intentional
collision in sha256 or greater.
Huh? When did I ever state that? I have never said that sha256 or higher
has been broken; do not put words/lies in my mouth please. I find
that highly offensive.
I said "could", just ask any cryptographer. It is highly improbable but
theoretically possible, and fully impractical to attempt (the current state
of quantum computing has not shown any magic bullet yet).
I'm equally concerned with a natural collision; while the probability is
incredibly small, the chance is 50/50 (if we imagine all files containing
random data and file lengths, which they don't).
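
For a rough sense of scale on natural collisions, the usual birthday
approximation can be sketched as below. This assumes ideal, uniformly
random hash values (which real files do not give you), and the function
name and the file counts are made up purely for illustration.

```python
import math

def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    exponent = num_items * (num_items - 1) / (2 * 2**hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)

if __name__ == "__main__":
    # Hypothetical population: a trillion distinct files.
    for bits in (32, 160, 256):
        print(bits, collision_probability(10**12, bits))
```

With a 32-bit checksum the estimate passes 50% at roughly 77,000 items,
while for 256 bits and a trillion items it stays far below anything
measurable.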
And as to my statement "a hash can only ever guarantee that two
resources are different. A hash can not guarantee that two resources are
the same" again that is true. You can even test this by using small
enough hashes (CRC-4 or something simple) and editing a file and you'll
see that what I say is true.
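
A minimal sketch of that small-hash test. I'm assuming a 4-bit hash made
by truncating sha256 as a stand-in for CRC-4 (Python has no CRC-4 in the
standard library), and the "file-N" inputs are just placeholders. With
only 16 possible values, 17 different inputs must produce a repeat.

```python
import hashlib

def tiny_hash(data: bytes, bits: int = 4) -> int:
    """Keep only the top `bits` bits (1..8) of the first sha256 byte."""
    return hashlib.sha256(data).digest()[0] >> (8 - bits)

def find_collision(bits: int = 4):
    """Return two different inputs that share the same tiny hash."""
    seen = {}
    # 2**bits + 1 inputs guarantee a repeat (pigeonhole principle).
    for i in range(2**bits + 1):
        msg = f"file-{i}".encode()
        h = tiny_hash(msg, bits)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    raise AssertionError("unreachable: more inputs than hash values")

if __name__ == "__main__":
    first, second, value = find_collision()
    print(first, second, value)
```

Two inputs with different hashes are certainly different; the two inputs
returned here have equal hashes and are still not the same data.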
You know how these types of hashes work, right? They are NOT UNIQUE;
if you want something unique then those are called "perfect hashes",
which is not something you want to use for cryptography.
If a hash like sha256 was unique it would be a compression miracle as
you could then just "uncompress" the hash.
Only if the data you hash is the same size as the hash can you perfectly
re-create the data that is hashed. Which is what I proposed with my UUID
suggestion.
Do note that I'm talking about Version 1 UUIDs and not the random
Version 4 ones which are not unique.
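
To make the distinction concrete, here is a small sketch using Python's
standard uuid module. The helper name is my own; the point is only that a
Version 1 UUID is built from a timestamp plus a node id, while a Version 4
UUID is built from random bits.

```python
import uuid

def describe(u: uuid.UUID) -> dict:
    """Summarise the fields that matter for this discussion."""
    info = {"uuid": str(u), "version": u.version}
    if u.version == 1:
        # Only Version 1 carries a meaningful timestamp and node id.
        info["timestamp_100ns"] = u.time
        info["node"] = f"{u.node:012x}"
    return info

if __name__ == "__main__":
    print(describe(uuid.uuid1()))  # time and node based
    print(describe(uuid.uuid4()))  # random
```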
In case you missed the headlines, last week Google announced it
created a sha1 collision. That is the first, and only known sha1
collision ever created. This means sha1 is broken, and must not be used.
Now it's unlikely (as in, it's not likely to happen in the history of
a billion universes), but it is possible that at some point in the
history of sha256 that a collision was accidentally created. This
probability is non zero, which is greater than the impossibility of
intentionally creating a collision, hence it is more likely that we
will get an accidental collision than an intentional collision.
Sha1 still has its uses. Now I haven't checked, but sha1, just like md5, is
still OK to use with HMAC. Also it's odd that you say sha1 should not be
used at all. Nothing wrong with using it as a file hash/checksum. With
the number of files and the increase in data, CRC32 is not that useful
(unless you divide the file in chunks and provide a CRC32 array instead).
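
The chunked CRC32 idea could look roughly like this. It's a sketch using
zlib.crc32 from the Python standard library; the chunk size and function
names are arbitrary choices of mine.

```python
import zlib

def crc32_chunks(data: bytes, chunk_size: int = 64 * 1024) -> list[int]:
    """Return one CRC32 per fixed-size chunk of the data."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def damaged_chunks(expected: list[int], data: bytes,
                   chunk_size: int = 64 * 1024) -> list[int]:
    """Return the indexes of chunks whose CRC32 no longer matches."""
    actual = crc32_chunks(data, chunk_size)
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]

if __name__ == "__main__":
    original = bytes(range(256)) * 4          # 1024 bytes of sample data
    checksums = crc32_chunks(original, 256)   # four chunks
    corrupted = bytearray(original)
    corrupted[300] ^= 0xFF                    # damage one byte in chunk 1
    print(damaged_chunks(checksums, bytes(corrupted), 256))
```

The array also tells you which chunk is damaged, so only that part needs
to be fetched again.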
A hash is not the right way to do what you want, a UUID and a (or
multiple) trusted shared cache(s) is.
The issue with using a hash is that at some point sha256 could become
deprecated, does the browser start ignoring it then? Should it behave as if
the javascript file had no hash or that it's potentially dangerous now?
Also take note that a UUID can also be made into a valid URI, but I
suggested adding an attribute as that would make older browsers/versions
"forward compatible" as the URI still works normally.
And to try and not entirely run your idea into the ground: it's not
detailed enough. By that I mean you would need a way for the web designer
to inform the browser that they do not want the scripts hosted on their
site replaced by those from another site. Now requiring an opt-out is a
pain in the ass, and where security is concerned one should never have to
"opt out to get more secure", one should by default be more secure.
Which means that you would need to add another attribute or modify the
integrity one to allow cache sharing.
Now myself I would never do that, even if the hash matches I'd never
feel comfortable running a script originating from some other site in
the page I'm delivering to my visitor.
I would not actually want the browser to even cache my script and
provide that to other sites pages.
I might however feel comfortable adding a UUID and let the browser fetch
that script from its local cache or from a trusted cloud cache.
If you are going to use the integrity attribute for authentication then
you also need to add a method of revocation so that if for example the
hashing used is deemed weak/compromised (due to say a so far
undiscovered design flaw), then only the browsers that are up to date
will be able to consider those hashes unsafe. Older browsers will be
clueless and all of a sudden some porn site includes manipulated
banking.js and whenever an older browser with a stale cache encounters
that it replaces that and the next time the user goes to their bank the
browser will happily use a trojan script instead. The end result is that
banks etc. will not use the integrity attribute or they will serve a
differently versioned script for each visit/page load, which kinda nukes
caching in general.
Remember, you did not specify an opt-in/opt-out for the shared integrity
based caching.
You might say that this is all theoretical, but you yourself proclaimed
sha1 is no longer safe. Imagine if the most popular version of jquery
became a trojan, we're talking tens of thousands of very high profile
sites as possible victims of cache poisoning.
Now I'm not saying the integrity attribute is useless, for CDNs it's
pretty nice. It ensures that when your site uses say awesomescript12.js
that it is awesomescript12.js and not a misnamed awesomescript10.js or
worse notsoawesomescript4.js.
But, at this point you already trust the CDN (why else would you use
them, right?)
Another thing the integrity hash is great for is to reduce the chance of
a damaged script being loaded (sha512 has way more bits than CRC32 for
example).
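
For reference, an integrity value is just the algorithm name, a dash, and
the base64 of the raw digest. A minimal sketch follows; the helper names
are mine, and it handles only a single hash with no options, which is
simpler than what a browser actually parses.

```python
import base64
import hashlib

ALLOWED = ("sha256", "sha384", "sha512")

def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity string such as 'sha384-<base64 digest>'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"

def matches(content: bytes, integrity: str) -> bool:
    """Check fetched content against a single integrity value."""
    algorithm = integrity.split("-", 1)[0]
    if algorithm not in ALLOWED:
        return False
    return sri_value(content, algorithm) == integrity

if __name__ == "__main__":
    script = b"console.log('awesomescript12');"   # placeholder content
    value = sri_value(script)
    print(f'<script src="awesomescript12.js" integrity="{value}"></script>')
    print(matches(script, value), matches(script + b" ", value))
```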
And if I was to let a webpage fetch a script from a CDN I would probably
use the integrity attribute, but that is because I trust that CDN.
If a browser just caches the first of whatever it encounters and then uses
that for all subsequent requests for that script then I want no part of
that, it's a security boundary I'm not willing to cross, hash or no
hash. So an opt-in would be essential on this.
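
To show what I mean by opt-in plus revocation, here is a purely
hypothetical sketch of a shared cache policy. Nothing like this exists in
any browser as far as I know; the class, its fields, and the sharing flags
are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SharedScriptCache:
    # Algorithms the browser has been told to stop trusting (hypothetical).
    revoked_algorithms: set = field(default_factory=set)
    _entries: dict = field(default_factory=dict)

    def store(self, integrity: str, body: bytes,
              origin_allows_sharing: bool) -> None:
        """Only keep scripts whose origin explicitly allowed sharing."""
        if origin_allows_sharing:
            self._entries[integrity] = body

    def lookup(self, integrity: str, page_opted_in: bool) -> Optional[bytes]:
        """Serve from the shared cache only for pages that opted in."""
        algorithm = integrity.split("-", 1)[0]
        if not page_opted_in or algorithm in self.revoked_algorithms:
            return None  # fall back to a normal network fetch
        return self._entries.get(integrity)

if __name__ == "__main__":
    cache = SharedScriptCache()
    cache.store("sha384-EXAMPLE", b"/* script */", origin_allows_sharing=True)
    print(cache.lookup("sha384-EXAMPLE", page_opted_in=True))
    print(cache.lookup("sha384-EXAMPLE", page_opted_in=False))
```

Note that the revocation list only helps browsers that actually receive
it, which is exactly the stale-browser problem described above.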
Now many sites have their own CDN; I assume these are your focus. But
many use global ones (sometimes provided directly/indirectly with the
blessing of the developers of a script). I don't see this as a major
caching issue. The main issue is multiple versions of a script. Many
scripts are not always that backward compatible, I have seen cases where
there are 3-4 versions of the same script on the same site. A shared
browser cache may help with that if those are the unedited official
scripts of jquery but usually they may not be. They may also be run
through a minifier or similar or they have been minified but not with the
same settings as the official one.
This is why I stress that a UUID based idea is better on the whole. As
the focus would be on the versions/APIs/interoperability instead. I.e.
v1.1 and v1.2 have the exact same calls just some bug fixes? They can
both be given the same UUID and the CDN or trusted cache will provide
v1.2 all the time.
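
As a rough sketch of that UUID idea: a trusted cache keeps one entry per
UUID and always hands out the newest release registered under it. The
registry class, the version tuples, and the file names here are all
hypothetical.

```python
import uuid

class CompatRegistry:
    """Maps one UUID per compatible API to its newest release."""

    def __init__(self) -> None:
        self._latest = {}  # UUID -> (version tuple, script name)

    def register(self, api_id: uuid.UUID, version: tuple, script: str) -> None:
        current = self._latest.get(api_id)
        # Keep only the highest version registered under this UUID.
        if current is None or version > current[0]:
            self._latest[api_id] = (version, script)

    def resolve(self, api_id: uuid.UUID) -> str:
        """Return the script the cache would serve for this UUID."""
        return self._latest[api_id][1]

if __name__ == "__main__":
    registry = CompatRegistry()
    api_id = uuid.uuid1()  # issued once for this API generation
    registry.register(api_id, (1, 1), "awesomescript-1.1.js")
    registry.register(api_id, (1, 2), "awesomescript-1.2.js")
    print(registry.resolve(api_id))
```

A release that breaks the API would get a new UUID instead of reusing the
old one.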
PS! Not trying to sound like an ass here but could you trim the email
next time? While I do enjoy hearing my own voice/reading my own text as
much as the next person there is no need to quote the whole thing. Also
why did you CC me a full quote of my email but did not write anything
yourself, did you hit reply by accident or is there a bug in the email
system somewhere?
Which brings me to a nitpick of mine, if you reply to the list then
there is no need to also CC me. If I'm posting to the list then I'm
also reading the list, I'd rather not have multiple email copies in my
inbox. Hit the "Reply to list" button instead of "Reply to all" next
time (these options depend on your email client).
--
Roger Hågensen,
Freelancer, Norway.