[Bug 28419] Replace MD5 password hashing with WHIRLPOOL

bugzilla-daemon Fri, 20 Jul 2012 15:36:30 -0700

https://bugzilla.wikimedia.org/show_bug.cgi?id=28419


--- Comment #58 from Daniel Friesen <[email protected]> 
2012-07-20 22:36:20 UTC ---
There should be nothing overly confusing about it (beyond perhaps some
documentation tweaks, etc... that might need improvement).

There are 3 parts to the password system; only 2 of the 3 need to be simple:
- Integration of actual password algorithms into the system should be simple.
The only complexity should be the password algorithm itself. Details such as
parameter parsing should not be something the implementation has to deal with
unless there is a specific reason for it to handle them (eg: you're wrapping a
library rather than writing a first hand implementation).
- Usage of the password system itself should be simple too. Usage should not
involve anything more complex than calling methods like Password::crypt and
storing the resulting data. Comparison (or rather, validation) should be just
as simple and should not require any extra assumptions from the caller. Hence
Password::compare should be separate from crypt, and callers should not have to
assume things like serialized data outputs being the same.
- The last part of the system is the stack itself that password implementations
integrate into and usage of the system makes calls to. This part of the system
should NOT be simple. This part of the system should be complex enough that it
can handle all of the common scenarios and abstract them away from both the
password implementation integration and the usage of the system. Every place
you try to simplify this part of the system pushes extra complexity and
security burden onto the usage and integration code, where it does not belong.

It's like an ssl/tls stack. It's easy to write a protocol and wrap it inside
ssl/tls (like the integration part of the system). And it's easy to write a
client and use a ssl/tls stack to handle the transport (like the usage part of
the system). But the ssl/tls stack itself is insanely complex (like the
internals of the password system).

----

The idea of :A: being fixed while the stuff after it is variable is also a
sound implementation pattern. It's a container format for a stack, just like a
tcp packet inside of an ip packet.

Given: ":B:0549900c:8490f5e1c4283c1986a8a59b287d74dc"

The bottom (outermost, lowest) level container is ":B:[...]". It consists of a :
followed by data indicating the type of data contained inside the body, followed
by a terminating :, and finally the remainder is the body used by the next level
of the stack. It's like the ip packet level of the ip stack.

The next level of the container is "[...]:8490f5e1c4283c1986a8a59b287d74dc",
consisting of a series of params separated by : with the final part of the data
being defined as the hash. This level of the stack introduces the
concept of hash comparison and high level parameters for the next level of the
stack. This is like the tcp level of the ip stack.

The final (top, highest) level of the container works with the high level data
params=[0549900c] and the hash. This part of the stack understands that the
only parameter is a salt and understands how to make a hash out of it along
with a password. The stack component directly below uses that data to do the
hash comparison. This level of the stack is like say the HTTP, IRC, etc...
level of the stack.

So if you want to relate the password stack to the ip stack: writing a standard
password implementation is like implementing HTTP, etc... on top of tcp, which
sits on top of ip, while implementing a crypt() handler is like implementing udp
directly on top of ip instead of using tcp.
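
To make the layering concrete, here is a rough sketch of how the two lower
levels could peel apart the example value. The function names splitContainer
and splitParamsAndHash are made up for this illustration and are not taken from
the patch; the real stack may well structure this differently.

```php
<?php
// Hypothetical sketch of the layered container format; none of these
// function names come from the actual patch.

// Lowest level (the "ip" layer): ":TYPE:body" -> array( type, body ).
function splitContainer( $data ) {
    if ( !preg_match( '/^:([^:]+):(.*)$/s', $data, $m ) ) {
        throw new InvalidArgumentException( 'Not a typed password container' );
    }
    return array( $m[1], $m[2] );
}

// Middle level (the "tcp" layer): "p1:p2:...:hash" -> array( params, hash ).
function splitParamsAndHash( $body ) {
    $parts = explode( ':', $body );
    $hash = array_pop( $parts ); // the final part is defined as the hash
    return array( $parts, $hash );
}

list( $type, $body ) = splitContainer( ':B:0549900c:8490f5e1c4283c1986a8a59b287d74dc' );
list( $params, $hash ) = splitParamsAndHash( $body );
// $type is 'B', $params is array( '0549900c' ),
// $hash is '8490f5e1c4283c1986a8a59b287d74dc'.
```

Only the top level (the type B implementation) needs to know that the single
param is a salt; the two functions above never look inside it.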

----

The result is simple to use when you use it much like a library or component
(eg: Most of us can utilize the Parser class and can implement hooks into the
parser class. But the parser itself as a component is so massive and complex
few people try to touch the component itself).

The relevant code, using ':B:' as an example is:
## Integration; Integrating the password algorithm into the codebase.
## Done at the highest level
class Password_TypeB extends BasePasswordType {
    protected function run( $params, $password ) {
        list( $salt ) = self::params( $params, 1 );
        return md5( $salt . '-' . md5( $password ) );
    }
    protected function cryptParams() {
        $salt = MWCryptRand::generateHex( 8 );
        return array( $salt );
    }
}
##--

This implements a level to handle the 'B' type of password hash. The code
starts with a run() method to take a high level list of params and a password
and convert that into a hash. The B algorithm consists of combining the salt
defined as the first parameter, a '-', and a md5ed version of the password then
putting that through md5. It's a simple hash based password algorithm so it
uses the BasePasswordType layer to abstract away common hash comparisons and
parameter parsing. The code finishes up by defining a method to create default
params (a random salt) for a new password.

## Likewise, usage is also simple.
Password::crypt( $password ) => $data // Prepare password data for storage
Password::compare( $data, $password ) => verified? // Verify a password against stored data
##--
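
For anyone wondering what the layer in between might look like, below is a
minimal sketch of a base class that provides crypt and compare on top of run()
and cryptParams(). This is an assumption about the shape of that layer, not the
code from the patch: the class names SketchBasePasswordType and SketchTypeB are
hypothetical, random_bytes() stands in for MWCryptRand::generateHex, and the
outer ":B:" container level is left out to keep it short.

```php
<?php
// Hypothetical sketch only: the real BasePasswordType in the patch may differ.
abstract class SketchBasePasswordType {
    // Implemented by each algorithm: turn params plus a password into a hash.
    abstract protected function run( $params, $password );

    // Implemented by each algorithm: default params for a new password.
    abstract protected function cryptParams();

    // Check the parameter count so implementations do not have to.
    protected static function params( $params, $count ) {
        if ( count( $params ) !== $count ) {
            throw new InvalidArgumentException( "Expected $count parameter(s)" );
        }
        return $params;
    }

    // Build the "params:hash" body for a new password.
    public function crypt( $password ) {
        $params = $this->cryptParams();
        $hash = $this->run( $params, $password );
        return implode( ':', array_merge( $params, array( $hash ) ) );
    }

    // Re-run the algorithm with the stored params and compare the hashes.
    public function compare( $body, $password ) {
        $params = explode( ':', $body );
        $hash = array_pop( $params );
        return hash_equals( $hash, $this->run( $params, $password ) );
    }
}

class SketchTypeB extends SketchBasePasswordType {
    protected function run( $params, $password ) {
        list( $salt ) = self::params( $params, 1 );
        return md5( $salt . '-' . md5( $password ) );
    }

    protected function cryptParams() {
        // Stand-in for MWCryptRand::generateHex( 8 ): 8 random hex characters.
        return array( bin2hex( random_bytes( 4 ) ) );
    }
}

$typeB = new SketchTypeB();
$stored = $typeB->crypt( 'correct horse' );
$okRight = $typeB->compare( $stored, 'correct horse' );
$okWrong = $typeB->compare( $stored, 'wrong' );
```

The point of the sketch is that the type B class stays as small as the one
above it, while the comparison and parameter handling live in one place.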

----
Before I finish off the last of this I should also point out something else on
the topic of keeping crypt and compare separate at the highest level.

You yourself pointed out the PHP proposal for password handling.
Taking a look at it, password_hash and password_verify are separate. In other
words, in order to support using this native code as a password implementation,
crypt and compare must be separate.
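
A quick way to see this, using password_hash and password_verify as they ship
in PHP 5.5 and later (the password string is just an example value):

```php
<?php
// password_hash() picks a fresh random salt on every call, so two results
// for the same password differ, yet password_verify() accepts both.
$first = password_hash( 'correct horse', PASSWORD_DEFAULT );
$second = password_hash( 'correct horse', PASSWORD_DEFAULT );

$differ = ( $first !== $second );
$firstOk = password_verify( 'correct horse', $first );
$secondOk = password_verify( 'correct horse', $second );
$wrongOk = password_verify( 'wrong', $first );
```

Calling password_hash a second time and comparing strings would reject a
correct password, so verification has to go through password_verify.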

----
(In reply to comment #57)
> Where the hell did everybody on MW learn the definition of clean and
> consistent? Think about what you're saying. Given two hashes that take the
> exact same options and exact same passwords, it's somehow "consistent" for it
> to spit out different hashes each time? That does not make any sense, nor can I
> think of any semantic reason that this would occur. A password hash is just a
> set of options and then a hash produced from the combination of the password,
> those options, and a hashing algorithm. Even the native PHP password hashing
> API, which is being implemented for PHP 5.5, the equality of crypt() and
> compare() holds true (they actually do exactly what I am proposing and just
> call crypt() and then compare the hashes). Yet for some reason MW needs an
> interface that supports not only any format hash ever created or even thought
> of, but it also has to support hashing algorithms that just decide to randomly
> change their formatting at will.

There is a very important thing to point out here.
We're NOT talking about hashes or hashing functions. We're talking about the
stored format of passwords.
Preparing a password is different from just hashing. If this were just a hash
we wouldn't have parameter data to mess with and there would be nothing extra
to consider.

Password storage is a different category of cryptography. It has a different
goal than what the cryptographic hashes category intends to provide.

Different categories of cryptography act differently. Hashes function in a way
that expects the same output for every run. But this does not hold true for
other areas of cryptography. Take a look at some asymmetric signing and even
symmetric MAC algorithms. You'll find that for some of the algorithms, each time
they sign a piece of text the signature comes out different. Despite that, the
signature can still be verified. What's important is not strict equality
verification but cryptographic equality verification. This is
consistent within cryptography, but not consistent in output (hence
something you cannot do with the crypt() pattern).
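
As a toy illustration of output that differs on every run but still verifies,
here is a randomized MAC built from a random nonce plus HMAC-SHA256. The
construction and the names toySign and toyVerify are made up for this example;
it is not a vetted scheme and not something from the patch.

```php
<?php
// Toy randomized MAC, for illustration only (not a vetted construction):
// tag = nonce . HMAC-SHA256( key, nonce . message ).
function toySign( $key, $message ) {
    $nonce = random_bytes( 16 );
    return $nonce . hash_hmac( 'sha256', $nonce . $message, $key, true );
}

function toyVerify( $key, $message, $tag ) {
    // 16 byte nonce + 32 byte raw HMAC-SHA256 output.
    if ( strlen( $tag ) !== 48 ) {
        return false;
    }
    $nonce = substr( $tag, 0, 16 );
    $expected = hash_hmac( 'sha256', $nonce . $message, $key, true );
    return hash_equals( $expected, substr( $tag, 16 ) );
}

$key = random_bytes( 32 );
$tagA = toySign( $key, 'hello' );
$tagB = toySign( $key, 'hello' );
// $tagA and $tagB differ because of the nonce, but both verify for 'hello'.
```

Signing again and comparing the two tags byte for byte would fail here, which
is the same reason compare cannot be defined as "crypt again and check
equality".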

Right now we use the term "password hashes" because initially all we did with
passwords was hash them; there was no separate concept of password hashing. And
while we've expanded from that, for now we're still using the hash category of
cryptography, so we call them password hashes.
However that doesn't mean it'll stay that way forever. It's possible that in
the future someone may have a new idea on how to interfere with password
cracking. And this time it may come from a different area of cryptography like
signing.

Heck, I could almost see that right now. Someone deciding that instead of just
making a password hash they will create a layer around that which will include
an out-of-band cryptographic signature or perhaps encryption into the output
instead of simply including it into some hash.

