https://bugzilla.wikimedia.org/show_bug.cgi?id=28419
--- Comment #58 from Daniel Friesen <[email protected]> 2012-07-20 22:36:20 UTC ---
There should be nothing overly confusing about it (beyond perhaps some documentation that might need improvement).

There are 3 parts to the password system; only 2 of the 3 should be simple:

- Integration of actual password algorithms into the system should be simple. The only complexity should be the password algorithm itself. Details such as parameter parsing should not be something the implementation has to deal with, unless there is some prior reason for it to handle them (eg: you're dealing with a library rather than a first-hand implementation).

- Usage of the password system itself should be simple too. Usage should not have any complexity beyond calling methods like Password::crypt and storing the resulting data. Comparison (or rather, validation) should also be simple for the caller and should not require any extra assumptions. Hence Password::compare should be separate from crypt, and callers should not have to assume things like serialized outputs being identical.

- The last part of the system is the stack itself, which password implementations integrate into and which usage code makes calls to. This part of the system should NOT be simple. It should be complex enough to handle all of the common scenarios and abstract them away from both the password implementation integration and the usage of the system. Every piece of this part that you try to make simple pushes extra complexity and security burden onto the usage and integration code, where it does not belong.

It's like an ssl/tls stack. It's easy to write a protocol and wrap it inside ssl/tls (like the integration part of the system). And it's easy to write a client and use an ssl/tls stack to handle the transport (like the usage part of the system).
But the ssl/tls stack itself is insanely complex (like the internals of the password system).

----

The idea of :A: being fixed while the stuff after it is variable is also a sound implementation pattern. It's a container format for a stack, just like a tcp packet inside of an ip packet.

Given: ":B:0549900c:8490f5e1c4283c1986a8a59b287d74dc"

The bottom (outermost, lowest) level container is ":B:[...]". It consists of a : followed by data indicating the type of data contained inside the body, followed by a terminating :, and finally the remainder, which is the body used by the next level of the stack. It's like the ip packet level of the ip stack.

The next level of the container is "[...]:8490f5e1c4283c1986a8a59b287d74dc", consisting of a series of params separated by : with the final part of the data being defined as the hash. This level of the stack introduces the concept of hash comparison and high level parameters for the next level of the stack. This is like the tcp level of the ip stack.

The final (top, highest) level of the container works with the high level data params=[0549900c] and the hash. This part of the stack understands that the only parameter is a salt and understands how to make a hash out of it along with a password. The stack component directly below uses that data to do the hash comparison. This level of the stack is like the HTTP, IRC, etc. level of the stack.

So if you want to relate the password stack to the network stack: writing a standard password implementation is like implementing HTTP, etc. on top of tcp, which sits on top of ip, while implementing a crypt() handler is like implementing udp directly on top of ip instead of using tcp.

----

The result is simple to use when you use it much like a library or component (eg: most of us can utilize the Parser class and can implement hooks into the Parser class, but the parser itself as a component is so massive and complex that few people try to touch the component itself).
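The two lower levels of the container format described above can be sketched in a few lines of PHP. This is only an illustration under assumptions: the helper names splitContainer() and splitParams() are hypothetical and do not come from the patch under discussion; only the ":B:salt:hash" layout is taken from the example string above.

```php
<?php
// Hypothetical helpers, named here only for illustration.

// Outermost level: ":TYPE:body" -> array( type, body ). Comparable to the ip level.
function splitContainer( $data ) {
    if ( !preg_match( '/^:([^:]+):(.*)$/s', $data, $m ) ) {
        throw new InvalidArgumentException( 'Not a typed password container' );
    }
    return array( $m[1], $m[2] );
}

// Middle level: "param:param:...:hash" -> array( params, hash ). Comparable to the tcp level.
function splitParams( $body ) {
    $parts = explode( ':', $body );
    $hash = array_pop( $parts ); // the final field is defined as the hash
    return array( $parts, $hash );
}

list( $type, $body ) = splitContainer( ':B:0549900c:8490f5e1c4283c1986a8a59b287d74dc' );
list( $params, $hash ) = splitParams( $body );
// $type is 'B', $params is array( '0549900c' ), $hash is the 32-character hex string.
```

The top level (the type-specific code) never sees the raw string; it only receives $params and the password, which is what keeps an algorithm integration small.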
The relevant code, using ':B:' as an example, is:

## Integration; Integrating the password algorithm into the codebase.
## Done at the highest level
class Password_TypeB extends BasePasswordType {

    // Turn the high level params and a password into a hash.
    protected function run( $params, $password ) {
        list( $salt ) = self::params( $params, 1 );
        return md5( $salt . '-' . md5( $password ) );
    }

    // Default params for a new password: a random salt.
    protected function cryptParams() {
        $salt = MWCryptRand::generateHex( 8 );
        return array( $salt );
    }

}
##--

This implements a level to handle the 'B' type of password hash. The code starts with a run() method that takes a high level list of params and a password and converts them into a hash. The B algorithm consists of combining the salt (defined as the first parameter), a '-', and an md5ed version of the password, then putting that through md5. It's a simple hash based password algorithm, so it uses the BasePasswordType layer to abstract away common hash comparisons and parameter parsing. The code finishes up by defining a method to create default params (a random salt) for a new password.

## Usage is likewise simple.
Password::crypt( $password ) => $data // Prepare password data for storage
Password::compare( $data, $password ) => verified? // Verify a password against stored data
##--

----

Before I finish off the last of this I should also point out something else on the topic of keeping crypt and compare separate at the highest level. You yourself pointed out the php proposal for password handling. Taking a look at it, password_hash and password_verify are separate. In other words, in order to support using this native code as a password implementation, crypt and compare must be separate.

----

(In reply to comment #57)
> Where the hell did everybody on MW learn the definition of clean and
> consistent? Think about what you're saying. Given two hashes that take the
> exact same options and exact same passwords, it's somehow "consistent" for it
> to spit out different hashes each time?
> That does not make any sense, nor can I
> think of any semantic reason that this would occur. A password hash is just a
> set of options and then a hash produced from the combination of the password,
> those options, and a hashing algorithm. Even the native PHP password hashing
> API, which is being implemented for PHP 5.5, the equality of crypt() and
> compare() holds true (they actually do exactly what I am proposing and just
> call crypt() and then compare the hashes). Yet for some reason MW needs an
> interface that supports not only any format hash ever created or even thought
> of, but it also has to support hashing algorithms that just decide to randomly
> change their formatting at will.

There is a very important thing to point out here. We're NOT talking about hashes or hashing functions. We're talking about the stored format of passwords. Preparing a password is different from just hashing. If this were just a hash we wouldn't have parameter data to mess with and there would be nothing extra to consider.

Password storage is a different category of cryptography. It has a different goal than what the cryptographic hash category intends to provide. Different categories of cryptography act differently. Hashes function in a way that expects the same output for every run. But this does not hold true for other areas of cryptography. Take a look at some asymmetric signing and even symmetric mac algorithms. You'll find that for some of these algorithms the signature comes out different each time they sign a piece of text. Despite that, the signature can still be verified.

What's important is not strict equality verification but cryptographic equality verification. This is consistent within cryptography, but not consistent within output (hence something you cannot do with the crypt() pattern).

Right now we use the term "password hashes" because initially all we did with passwords was hash them; there was no distinct concept of password hashing.
And while we've expanded from that, for now we're still using the hash category of cryptography, so we call them password hashes. However, that doesn't mean it'll stay that way forever. It's possible that in the future someone may have a new idea on how to interfere with password cracking, and this time it may come from a different area of cryptography, like signing.

Heck, I could almost see that right now: someone deciding that instead of just making a password hash they will create a layer around it which adds an out-of-band cryptographic signature, or perhaps encryption, to the output instead of simply folding it into some hash.
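The difference between strict equality and cryptographic verification argued for in this comment can be seen in a small PHP sketch. It is only an illustration under assumptions: cryptB() and compareB() are hypothetical stand-ins for Password::crypt and Password::compare restricted to the 'B' type, random_bytes() and hash_equals() are used instead of MWCryptRand so the block runs on its own, and md5 appears only because the B type in this discussion uses it.

```php
<?php
// Hypothetical stand-ins for Password::crypt / Password::compare, 'B' type only.

function cryptB( $password ) {
    $salt = bin2hex( random_bytes( 4 ) ); // 8 hex characters, fresh on every call
    return ':B:' . $salt . ':' . md5( $salt . '-' . md5( $password ) );
}

function compareB( $data, $password ) {
    $parts = explode( ':', $data ); // ":B:salt:hash" -> array( '', 'B', salt, hash )
    if ( count( $parts ) !== 4 || $parts[0] !== '' || $parts[1] !== 'B' ) {
        return false;
    }
    list( , , $salt, $hash ) = $parts;
    // Recompute with the stored salt and compare in constant time.
    return hash_equals( $hash, md5( $salt . '-' . md5( $password ) ) );
}

$first = cryptB( 'correct horse' );
$second = cryptB( 'correct horse' );
// The two stored values will almost always differ because each call draws a
// new salt, yet both verify against the same password.
$firstOk = compareB( $first, 'correct horse' );
$secondOk = compareB( $second, 'correct horse' );
$wrongOk = compareB( $first, 'wrong password' );
```

A caller that instead re-ran cryptB() and compared the strings would reject a valid password, which is why comparison has to go through a separate compare step.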
