Hi moxie, might I ask if you've considered SHA-384 instead?
If I understand correctly, SHA-256 is part of the SHA-2 family of hash algorithms, and like SHA-1, when used alone it is subject to length extension attacks.
SHA-384 is also a member of the SHA-2 algorithm family, but is immune to length extension attacks because it runs with an internal state size of 512 bits -- by emitting fewer bits than its total internal state, length extensions are ruled out. (Wikipedia has a great table for clarifying all these confusing names and families of hashes: [5].) Other hashes like BLAKE2 [1], though young, also promise built-in immunity to length-extension attacks. mdm [2] is immune to this because the relevant git datastructures all include either explicit field lengths as a prefix, or are sorted lists with null terminators, both of which defuse length extension attacks by virtue of breaking their data format if extended.
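As a quick concrete check of the truncation point, a minimal sketch with Python's hashlib (the input byte string is arbitrary):

```python
import hashlib

# SHA-384 and SHA-512 share the same compression function and the same
# 512-bit internal state; SHA-384 simply truncates the output.
assert hashlib.sha512(b"x").digest_size == 64  # full 512-bit state emitted
assert hashlib.sha384(b"x").digest_size == 48  # only 384 of 512 state bits

# Because a SHA-384 digest reveals only part of the final internal state,
# an attacker cannot resume the hash chain from it -- whereas a SHA-512
# (or SHA-256) digest IS the full final state.
```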
Not that it's by any means easy to find a SHA-256 collision at present; but should collisions be found in the future, a length extension attack would increase the leverage for using those collisions to produce binaries that slip past this verification. The MD5 Collision Demo[3] by Peter Selinger is my favourite site for concretely demonstrating what this looks like (though I think this[4] publication by CITS mentions the relationship to length extension more explicitly).
(I probably don't need to lecture to you of all people about length extensions :) but it's a subject I just recently refreshed myself on, and I wanted to try to leave a decent explanation here for unfamiliar readers.)
--
I'm also curious how you handled management of checksums for transitive dependencies. I recall we talked about this subject in private back in April, and one of the concerns you had with mdm was the challenge of integrating it with existing concepts of "artifacts" from the maven/gradle/etc world -- though there is an automatic importer from maven now, mdm still requires explicitly specifying every dependency.
Have you found ways to insulate against gradle downloading updates to plugins or components of itself?
What happens when a dependency adds new transitive dependencies? I guess that's not a threat during normal rebuilds, since specifying hashes ahead of time already essentially forbids loosely semver-ish resolution of dependencies at every rebuild, but if it does happen during an upgrade, does gradle-witness hook into gradle deeply enough that it can generate warnings for new dependencies that aren't watched?
This plugin looks like a great hybrid approach that keeps what you like from gradle while starting to layer on "pinning" integrity checks. I'll recommend it to colleagues building their software with gradle.
P.S. is the license on gradle-witness such that I can fork or use the code as inspiration for writing an mdm+gradle binding plugin? I'm not sure if it makes more sense to produce a gradle plugin, or to just add transitive resolution tools to mdm so it can do first-time setup like gradle-witness does on its own, but I'm looking at it!
--
Edited: to also link the wikipedia chart of hash families.
Tldr: The length extension property of the SHA-2 family has nothing to do with collisions. If you are afraid of future cryptanalytic breakthroughs regarding the collision resistance of SHA-2, use the concatenation of SHA-256 and SHA3-256.
You are mightily confused about the connection between length extensions and collisions. (Bear with me; I know you are already familiar with length extensions, but I need to introduce the notation to explain your misunderstanding. It is also nice to introduce other readers to the topic.)

All cryptographic hash functions in practical use are iterative. That is, the message is first padded to a multiple of the block size and then cut into blocks b_1, ..., b_n. The core of the hash function is a compression function c (or maybe we should better call it a mix-in function if we also have the SHA-3 winner Keccak in mind) that takes the internal state i_k and a block b_(k+1) and outputs the new internal state i_(k+1) = c(i_k, b_(k+1)). The hash function has a pre-defined initial state i_0. Finally, there is a finalization function f that takes the last internal state i_n (the state after processing the last message block) and outputs the result of the hash function.

MD5, SHA-1, and the SHA-2 family have the problem that the finalization step is just the identity function, i.e. i_n is used directly as the output of the hash function. Thus, given the output, you can continue the hash chain directly and calculate the hash of a longer message b_1, ..., b_n, b_(n+1), ..., b_(n+m).
Normally this is not that interesting (you could have just calculated the hash of the long message directly, avoiding the padding complications I'm ignoring here to keep it simple). But in some situations you might not actually know b_1, ..., b_n. For example, suppose you calculate an authentication tag as t = h(k||m), where k is a secret key you share with the API and m is the message you want to authenticate. Since t is published, an attacker can pick up the internal state i_n = t and calculate the authentication tag of an extension: t2 = c(i_n, m2) = h(k||m||m2), which authenticates m||m2. Flickr had this problem with its API: http://vnhacker.blogspot.de/2009/09/flickrs-api-signature-fo... . If you want to do authentication tags right, use HMAC: https://en.wikipedia.org/wiki/Hash-based_message_authenticat...
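To make the iteration and the extension concrete, here is a toy sketch in Python. The compression function, block size, and initial state are made up for illustration (not real primitives); everything else follows the Merkle-Damgård shape described above, with SHA-2-style length padding:

```python
import struct

BLOCK = 8                 # toy block size in bytes
IV = 0x0123456789ABCDEF   # fixed initial state i_0

def compress(state: int, block: bytes) -> int:
    # Toy compression function mixing one 8-byte block into a 64-bit state.
    # (Illustrative only; not a real cryptographic primitive.)
    x = int.from_bytes(block, "big")
    s = ((state ^ x) * 0x9E3779B97F4A7C15) % 2**64
    return ((s << 7) | (s >> 57)) % 2**64

def pad(msg: bytes) -> bytes:
    # SHA-2-style padding, simplified: zero-fill, then the message length.
    n = len(msg)
    return msg + b"\x00" * (-(n + 8) % BLOCK) + struct.pack(">Q", n)

def absorb(state: int, data: bytes) -> int:
    # Iterate the compression function over block-aligned data.
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_hash(msg: bytes) -> int:
    # Merkle-Damgard with identity finalization: the final internal
    # state IS the output -- exactly the property enabling extension.
    return absorb(IV, pad(msg))

key, msg, ext = b"key", b"message", b",evil"
tag = toy_hash(key + msg)   # published tag t = h(k||m)

# The attacker knows only tag, len(key), msg, and ext -- not key itself.
glue = pad(key + msg)[len(key + msg):]          # predictable padding bytes
total = len(key + msg) + len(glue) + len(ext)   # length of extended message
suffix = ext + b"\x00" * (-(len(ext) + 8) % BLOCK) + struct.pack(">Q", total)
forged = absorb(tag, suffix)   # resume the chain from the published state

# The forgery matches an honest hash of the extended message:
assert forged == toy_hash(key + msg + glue + ext)
```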
Now, a collision describes the case where two messages b_1, ..., b_k and d_1, ..., d_l produce the same hash, i.e. the output of the finalization function is identical. SHA-2 does not have a finalization function, and I would not believe that collisions in SHA-3, if they are possible at all, would only be possible in the finalization step (the squeezing step, in the terminology of the Keccak sponge function), so I'll ignore finalization as a source of collisions for the moment. A collision thus means that after processing the k blocks b_1, ..., b_k and the l blocks d_1, ..., d_l, the internal state is identical.

As the hash function is iterative, we can continue both block lists with identical blocks e_1, ..., e_j and still have the same internal state. Since the state is identical, it does not matter which finalization function we apply; the output is identical, and b_1, ..., b_k, e_1, ..., e_j and d_1, ..., d_l, e_1, ..., e_j hash to the same value. Thus, whenever you have one collision in the internal state, you can produce infinitely many colliding messages. This property is independent of the finalization function, and BLAKE2, SHA-384, Skein, and SHA-3 all allow such attacks (once one collision is found). If anything, a fancy finalization function can make things worse by mapping two different internal states to the same output. But because of the problems with length extension attacks (see the API example above), a one-way finalization function is a standard requirement for hash functions nowadays.
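This propagation of state collisions can be demonstrated with a deliberately tiny toy hash (a made-up 16-bit iterative construction, small enough that a birthday search actually finds a state collision; the pigeonhole principle guarantees one among the first 2^16 + 1 distinct messages):

```python
import itertools

def compress16(state: int, byte: int) -> int:
    # Toy 16-bit compression function, one input byte per "block".
    # (Made up for illustration; not a real primitive.)
    s = ((state ^ byte) * 0x9E37) % 2**16
    return ((s << 3) | (s >> 13)) % 2**16

def state_after(msg: bytes, state: int = 0xBEEF) -> int:
    # Internal state after absorbing all blocks of msg (no finalization).
    for byte in msg:
        state = compress16(state, byte)
    return state

# Birthday-search 3-byte messages for an internal-state collision.
seen = {}
a = b = None
for m in itertools.product(range(256), repeat=3):
    m = bytes(m)
    s = state_after(m)
    if s in seen:
        a, b = seen[s], m   # two distinct messages, same internal state
        break
    seen[s] = m

assert a != b and state_after(a) == state_after(b)

# Once the internal states collide, ANY common suffix preserves the
# collision -- and any finalization function applied to equal states
# yields equal outputs, so finalization cannot repair this.
suffix = b"any common continuation"
assert state_after(a + suffix) == state_after(b + suffix)
```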
If you are really concerned about future cryptanalysis of SHA-2, you can spread the risk across several hash functions, where each has to be broken to break the security of the overall construction. These are called hash combiners. The simplest one is the concatenation of two hashes: SHA-256(m)||SHA3-256(m). This one will be collision resistant as long as either SHA-256 or SHA3-256 is collision resistant. There are also combiners for other properties, like pseudorandomness, and also multi-property combiners. See for example http://tuprints.ulb.tu-darmstadt.de/2094/1/thesis.lehmann.pd... .
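For completeness, the concatenation combiner is a one-liner with Python's hashlib, which ships both SHA-256 and SHA3-256 (the input bytes here are just a placeholder):

```python
import hashlib

def combined_hash(data: bytes) -> bytes:
    # Concatenation combiner: collision resistant as long as EITHER
    # component stays collision resistant, since a collision on the
    # concatenation requires a collision on both halves simultaneously.
    return hashlib.sha256(data).digest() + hashlib.sha3_256(data).digest()

tag = combined_hash(b"some artifact bytes")
assert len(tag) == 64  # 32 bytes of SHA-256 || 32 bytes of SHA3-256
```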
It amplifies collision finding, yielding an r-collision in time log2(r) * 2^(n/2) if r = 2^t for some t. Figure 1 in section 3 is a very intuitive picture of how.
The third paragraph of section 5 states that the attack is inapplicable on hashes with truncated output.
Although, now that you mention it, my dusty crypto knowledge is not enough for me to explain why this technique would be inapplicable to the internal states of a decomposed hash even if it's inapplicable to the final output. I shall have to read more.
I agree that if one can find a way to apply an HMAC, it should also obviate concerns around length extension; and also that (carefully) combining hashes should yield a construction that breaks only when both components are broken.
> The third paragraph of section 5 states that the attack is inapplicable on hashes with truncated output.
They actually refer to the large internal state size, which makes the generic attack infeasible (for a state size of n bits you need about 2^(n/2) tries to find a collision on average).
> in the second, the attacker needs collisions in the full internal state of the hash function, rather than on the truncated states.
But as both SHA-256 and SHA3-256 have internal state sizes >= 256 bits, these are definitely enough for the foreseeable future to protect against generic attacks. More interesting is the question whether you can combine specialized cryptanalysis of two different hashes to build multicollisions. Apparently you can, at least for MD5 and SHA-1: http://www.iacr.org/archive/asiacrypt2009/59120136/59120136....
--
[1] https://blake2.net/
[2] https://github.com/polydawn/mdm/
[3] http://www.mscs.dal.ca/~selinger/md5collision/
[4] http://web.archive.org/web/20071226014140/http://www.cits.ru...
[5] https://en.wikipedia.org/wiki/SHA-512#Comparison_of_SHA_func...