From: Gavin Andresen
To: Ethan Heilman
Cc: Bitcoin Dev
Date: Thu, 7 Jan 2016 18:39:58 -0500
Subject: Re: [bitcoin-dev] Time to worry about 80-bit collision attacks or not?

Thanks, Ethan, that's helpful and I'll stop thinking that collision attacks require 2^(n/2) memory...

So can we quantify the incremental increase in security of SHA256(SHA256) over RIPEMD160(SHA256) versus the incremental increase in security of having a simpler implementation of segwitness?

I'm going to claim that the difference in the first case is very, very, very small; the risk of an implementation error caused by having multiple ways of interpreting the segwitness hash in the scriptPubKey is much, much greater.

And even if there IS some risk of collision attack now or at some point in the future, I claim that it is easy for wallets to mitigate that risk. In fact, the principle of security in depth means wallets that don't completely control the scriptPubKeys they're creating on behalf of users SHOULD be coded to mitigate that risk (e.g. not allowing arbitrary data around a user's public key in a Script so targeted substring attacks are eliminated entirely).
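
One way to read "not allowing arbitrary data around a user's public key" is a strict template check before a wallet hashes a script. The Python sketch below is only an illustration under stated assumptions: the function name `is_plain_multisig` is hypothetical, only compressed public keys are accepted, and the only template allowed is bare m-of-n multisig. It is not taken from any wallet's code.

```python
# Bitcoin Script opcode values used by the template.
OP_1, OP_16 = 0x51, 0x60
OP_CHECKMULTISIG = 0xAE
PUSH_33 = 0x21  # direct push of 33 bytes (a compressed public key)
KEY_ITEM = 34   # one push opcode plus a 33-byte key


def is_plain_multisig(script: bytes) -> bool:
    """Accept only: OP_m <33-byte key> ... OP_n OP_CHECKMULTISIG, nothing else.

    Hypothetical helper: any extra push, comment data, or unknown opcode
    makes the script fail the check, so an outside party cannot wrap
    attacker-chosen bytes around the user's key.
    """
    if len(script) < 3 or script[-1] != OP_CHECKMULTISIG:
        return False
    m_op, n_op = script[0], script[-2]
    if not (OP_1 <= m_op <= n_op <= OP_16):
        return False
    n = n_op - OP_1 + 1
    body = script[1:-2]
    if len(body) != n * KEY_ITEM:
        return False
    for i in range(0, len(body), KEY_ITEM):
        # Each item must be a 33-byte push of a key starting with 0x02 or 0x03.
        if body[i] != PUSH_33 or body[i + 1] not in (0x02, 0x03):
            return False
    return True
```

A wallet following this approach would refuse to derive an address from any script that fails the check, rather than trying to detect malicious data after the fact.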

Purely from a security point of view, I think a single 20-byte segwitness in the scriptPubKey is the best design.
"Keep the design as simple and small as possible" https://www.securecoding.cert.org/confluence/plugins/servlet/mobile#content/view/2426

Add in the implied capacity increase of smaller scriptPubKeys and I still think it is a no-brainer.


On Thu, Jan 7, 2016 at 5:56 PM, Ethan Heilman <eth3rs@gmail.com> wrote:
>Ethan: your algorithm will find two arbitrary values that collide. That isn't useful as an attack in the context we're talking about here (both of those values will be useless as coin destinations with overwhelming probability).

I'm not sure exactly the properties you want here and determining
these properties is not an easy task, but the case is far worse than
just two random values. For instance: (a). with a small modification
my algorithm can also find collisions containing targeted substrings,
(b). length extension attacks are possible with RIPEMD160.

(a). targeted cycles:

target1 = "str to prepend"
target2 = "str to end with"

seed = {0,1}^160
x = hash(seed)

for i in 2^80:
....x = hash(target1||x||target2)
x_final = x

y = hash(target1||x_final||target2)

for j in 2^80:
....if y == x_final:
........print "cycle len: "+j
........break
....y = hash(target1||y||target2)

If a collision is found, the two colliding inputs must both start with
"str to prepend" and end with the phrase "str to end with". As before,
this only requires 2^81.5 computations and no real memory. For an
additional 2^80 computations, an adversary has a good chance of finding two
different targeted substrings which collide. Consider the case where
the attacker mixes the targeted strings with the hash output:

hash("my name is=0x329482039483204324423"+x[1]+", my favorite number is="+x)

where x[1] is the first bit of x.
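
A scaled-down Python sketch of the targeted-cycle search may help make this concrete. Everything here is an assumption for illustration and not part of the original message: the hash is SHA-256 truncated to 32 bits so the search finishes quickly, Floyd's tortoise-and-hare replaces the cycle-length loop above so that the colliding pair itself is recovered, and the names `toy_hash`, `step`, and `find_targeted_collision` are made up.

```python
import hashlib

TARGET1 = b"str to prepend"
TARGET2 = b"str to end with"
NBYTES = 4  # toy 32-bit hash; the real attack would target 160 bits


def toy_hash(data: bytes) -> bytes:
    """SHA-256 truncated to NBYTES, standing in for a 160-bit hash."""
    return hashlib.sha256(data).digest()[:NBYTES]


def step(x: bytes) -> bytes:
    """One iteration: hash the value wrapped in the two target strings."""
    return toy_hash(TARGET1 + x + TARGET2)


def find_targeted_collision(seed: bytes):
    """Return (a, b) with a != b and step(a) == step(b), in constant memory.

    The seed must not be NBYTES long. Every value on the cycle is an
    NBYTES-long hash output, so such a seed can never lie on the cycle
    and a collision at the cycle entry always exists.
    """
    if len(seed) == NBYTES:
        raise ValueError("seed length must differ from the hash length")
    # Phase 1: tortoise takes one step, hare takes two, until they meet.
    tortoise, hare = step(seed), step(step(seed))
    while tortoise != hare:
        tortoise = step(tortoise)
        hare = step(step(hare))
    # Phase 2: restart the tortoise at the seed and move both one step
    # at a time. Just before they meet, they hold two distinct values
    # that map to the same output: the collision.
    tortoise = seed
    while True:
        next_t, next_h = step(tortoise), step(hare)
        if next_t == next_h:
            return tortoise, hare
        tortoise, hare = next_t, next_h
```

Calling `find_targeted_collision(b"arbitrary seed")` returns two different middle values; wrapping each in `TARGET1` and `TARGET2` gives two distinct messages with the same truncated hash. Only a handful of values are stored at any time, which is the point being made about memory.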

(b). length extension attacks

Even if all the adversary can do is create two random values that
collide, you can append substrings to the input and get collisions.
Once you find two random values hash(x) = hash(y), you could use a
length extension attack on RIPEMD-160 to find hash(x||z) = hash(y||z).
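
The extension property can be checked on a toy Merkle–Damgård hash. The Python sketch below rests on assumptions that are not in the original message: the compression function is truncated SHA-256 with a 16-bit chaining value (so a collision is cheap to find with a small table), the block size is 8 bytes, and the names `compress`, `pad`, `md_hash`, and `find_collision` are illustrative. Strictly speaking, the suffix z is appended after the padding of the original messages, which is the usual form of the attack.

```python
import hashlib
from itertools import count

BLOCK = 8   # toy block size in bytes
STATE = 2   # toy 16-bit chaining value, so collisions are cheap
IV = b"\x00" * STATE


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state||block."""
    return hashlib.sha256(state + block).digest()[:STATE]


def pad(msg_len: int) -> bytes:
    """MD strengthening: 0x80, zero fill, then the 4-byte message length."""
    zeros = (-(msg_len + 1 + 4)) % BLOCK
    return b"\x80" + b"\x00" * zeros + msg_len.to_bytes(4, "big")


def md_hash(msg: bytes) -> bytes:
    """Iterate the compression function over the padded message."""
    data = msg + pad(len(msg))
    state = IV
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def find_collision():
    """Birthday search over equal-length 4-byte messages."""
    seen = {}
    for n in count():
        msg = n.to_bytes(4, "big")
        digest = md_hash(msg)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg


x, y = find_collision()
# x and y have the same length, so they share the same padding, and the
# chaining values are equal once the padded blocks have been processed.
glue = pad(len(x))
suffix = b"any suffix z"
assert x != y and md_hash(x) == md_hash(y)
assert md_hash(x + glue + suffix) == md_hash(y + glue + suffix)
```

Because both inputs reach the same internal state, any common suffix keeps them colliding, and the length field in the final padding does not help since the extended messages have equal length.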

Now the bitcoin wiki says:
"The padding scheme is identical to MD4 using Merkle–Damgård
strengthening to prevent length extension attacks."[1]

Which is confusing to me because:

1. MD4 is vulnerable to length extension attacks
2. Merkle–Damgård strengthening does not protect against length
extension: "Indeed, we already pointed out that none of the 64
variants above can withstand the 'extension' attack on the MAC
application, even with the Merkle-Damgard strengthening" [2]
3. RIPEMD-160 is vulnerable to length extension attacks. Is Bitcoin
using a non-standard version of RIPEMD-160?

RIPEMD160(SHA256()) does not protect against length extension attacks
on SHA256, but should protect RIPEMD-160 against length extension
attacks as RIPEMD-160 uses 512-bit message blocks. That being said we
should be very careful here. Research has been done that shows that
cascading the same hash function twice is weaker than using HMAC[3]. I
can't find results on cascading RIPEMD160(SHA256()).

RIPEMD160(SHA256()) still seems better than RIPEMD160() alone, but security
should not rest on the notion that an attacker requires 2**80 memory:
many targeted collision attacks can work without much memory.

[1]: https://en.bitcoin.it/wiki/RIPEMD-160
[2]: "Merkle-Damgard Revisited: How to Construct a Hash Function"
https://www.cs.nyu.edu/~puniya/papers/merkle.pdf
[3]: https://www.cs.nyu.edu/~dodis/ps/h-of-h.pdf

On Thu, Jan 7, 2016 at 4:06 PM, Gavin Andresen via bitcoin-dev
<bitcoin-dev@lists.linuxfoundation.org> wrote:
> Maybe I'm asking this question on the wrong mailing list:
>
> Matt/Adam: do you have some reason to think that RIPEMD160 will be broken
> before SHA256?
> And do you have some reason to think that they will be so broken that the
> nested hash construction RIPEMD160(SHA256()) will be vulnerable?
>
> Adam: re: "where to stop": I'm suggesting we stop exactly at the current
> status quo, where we use RIPEMD160 for P2SH and P2PKH.
>
> Ethan: your algorithm will find two arbitrary values that collide. That
> isn't useful as an attack in the context we're talking about here (both of
> those values will be useless as coin destinations with overwhelming
> probability).
>
> Dave: you described a first preimage attack, which is 2**160 cpu time and no
> storage.
>
>
> --
> --
> Gavin Andresen
>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>



--
--
Gavin Andresen