From: Mark Friedenbach <mark@friedenbach.org>
Message-Id: <40D6F502-3380-4B64-BCD9-80D361EED35C@friedenbach.org>
Content-Type: multipart/alternative;
	boundary="Apple-Mail=_2F828574-A4BF-4E5C-8038-569FC539BB4C"
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Date: Thu, 7 Sep 2017 13:04:30 -0700
In-Reply-To: <CAMZUoKk7dHy6urnGRzAB2UG_fkwXmQrRHDfFYOHa0sTStr=yAQ@mail.gmail.com>
To: Russell O'Connor <roconnor@blockstream.io>
References: <CAMZUoKmD4v4vn9L=kdyJNk-km3XHpNVkD_tmS+SseMsf6YaVPg@mail.gmail.com>
	<F1D041D0-FC5A-425C-835D-37E7A9C0CFC5@friedenbach.org>
	<CAMZUoKmhN=m4TFwJi7u1bibLJ6mYvpnkddWTZZWdHn7+mVcJvw@mail.gmail.com>
	<EB804508-715A-4CD6-9B87-09845368DAC0@friedenbach.org>
	<CAMZUoKk7dHy6urnGRzAB2UG_fkwXmQrRHDfFYOHa0sTStr=yAQ@mail.gmail.com>
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Fast Merkle Trees
Precedence: list


--Apple-Mail=_2F828574-A4BF-4E5C-8038-569FC539BB4C
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

TL;DR I'll be updating the fast Merkle-tree spec to use a different
      IV, using (for infrastructure compatibility reasons) the scheme
      provided by Peter Todd.

This is a specific instance of a general problem where you cannot
trust scripts given to you by another party. Notice that we run into
the same sort of problem when doing key aggregation, in which you must
require the other party to prove knowledge of the discrete log before
using their public key, or else key cancellation can occur.
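
As a rough illustration of key cancellation, here is a toy Python sketch.
It is only a sketch under stated assumptions: it uses the multiplicative
group modulo the Mersenne prime 2^127 - 1 instead of secp256k1, so
"adding" public keys becomes multiplying them, and every name in it
(P, G, pubkey, aggregate, the secrets) is hypothetical.

```python
# Toy group: integers modulo a prime, with exponentiation as "key generation".
# This stands in for elliptic-curve point multiplication; it is NOT secure.
P = 2**127 - 1   # a Mersenne prime, chosen only for convenience
G = 3            # hypothetical generator for the illustration


def pubkey(secret):
    """Public key for a secret exponent in the toy group."""
    return pow(G, secret, P)


def aggregate(pubkeys):
    """Naive key aggregation: combine all public keys with the group operation."""
    result = 1
    for key in pubkeys:
        result = (result * key) % P
    return result


victim_secret = 0x1234567890ABCDEF
attacker_secret = 0x0FEDCBA987654321

victim_pub = pubkey(victim_secret)

# The attacker announces a "rogue" key: their real key combined with the
# inverse of the victim's key. They do not know its discrete log.
rogue_pub = (pubkey(attacker_secret) * pow(victim_pub, -1, P)) % P

# The victim's key cancels out, so the aggregate is controlled by the
# attacker alone.
aggregate_pub = aggregate([victim_pub, rogue_pub])
print(aggregate_pub == pubkey(attacker_secret))
```

Requiring a proof of knowledge of the discrete log blocks this, because
the attacker cannot produce one for rogue_pub.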

With script it is a little bit more complicated as you might want
zero-knowledge proofs of hash pre-images for HTLCs as well as proofs
of DL knowledge (signatures), but the basic idea is the same. Multi-
party wallet level protocols for jointly constructing scriptPubKeys
should require a 'delinearization' step that proves knowledge of
information necessary to complete each part of the script, as part of
proving the safety of a construct.

I think my earlier hangup in understanding the attack you describe was
in turning it into a practical attack that actually escalates the
attacker's capabilities. If the attacker can get you to agree to a
MAST policy that is nothing more than a CHECKSIG over a key they
presumably control, then they don't need to do any complicated
grinding. The attacker in that scenario would simply specify a
key they control and take the funds that way.

Where this presumably leads to an actual exploit is when you specify a
script that a curious counter-party actually takes the time to
investigate and believes to be secure. For example, a script that
requires a signature or pre-image revelation from that counter-party.
That would require grinding not a few bytes, but at minimum 20-33
bytes for either a HASH160 image or the counter-party's key.

If I understand the revised attack description correctly, then there
is a small window in which the attacker can create a script of at most
55 bytes in length, where nearly all of the first 32 bytes are
selected by the attacker, yet nevertheless the script seems safe to
the counter-party. The smallest such script I was able to construct
was the following:

    <fake-pubkey> CHECKSIGVERIFY HASH160 <preimage> EQUAL

This is 56 bytes and requires only 7 bits of grinding in the fake
pubkey. But 56 bytes is too large. Switching to secp256k1 serialized
32-byte pubkeys (in a script version upgrade, for example) would
reduce this to the necessary 55 bytes with 0 bits of grinding. A
smaller variant is possible:

    DUP HASH160 <fake-pubkey-hash> EQUALVERIFY CHECKSIGVERIFY HASH160 <preimage> EQUAL

This is 46 bytes, but requires grinding 96 bits, which is a bit less
plausible.
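
To give a feel for what "bits of grinding" costs, here is a minimal
Python sketch. It is an illustration under assumptions, not the attack
itself: it uses ordinary padded SHA-256 from hashlib as a stand-in for
fast-SHA256, the "malign" script is a placeholder byte string, and the
function names are hypothetical. Matching n chosen bits takes about 2^n
attempts on average.

```python
import hashlib


def expected_tries(bits):
    """Average number of attempts needed to match `bits` chosen bits."""
    return 2 ** bits


def grind(fixed, prefix, max_tries=1 << 24):
    """Try counter values as a 32-byte nonce r until SHA256(fixed || r)
    starts with `prefix`. Returns (nonce, tries) or (None, max_tries)."""
    for counter in range(max_tries):
        nonce = counter.to_bytes(32, "big")
        if hashlib.sha256(fixed + nonce).digest().startswith(prefix):
            return nonce, counter + 1
    return None, max_tries


# Placeholder for double-SHA256(malign); the script bytes are made up.
malign_digest = hashlib.sha256(
    hashlib.sha256(b"hypothetical malign script").digest()
).digest()

# Grind a single byte: 0x21 is the opcode that pushes 33 bytes.
nonce, tries = grind(malign_digest, b"\x21")
print("found after", tries, "tries; expected about", expected_tries(8))

# The two figures quoted above, for comparison.
print("7 bits :", expected_tries(7), "attempts on average")
print("96 bits:", expected_tries(96), "attempts on average")
```

Seven bits is a few hundred hash evaluations at most, while 96 bits is
on the order of 10^28, which is why the second variant is the less
plausible one.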

Belts and suspenders are not so terrible together, however, and I
think there is enough of a justification here to look into modifying
the scheme to use a different IV for hash tree updates. This would
prevent even the above implausible attacks.
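
To illustrate why a different IV helps, the following Python sketch
re-implements the SHA-256 compression function and checks the identity
from the quoted message below: for data of at most 55 bytes,
double-SHA256(data) equals two unpadded compressions over the padded
block. It then repeats the computation with another IV. This is a sketch
under assumptions: ALT_IV is a made-up value and is NOT the scheme
provided by Peter Todd, and the "script" is a placeholder byte string.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF


def first_primes(n):
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def icbrt(n):
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# Standard SHA-256 constants, derived rather than typed in: fractional bits
# of the cube roots (K) and square roots (IV) of the first primes.
K = [icbrt(p << 96) & MASK for p in first_primes(64)]
STD_IV = [isqrt(p << 64) & MASK for p in first_primes(8)]


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def fast_sha256(block, iv=STD_IV):
    """One SHA-256 compression of a 64-byte block, with no padding added."""
    assert len(block) == 64
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = iv
    for i in range(64):
        t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    state = [(x + y) & MASK for x, y in zip(iv, (a, b, c, d, e, f, g, h))]
    return b"".join(x.to_bytes(4, "big") for x in state)


def pad_block(data):
    """SHA-256 padding of a message of at most 55 bytes into one block."""
    assert len(data) <= 55
    return (data + b"\x80" + b"\x00" * (55 - len(data))
            + (8 * len(data)).to_bytes(8, "big"))


def sha256d(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


# Padding that SHA-256 appends to a 32-byte message (the 0x8000...100 value).
PAD32 = b"\x80" + b"\x00" * 29 + b"\x01\x00"

innocuous = bytes(range(40))            # placeholder for a 40-byte script
block = pad_block(innocuous)
left, right = block[:32], block[32:]    # these look like two tree nodes

# With the standard IV, the two-level "tree" hash is double-SHA256.
std_root = fast_sha256(fast_sha256(left + right) + PAD32)

# Hypothetical alternative IV, for illustration only.
seed = hashlib.sha256(b"hypothetical tree IV").digest()
ALT_IV = [int.from_bytes(seed[i:i + 4], "big") for i in range(0, 32, 4)]
alt_root = fast_sha256(fast_sha256(left + right, ALT_IV) + PAD32, ALT_IV)

print("standard IV matches double-SHA256:", std_root == sha256d(innocuous))
print("alternative IV matches double-SHA256:", alt_root == sha256d(innocuous))
```

With the standard IV the first 32 bytes of the script double as an
interior node, which is the ambiguity the attack relies on; with a
separate IV for tree updates the two computations no longer coincide.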


> On Sep 7, 2017, at 11:55 AM, Russell O'Connor <roconnor@blockstream.io> wrote:
>
> On Thu, Sep 7, 2017 at 1:42 PM, Mark Friedenbach <mark@friedenbach.org> wrote:
> > I've been puzzling over your email since receiving it. I'm not sure it
> > is possible to perform the attack you describe with the tree structure
> > specified in the BIP. If I may rephrase your attack, I believe you are
> > seeking a solution to the following:
> >
> > Want: An innocuous script and a malign script for which
> >
> >    double-SHA256(innocuous)
> >
> > is equal to either
> >
> >    fast-SHA256(double-SHA256(malign) || r) or
> >    fast-SHA256(r || double-SHA256(malign))
>
> or  fast-SHA256(fast-SHA256(double-SHA256(malign) || r1) || r0)
> or  fast-SHA256(fast-SHA256(r1 || double-SHA256(malign)) || r0)
> or ...
>
> > where r is a freely chosen 32-byte nonce. This would allow the
> > attacker to reveal the innocuous script before funds are sent to the
> > MAST, then use the malign script to spend.
> >
> > Because of the double-SHA256 construction I do not see how this can be
> > accomplished without a full break of SHA256.
>
> The particular scenario I'm imagining is a collision between
>
>     double-SHA256(innocuous)
>
> and
>
>     fast-SHA256(fast-SHA256(fast-SHA256(double-SHA256(malign) || r2) || r1) || r0).
>
> where innocuous is a Bitcoin Script that is between 32 and 55 bytes long.
>
> Observe that when data is less than 55 bytes then double-SHA256(data) =
> fast-SHA256(fast-SHA256(padding-SHA256(data)) || 0x8000...100)
> (which is really the crux of the matter).
>
> Therefore, to get our collision it suffices to find a collision between
>
>     padding-SHA256(innocuous)
>
> and
>
>     fast-SHA256(double-SHA256(malign) || r2) || r1
>
> r1 can freely be set to the second half of padding-SHA256(innocuous),
> so it suffices to find a collision between
>
>    fast-SHA256(double-SHA256(malign) || r2)
>
> and the first half of padding-SHA256(innocuous) which is equal to the
> first 32 bytes of innocuous.
>
> Imagine the first opcode of innocuous is the push of a value that the
> attacker claims to be his 33-byte public key.
> So long as the attacker doesn't need to prove that they know the
> discrete log of this pubkey, they can grind r2 until the result of
> fast-SHA256(double-SHA256(malign) || r2) contains the correct first
> couple of bytes for the script header and the opcode for a 33-byte push.
> I believe that is only about 3 or 4 bytes that they need to grind out.
>


--Apple-Mail=_2F828574-A4BF-4E5C-8038-569FC539BB4C--