From: Mark Friedenbach <mark@friedenbach.org>
Message-Id: <EB804508-715A-4CD6-9B87-09845368DAC0@friedenbach.org>
Content-Type: multipart/alternative;
	boundary="Apple-Mail=_4984A445-CC1B-4C86-8669-BADE0F8960A3"
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Date: Thu, 7 Sep 2017 10:42:13 -0700
In-Reply-To: <CAMZUoKmhN=m4TFwJi7u1bibLJ6mYvpnkddWTZZWdHn7+mVcJvw@mail.gmail.com>
To: Russell O'Connor <roconnor@blockstream.io>
References: <CAMZUoKmD4v4vn9L=kdyJNk-km3XHpNVkD_tmS+SseMsf6YaVPg@mail.gmail.com>
	<F1D041D0-FC5A-425C-835D-37E7A9C0CFC5@friedenbach.org>
	<CAMZUoKmhN=m4TFwJi7u1bibLJ6mYvpnkddWTZZWdHn7+mVcJvw@mail.gmail.com>
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Fast Merkle Trees
Precedence: list


--Apple-Mail=_4984A445-CC1B-4C86-8669-BADE0F8960A3
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

I've been puzzling over your email since receiving it. I'm not sure it
is possible to perform the attack you describe with the tree structure
specified in the BIP. If I may rephrase your attack, I believe you are
seeking a solution to the following:

Want: An innocuous script and a malign script for which

   double-SHA256(innocuous)

is equal to either

   fast-SHA256(double-SHA256(malign) || r) or
   fast-SHA256(r || double-SHA256(malign))

where r is a freely chosen 32-byte nonce. This would allow the
attacker to reveal the innocuous script before funds are sent to the
MAST, then use the malign script to spend.

Because of the double-SHA256 construction I do not see how this can be
accomplished without a full break of SHA256. The trick of setting r
equal to the padding only works when a single SHA256 is used for leaf
values. This is why double-SHA256 is specified in the BIP, and I will
edit the text to make that more clear.

Which brings us to the point that I think your original request of
separating the hash function of leaves from internal nodes is already
in the specification. I misunderstood your request at first to be that
MERKLEBRANCHVERIFY should itself perform this hash, which I objected
to as it closes off certain use cases such as chained verification of
proofs. But it is explicitly the case that leaf values and internal
updates are calculated with different hash functions.
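
To illustrate the chained-verification use case, here is a minimal
Python sketch. It is not the BIP's reference code: leaf_hash,
node_hash, branch_root and verify_branch are names made up for this
example, and node_hash uses ordinary SHA-256 of the concatenation as a
stand-in for fast-SHA256. The only point is that the verifier takes an
already-hashed 32-byte value, so the root of one tree can be proven as
an element of another.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    """Double-SHA256, the leaf hash named in this thread."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """STAND-IN for fast-SHA256 (assumption): plain SHA-256 of left || right."""
    return hashlib.sha256(left + right).digest()


def branch_root(value: bytes, path) -> bytes:
    """Fold a 32-byte value up a path of (sibling, sibling_is_left) pairs.

    The value is used as given; it is NOT hashed again here.
    """
    node = value
    for sibling, sibling_is_left in path:
        if sibling_is_left:
            node = node_hash(sibling, node)
        else:
            node = node_hash(node, sibling)
    return node


def verify_branch(root: bytes, value: bytes, path) -> bool:
    """Hypothetical analogue of a branch check: does value hash up to root?"""
    return branch_root(value, path) == root


# Inner tree over two scripts, outer tree holding the inner root and a third.
a, b, c = leaf_hash(b"script a"), leaf_hash(b"script b"), leaf_hash(b"script c")
inner_root = node_hash(a, b)
outer_root = node_hash(inner_root, c)

# Chained verification: first the inner root within the outer tree,
# then script a within the inner tree.
step_one = verify_branch(outer_root, inner_root, [(c, False)])
step_two = verify_branch(inner_root, a, [(b, False)])
```

Because the verifier never hashes its input, the two short proofs
above check the same thing as one longer proof for script a against
outer_root with the path [(b, False), (c, False)].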

I'm not intrinsically opposed to using a different IV for fast-SHA256 so
as to remove the incompatibility with single-SHA256 as the leaf hash
function, if that is the consensus of the community. It just adds
complication to implementations and so I want to make sure that
complication is well justified.
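
To make the padding trick and the effect of an IV change concrete,
here is a small Python sketch. It assumes fast-SHA256 means a single
run of the SHA-256 compression function over the 64-byte concatenation
with no padding, which is how I read it in this thread. The names
compress, fast_sha256, PAD32 and ALT_IV are made up for the example,
and the tag string behind ALT_IV is a placeholder, not a proposal.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(count):
    """First `count` primes by trial division."""
    found, candidate = [], 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def _icbrt(n):
    """Integer cube root: the largest r with r**3 <= n."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# FIPS 180-4 constants, derived from the square and cube roots of the
# first primes instead of being typed out by hand.
STANDARD_IV = tuple(math.isqrt(p << 64) & MASK for p in _primes(8))
K = tuple(_icbrt(p << 96) & MASK for p in _primes(64))


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """One SHA-256 compression step over a 64-byte block."""
    assert len(block) == 64
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def fast_sha256(left, right, iv=STANDARD_IV):
    """ASSUMED fast-SHA256: compress left || right once, no padding."""
    return struct.pack(">8I", *compress(iv, left + right))


# What SHA-256 appends to a 32-byte message: 0x80, zeros, then the
# bit length (256 = 0x100) as a 64-bit big-endian integer.
PAD32 = b"\x80" + b"\x00" * 29 + b"\x01\x00"

# Hypothetical alternative IV: the SHA-256 of a fixed placeholder tag.
ALT_IV = struct.unpack(">8I", hashlib.sha256(b"example tag").digest())

x = hashlib.sha256(b"example script").digest()  # any 32-byte value
same_with_standard_iv = fast_sha256(x, PAD32) == hashlib.sha256(x).digest()
same_with_alt_iv = fast_sha256(x, PAD32, ALT_IV) == hashlib.sha256(x).digest()
```

With the standard IV, fast_sha256(x, PAD32) is the single SHA256 of x
for any 32-byte x, which is the padding trick. For a double-SHA256
leaf the same identity holds with x set to the single SHA256 of the
script; whether that can be exploited is the question under discussion
here. With a different IV the identity no longer holds.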

Sincerely,
Mark Friedenbach

> On Sep 7, 2017, at 8:43 AM, Russell O'Connor <roconnor@blockstream.io>
> wrote:
>
> In that case, you may as well remove all references to leaves and
> double SHA-256 from your BIP since your design has no method for
> distinguishing between internal nodes and leaves.
>
> I think that if this design stands, it will play a role in some future
> CVEs.  The BIP itself is too abstract about its data contents to
> specifically say that it has a vulnerability; however, I believe it is
> inviting vulnerabilities.
> For example, I might agree with a counterparty to a design of some
> sort of smart contract in the form of a MAST.  My counterparty has
> shown me all the "leaves" of our MAST and I can verify its Merkle root
> computation.
> After being deployed, I found out that one of the leaves wasn't really
> a leaf but is instead a specially crafted "script" with a fake pubkey
> chosen by my counterparty so that this leaf can also be interpreted as
> a fake internal node (i.e. an internal node with a right branch of
> 0x8000...100).
> Because the Fast Merkle Tree design doesn't distinguish between leaves
> and internal nodes, my counterparty gets away with building an
> Inclusion Proof through this "leaf" to reveal the evil code that they
> had designed into the MAST at a deeper level.
>
> Turns out my counterparty was grinding their evil code to produce an
> internal node that can also be parsed as an innocent script.  They
> used their "pubkey" to absorb excess random data from their grinding
> that they cannot eliminate.
> (The counterparty doesn't actually know the discrete log of this
> "pubkey", they just claimed it was their pubkey and I believed them).
>
>
> Having ambiguity about whether a node is a leaf or an internal node is
> a security risk. Furthermore, changing the design so that internal
> nodes and leaves are distinguishable still allows chained invocations.
> Arbitrary data can be stored in Fast Merkle Tree leaves, including the
> Merkle root of another Fast Merkle Tree.
> Applications that are limited to proofs with paths no longer than 32
> branches can still circumvent this limit by staging these Fast Merkle
> Trees in explicit layers (as opposed to the implicit layers with the
> current design).
>
> By storing an inner Fast Merkle Tree root inside the (explicit) leaf
> of an outer Fast Merkle Tree, the application can verify an Inclusion
> Proof of the inner Fast Merkle Tree Root in the outer Fast Merkle Tree
> Root, and then verify a second Inclusion Proof of the desired data in
> the inner Fast Merkle Tree Root.  The application will need to tag
> their data to distinguish between inner Fast Merkle Tree Roots and
> other application data, but that is just part of the general
> expectation that applications not store ambiguous data inside the
> leaves of Fast Merkle Trees.
>
>
> On Wed, Sep 6, 2017 at 10:20 PM, Mark Friedenbach
> <mark@friedenbach.org> wrote:
> This design purposefully does not distinguish leaf nodes from internal
> nodes. That way chained invocations can be used to validate paths
> longer than 32 branches. Do you see a vulnerability due to this lack
> of distinction?
>
> On Sep 6, 2017, at 6:59 PM, Russell O'Connor <roconnor@blockstream.io>
> wrote:
>
>> The fast hash for internal nodes needs to use an IV that is not the
>> standard SHA-256 IV. Instead it needs to use some other fixed value,
>> which should itself be the SHA-256 hash of some fixed string (e.g. the
>> string "BIP ???" or "Fast SHA-256").
>>
>> As it stands, I believe someone can claim a leaf node as an internal
>> node by creating a proof that provides a phony right-hand branch
>> claiming to have hash 0x80000..0000100 (which is really the padding
>> value for the second half of a double SHA-256 hash).
>>
>> (I was schooled by Peter Todd on a similar issue in the past.)
>>
>> On Wed, Sep 6, 2017 at 8:38 PM, Mark Friedenbach via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org> wrote:
>> Fast Merkle Trees
>> BIP: https://gist.github.com/maaku/41b0054de0731321d23e9da90ba4ee0a
>> Code: https://github.com/maaku/bitcoin/tree/fast-merkle-tree
>


--Apple-Mail=_4984A445-CC1B-4C86-8669-BADE0F8960A3--