Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 69A438EE for ; Mon, 29 May 2017 14:55:59 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-vk0-f44.google.com (mail-vk0-f44.google.com [209.85.213.44]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id CBB80CF for ; Mon, 29 May 2017 14:55:58 +0000 (UTC) Received: by mail-vk0-f44.google.com with SMTP id y190so32945976vkc.1 for ; Mon, 29 May 2017 07:55:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blockstream-io.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=/VjhF9F+TPyqEnKErWvTEW6LdJyARXMhIk8092aCpIM=; b=EogtRGEdnpVeRw/cbVWpORSfGlyb1h/GuZHpjwfRUl2tw8l6n4ZAeKtJAHJ/aMBpH+ 1s6yzk6wWZnnCXxnfy9wf2ykEasWPaaeVkDhPBJyJRO8Xdd6Ldf+30Cc0Qy+CmZQ7jKz ajgr+M87+3OWQmgaRSNR40xHb6BovNPpU4DRPvT8x91uk9uSi1xZB5xTH4uJyVXhJdWR ZgOXjVwMEdnk+JDizznYlVLf4V4N4EylukIriSlMDfPTD49ccP3li3BR6wMBP87BAFjc /F2wAgErY7W7OFSYiFGpT9J/IcO+yi6u7RfJJc9DcrWL5wGAc2PtZJSl/b+34sBVRJi/ F71g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=/VjhF9F+TPyqEnKErWvTEW6LdJyARXMhIk8092aCpIM=; b=DO1kggfgUAX1WAdf4muvJ39s8JsCtJcS//wySh7EsocY+l0YcrPJnZAfxbB0BXJvDT aLzcgbclodPzceo3BI3tRZUMYJJeXCywbDfJ26yh0DhPstCqUDLglkkUxoOIs41HI3/B ghYUCx2ETtMzQp1iYpfSJ7oPLxQ9HUcQVi3mYll1kQ8DWirJR98g+Fgsyw7vHzQqoDwC C7fTzSi8k+IA4y7/zeFF4+YnWYWLGhIpDKq5IV7yL4Dx97T03ZGAX2TBTOnPlgwdKABh gACcBsRT2URysXN8juvXBkyX6xof+Nz174MWU2CFSJftVVDy0+q8EqapHWimBHnPEXQ9 Kv3g== X-Gm-Message-State: AODbwcChzKmY9hooonf2RJJsaYzOB8cpV9vZyNPFfg7vFQpgKIK47Z1t d+rZ2bblixprh1fAopsmwe143NOpM/+SJcLdiw== X-Received: by 10.31.63.140 with SMTP id m134mr7654532vka.8.1496069757914; Mon, 29 May 2017 07:55:57 -0700 (PDT) MIME-Version: 1.0 Received: by 10.176.95.90 with HTTP; Mon, 29 May 2017 07:55:37 -0700 (PDT) In-Reply-To: <20170528082624.GA14552@fedora-23-dvm> References: <20170528082624.GA14552@fedora-23-dvm> From: "Russell O'Connor" Date: Mon, 29 May 2017 10:55:37 -0400 Message-ID: To: Peter Todd Content-Type: multipart/alternative; boundary="001a114c9a8e7af3b20550aae2e0" X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HTML_MESSAGE, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: Bitcoin Protocol Discussion Subject: Re: [bitcoin-dev] A Method for Computing Merkle Roots of Annotated Binary Trees X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 May 2017 14:55:59 -0000 --001a114c9a8e7af3b20550aae2e0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sun, May 28, 2017 at 4:26 AM, Peter Todd wrote: > On Mon, May 22, 2017 at 03:05:49AM -0400, Russell O'Connor via bitcoin-de= v > wrote: > > Not all of the inputs to the SHA256 compression function are created > > equal. Only the second argument, the chunk data, is applied to the > SHA256 > > expander. `merkleRoot` is designed to ensure that the first argument o= f > > the SHA256 compression function is only fed some output of the SHA256 > > compression function. In fact, we can prove that the output of the > > `merkleRoot` function is always the midstate of some SHA256 hash. To s= ee > > this, let us explicitly separate the `sha256` function into the padding > > step, `sha256Pad`, and the recursive hashing step, `unpaddedSha256`. > > This doesn't hold true in the case of pruned trees, as for the pruning to > be > useful, you don't know what produced the left merkleRoot, and thus you > can't > guarantee it is in fact a midstate of a genuine SHA256 hash. > Thanks for the review Peter. This does seem like a serious issue that I hadn't considered yet. As far as I understand, we have no reason to think that the SHA-256 compression function will be secure with chosen initial values. Some of this proposal can be salvaged, I think, by putting the hash of the tags into Sha256Compress's first argument: merkleRoot : Tree BitString -> Word256 merkleRoot (Leaf (t)) :=3D sha256Compress(sha256(t), 0b0^512) merkleRoot (Unary (t, child)) :=3D sha256Compress(sha256(t), merkleRoot(child) =E2=8B=85 0b0^256) merkleRoot (Binary (t, left, right)) :=3D sha256Compress(sha256(t), merkleRoot(left) =E2=8B=85 merkleRoot(right)) The Merkle--Damg=C3=A5rd property will still hold under the same conditions about tags determining the type of node (Leaf, Unary, or Binary) while avoiding the need for SHA256's padding. If you have a small number of tags, then you can pre-compute their hashes so that you only end up with one call to SHA256 compress per node. If you have tags with a large amount of data, you were going to be hashing them anyways. Unfortunately we lose the ability to cleverly avoid collisions between the Sha256 and MerkleRoot functions by using safe tags. --001a114c9a8e7af3b20550aae2e0 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Sun, May 28, 2017 at 4:26 AM, Peter Todd <pete@petertodd.org&g= t; wrote:
On Mon, May 22, 2017 at 03:05:49AM -0400, Russell O'C= onnor via bitcoin-dev wrote:
> Not all of the inputs to the SHA256 comp= ression function are created
> equal.=C2=A0 Only the second argument, the chunk data, is applied to t= he SHA256
> expander.=C2=A0 `merkleRoot` is designed to ensure that the first argu= ment of
> the SHA256 compression function is only fed some output of the SHA256<= br> > compression function.=C2=A0 In fact, we can prove that the output of t= he
> `merkleRoot` function is always the midstate of some SHA256 hash.=C2= =A0 To see
> this, let us explicitly separate the `sha256` function into the paddin= g
> step, `sha256Pad`, and the recursive hashing step, `unpaddedSha256`.
This doesn't hold true in the case of pruned trees, as for the p= runing to be
useful, you don't know what produced the left merkleRoot, and thus you = can't
guarantee it is in fact a midstate of a genuine SHA256 hash.

Thanks for the review Peter.=C2=A0 This does seem lik= e a serious issue that I hadn't considered yet.=C2=A0 As far as I under= stand, we have no reason to think that the SHA-256 compression function wil= l be secure with chosen initial values.

Some of this prop= osal can be salvaged, I think, by putting the hash of the tags into Sha256C= ompress's first argument:

=C2=A0=C2=A0=C2=A0 merkleRoot : Tree BitString -> Word256
= =C2=A0=C2=A0=C2=A0 merkleRoot (Leaf (t))=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 :=3D sha256Compre= ss(sha256(t), 0b0^512)
=C2=A0=C2=A0=C2=A0 merkleRoot (Unary (t, child))= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 :=3D sha256Compress(
sha256(t), merkleRoot(child) =E2=8B=85 0b0^256)
=C2=A0=C2=A0=C2=A0 merkleRoot (Binary (t= , left, right)) :=3D sha256Compress(
sha256(t), merkleRoot(le= ft) =E2=8B=85 merkleRoot(right))

The Merkle--Damg=C3=A5rd property will still hold under t= he same conditions about tags determining the type of node (Leaf, Unary, or= Binary) while avoiding the need for SHA256's padding.=C2=A0 If you hav= e a small number of tags, then you can pre-compute their hashes so that you= only end up with one call to SHA256 compress per node. If you have tags wi= th a large amount of data, you were going to be hashing them anyways.
Unfortunately we lose the ability to cleverly avoid collisions= between the Sha256 and MerkleRoot functions by using safe tags.
<= /div>
--001a114c9a8e7af3b20550aae2e0--