summaryrefslogtreecommitdiff
path: root/3e/102e9a5f2785ecacea984c19d50d68fe0fdcb5
blob: f9f716a743842450db5496f8128676f86eacc628 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
Return-Path: <pete@petertodd.org>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 1DCCCB8A
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon, 29 May 2017 16:11:11 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from outmail148108.authsmtp.net (outmail148108.authsmtp.net
	[62.13.148.108])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 4440A17D
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon, 29 May 2017 16:11:09 +0000 (UTC)
Received: from mail-c247.authsmtp.com (mail-c247.authsmtp.com [62.13.128.247])
	by punt24.authsmtp.com (8.14.2/8.14.2/) with ESMTP id v4TGB7su011653;
	Mon, 29 May 2017 17:11:07 +0100 (BST)
Received: from petertodd.org (ec2-52-5-185-120.compute-1.amazonaws.com
	[52.5.185.120]) (authenticated bits=0)
	by mail.authsmtp.com (8.14.2/8.14.2/) with ESMTP id v4TGB568084189
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 29 May 2017 17:11:06 +0100 (BST)
Received: from [127.0.0.1] (localhost [127.0.0.1])
	by petertodd.org (Postfix) with ESMTPSA id 494274013E;
	Mon, 29 May 2017 16:11:05 +0000 (UTC)
Received: by localhost (Postfix, from userid 1000)
	id 6F6F620421; Mon, 29 May 2017 12:10:59 -0400 (EDT)
Date: Mon, 29 May 2017 12:10:59 -0400
From: Peter Todd <pete@petertodd.org>
To: "Russell O'Connor" <roconnor@blockstream.io>
Message-ID: <20170529161059.GA7698@fedora-23-dvm>
References: <CAMZUoK=f3hXHkqJBDfiLGSrgXi_ppgyH6+XWD9W54EYFWLm1+Q@mail.gmail.com>
	<20170528082624.GA14552@fedora-23-dvm>
	<CAMZUoK=8xaVp2Qoc7kvx8FdPbpY0rEpSba8kQVRQGjX4p0haxg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="opJtzjQTFsWo+cga"
Content-Disposition: inline
In-Reply-To: <CAMZUoK=8xaVp2Qoc7kvx8FdPbpY0rEpSba8kQVRQGjX4p0haxg@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Server-Quench: 6c191ab7-4489-11e7-bcdf-0015176ca198
X-AuthReport-Spam: If SPAM / abuse - report it at:
	http://www.authsmtp.com/abuse
X-AuthRoute: OCd2Yg0TA1ZNQRgX IjsJECJaVQIpKltL GxAVKBZePFsRUQkR
	aAdMdAsUFVQNAgsB AmEbWlNeUVx7WWQ7 bghPaBtcak9QXgdq
	T0pMXVMcUgEMdm4E ABYeWh1zfwwIeXxw YE4sWCNaVUUrJ0Fg
	RkpcHHAHZDJndTJM BBZFdwNVdQJNeEwU a1l3GhFYa3VsNCMk
	FAgyOXU9MCtqYA0d ZwwRLVUeCUgMBHYX QBUaACkuG0JNYigp
	LBgrYmQbG1oKekI8 eXInX1UEOgMfBm8W NUBLCTVIb2UbSicw ZQAA
X-Authentic-SMTP: 61633532353630.1038:706
X-AuthFastPath: 0 (Was 255)
X-AuthSMTP-Origin: 52.5.185.120/25
X-AuthVirus-Status: No virus detected - but ensure you scan with your own
	anti-virus system.
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW
	autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] A Method for Computing Merkle Roots of Annotated
 Binary Trees
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Mon, 29 May 2017 16:11:11 -0000


--opJtzjQTFsWo+cga
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, May 29, 2017 at 10:55:37AM -0400, Russell O'Connor wrote:
> > This doesn't hold true in the case of pruned trees, as for the pruning =
to
> > be
> > useful, you don't know what produced the left merkleRoot, and thus you
> > can't
> > guarantee it is in fact a midstate of a genuine SHA256 hash.
> >
>=20
> Thanks for the review Peter.  This does seem like a serious issue that I
> hadn't considered yet.  As far as I understand, we have no reason to think
> that the SHA-256 compression function will be secure with chosen initial
> values.

Well, it's an easy thing to forget, unless like me, you've written a general
purpose library to work with pruned data. :) (actually, soon to be two of
them!)

I also ran into the midstate issue with my OpenTimestamps protocol, as I was
looking into whether or not it was safe to have a SHA256 midstate commitment
operation, and couldn't find clear evidence that it was.

> Some of this proposal can be salvaged, I think, by putting the hash of the
> tags into Sha256Compress's first argument:
>=20
>     merkleRoot : Tree BitString -> Word256
>     merkleRoot (Leaf (t))                :=3D sha256Compress(sha256(t),
> 0b0^512)
>     merkleRoot (Unary (t, child))        :=3D sha256Compress(sha256(t),
> merkleRoot(child) =E2=8B=85 0b0^256)
>     merkleRoot (Binary (t, left, right)) :=3D sha256Compress(sha256(t),
> merkleRoot(left) =E2=8B=85 merkleRoot(right))
>=20
> The Merkle--Damg=C3=A5rd property will still hold under the same conditio=
ns
> about tags determining the type of node (Leaf, Unary, or Binary) while
> avoiding the need for SHA256's padding.  If you have a small number of
> tags, then you can pre-compute their hashes so that you only end up with
> one call to SHA256 compress per node. If you have tags with a large amount
> of data, you were going to be hashing them anyways.

Notice how what you're proposing here is almost the same thing as using SHA=
256
directly, modulo the fact that you skip the final block containing the mess=
age
length.

Similarly, you don't need to compute sha256(t) - you can just as easily com=
pute
the midstate sha256Compress(IV, t), and cache that midstate if you can reuse
tags. Again, the only difference is the last block.

> Unfortunately we lose the ability to cleverly avoid collisions between the
> Sha256 and MerkleRoot functions by using safe tags.

I think a better question to ask is why you want that property in the first
place?

My earlier python-proofmarshal(1) library had a scheme of per-use-case tags,
but I eventually realised that depending on tags being unique is a footgun.=
 For
example, it's easy to see how two different systems could end up using the =
same
tag due to designers forgetting to create new tags while copying and pasting
old code. Similarly, if two such systems have to be integrated, you'll end =
up
with tags getting reused for two different purposes.

Now, if you design a system where that doesn't matter, then by extension it=
'll
also be true that collisions between the sha256 and merkleroot functions do=
n't
matter either. And that system will be more robust to design mistakes, as t=
ags
only need to be unique "locally" to distinguish between different sub-types=
 in
a sum type (enum).


FWIW what I've done with my newer (and as yet unpublished) rust-proofmarshal
work is commitments are only valid for a specific type. Secondly, I use
blake2b, whose compression function processes up to 128 bytes of message on
each invocation.  That's large enough for four 32 byte hashes, which is by
itself more than sufficient for a summed merkle tree with three 32 byte has=
hes
(left right and sum) and a per-node-type tag.

Blake2b's documentations don't make it clear if it's resistant to collision=
 if
the adversary can control the salt or personalization strings, so I don't
bother using them - the large block size by itself is enough to fit almost =
any
use-case into a single block, and it hashes blocks significantly faster than
SHA256. This also has the advantage that the actual primitive I'm using is =
100%
standard blake2b, an aid to debugging and development.

1) https://github.com/proofchains/python-proofmarshal

--=20
https://petertodd.org 'peter'[:-1]@petertodd.org

--opJtzjQTFsWo+cga
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----

iQEcBAEBCAAGBQJZLEgRAAoJECSBQD2l8JH7x+IIAJAOiw82V6TSCLqRfshb6YIZ
2189yypuNw8e/6L2hKpaPw0ySMObKwfbuBy4mU/qvxB1ovIl8In1Rwc8fW1RUYvg
M8egst57Xj+SIrffDxdcT5R2pDTBJRc5Mmq2ARYnoAivmK8Jfe2GkjvUM6v7lO8p
jGOqcZfSaz1B7t9HpEUUgIPaIKlUgIk9GrhjMQtN6eQQTsZofIOSTeMrFIxlQjYM
sZXbjFUeVAGo7Q8dpYXwefga3GUNT+iY1H/eR04vNfEaIN0isW9vCyKKzI5uNseC
oHbHhi9yIqHYmN88cGMC8g1j0DpFSzWDn4PEMmIOIehJbmTnuA2ITOZ/e2uIm5A=
=FRpe
-----END PGP SIGNATURE-----

--opJtzjQTFsWo+cga--