1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
|
Return-Path: <pete@petertodd.org>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id A2CE3956
for <bitcoin-dev@lists.linuxfoundation.org>;
Thu, 16 Jun 2016 00:10:48 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from outmail148114.authsmtp.net (outmail148114.authsmtp.net
[62.13.148.114])
by smtp1.linuxfoundation.org (Postfix) with ESMTP id 9BCC4FB
for <bitcoin-dev@lists.linuxfoundation.org>;
Thu, 16 Jun 2016 00:10:47 +0000 (UTC)
Received: from mail-c247.authsmtp.com (mail-c247.authsmtp.com [62.13.128.247])
by punt20.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5G0AjZf072975;
Thu, 16 Jun 2016 01:10:45 +0100 (BST)
Received: from petertodd.org (ec2-52-5-185-120.compute-1.amazonaws.com
[52.5.185.120]) (authenticated bits=0)
by mail.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5G0Afpm029478
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
Thu, 16 Jun 2016 01:10:42 +0100 (BST)
Received: from [127.0.0.1] (localhost [127.0.0.1])
by petertodd.org (Postfix) with ESMTPSA id 6653F40092;
Thu, 16 Jun 2016 00:08:45 +0000 (UTC)
Received: by localhost (Postfix, from userid 1000)
id 7A77520738; Wed, 15 Jun 2016 20:10:40 -0400 (EDT)
Date: Wed, 15 Jun 2016 20:10:40 -0400
From: Peter Todd <pete@petertodd.org>
To: Bram Cohen <bram@bittorrent.com>,
Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Message-ID: <20160616001040.GA5026@fedora-21-dvm>
References: <CA+KqGkosGrWQ2Lpg3Ptc+UyidK7H4_HLQ_QWx2DOAFRmLkv4zQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
protocol="application/pgp-signature"; boundary="opJtzjQTFsWo+cga"
Content-Disposition: inline
In-Reply-To: <CA+KqGkosGrWQ2Lpg3Ptc+UyidK7H4_HLQ_QWx2DOAFRmLkv4zQ@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Server-Quench: c4091184-3356-11e6-bcde-0015176ca198
X-AuthReport-Spam: If SPAM / abuse - report it at:
http://www.authsmtp.com/abuse
X-AuthRoute: OCd2Yg0TA1ZNQRgX IjsJECJaVQIpKltL GxAVKBZePFsRUQkR
aAdMdwQUEkAaAgsB AmAbW1VeUV17XWA7 bghPaBtcak9QXgdq
T0pMXVMcUQAfAW1X RkMeUBB2cA0IeXZ3 Y0AsDXRbVUV7fUJg
QU1RE3AHZDJmdTJM BBVFdwNVdQJNeEwU a1l3GhFYa3VsNCMk
FAgyOXU9MCtqYAFY WAIJIBoOW0sGBXY1 QRxKGDIyG1EMRiN7 NRUgJVMHdAAA
X-Authentic-SMTP: 61633532353630.1038:706
X-AuthFastPath: 0 (Was 255)
X-AuthSMTP-Origin: 52.5.185.120/25
X-AuthVirus-Status: No virus detected - but ensure you scan with your own
anti-virus system.
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW
autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Subject: Re: [bitcoin-dev] Merkle trees and mountain ranges
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Jun 2016 00:10:48 -0000
--opJtzjQTFsWo+cga
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On Tue, Jun 14, 2016 at 05:14:23PM -0700, Bram Cohen via bitcoin-dev wrote:
> This is in response to Peter Todd's proposal for Merkle Mountain Range
> commitments in blocks:
>=20
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012715.h=
tml
>=20
> I'm in strong agreement that there's a compelling need to put UTXO
> commitments in blocks, and that the big barrier to getting it done is
> performance, particularly latency. But I have strong disagreements (or
> perhaps the right word is skepticism) about the details.
>=20
> Peter proposes that there should be both UTXO and STXO commitments, and
No, that's incorrect - I'm only proposing TXO commitments, not UTXO nor STXO
commitments.
> they should be based on Merkle Mountain Ranges based on Patricia Tries. My
> first big disagreement is about the need for STXO commitments. I think
> they're unnecessary and a performance problem. The STXO set is much larger
> than the utxo set and requires much more memory and horespower to maintai=
n.
Again, I'm not proposing STXO commitments precisely because the set of _spe=
nt_
transactions grows without bound. TXO commitments with committed sums of
remaining unspent TXO's and with pruning of old history are special in this
regard, because once spent the data associated with spent transactions can =
be
discarded completely, and at the same time, data associated with old history
can be pruned with responsibility for keeping it resting on the shoulders of
those owning those coins.
> Most if not all of its functionality can in practice be done using the ut=
xo
> set. Almost anything accepting proofs of inclusion and exclusion will have
> a complete history of block headers, so to prove inclusion in the stxo set
> you can use a utxo proof of inclusion in the past and a proof of exclusion
> for the most recent block. In the case of a txo which has never been
> included at all, it's generally possible to show that an ancestor of the
> txo in question was at one point included but that an incompatible
> descendant of it (or the ancestor itself) is part of the current utxo set.
> Generating these sorts of proofs efficiently can for some applications
> require a complete STXO set, but that can done with a non-merkle set,
> getting the vastly better performance of an ordinary non-cryptographic
> hashtable.
TXO commitments allows you to do all of this without requiring miners to ha=
ve
unbounded storage to create new blocks.
> The fundamental approach to handling the latency problem is to have the
> utxo commitments trail a bit. Computing utxo commitments takes a certain
> amount of time, too much to hold up block propagation but still hopefully
> vastly less than the average amount of time between blocks. Trailing by a
> single block is probably a bad idea because you sometimes get blocks back
> to back, but you never get blocks back to back to back to back. Having the
> utxo set be trailing by a fixed amount - five blocks is probably excessive
> - would do a perfectly good job of keeping latency from every becoming an
> issue. Smaller commitments for the utxos added and removed in each block
> alone could be added without any significant performance penalty. That way
> all blocks would have sufficient commitments for a completely up to date
> proofs of inclusion and exclusion. This is not a controversial approach.
Agreed - regardless of approach adding latency to commitment calculations of
all kinds is something I think we all agree can work in principle, although
obviously it should be a last resort technique when optimization fails.
> Now I'm going to go out on a limb. My thesis is that usage of a mountain
> range is unnecessary, and that a merkle tree in the raw can be made
> serviceable by sprinkling magic pixie dust on the performance problem.
It'd help if you specified exactly what type of merkle tree you're talking
about here; remember that the certificate transparency RFC appears to have
reinvented merkle mountain ranges, and they call them "merkle trees". Bitc=
oin
meanwhile uses a so-called "merkle tree" that's broken, and Zcash uses a
partially filled fixed-sized perfect tree.
> There are two causes of performance problems for merkle trees: hashing
> operations and memory cache misses. For hashing functions, the difference
> between a mountain range and a straight merkle tree is roughly that in a
> mountain range there's one operation for each new update times the number
> of times that thing will get merged into larger hills. If there are fewer
> levels of hills the number of operations is less but the expense of proof
> of inclusion will be larger. For raw merkle trees the number of operations
> per thing added is the log base 2 of the number of levels in the tree,
> minus the log base 2 of the number of things added at once since you can =
do
> lazy evaluation. For practical Bitcoin there are (very roughly) a million
> things stored, or 20 levels, and there are (even more roughly) about a
> thousand things stored per block, so each thing forces about 20 - 10 =3D =
10
> operations. If you follow the fairly reasonable guideline of mountain ran=
ge
> hills go up by factors of four, you instead have 20/2 =3D 10 operations p=
er
> thing added amortized. Depending on details this comparison can go either
> way but it's roughly a wash and the complexity of a mountain range is
> clearly not worth it at least from the point of view of CPU costs.
I'm having a hard time understanding this paragraph; could you explain what=
you
think is happening when things are "merged into larger hills"?
> But CPU costs aren't the main performance problem in merkle trees. The
> biggest issues is cache misses, specifically l1 and l2 cache misses. These
> tend to take a long time to do, resulting in the CPU spending most of its
> time sitting around doing nothing. A naive tree implementation is pretty
> much the worst thing you can possibly build from a cache miss standpoint,
> and its performance will be completely unacceptable. Mountain ranges do a
> fabulous job of fixing this problem, because all their updates are merges
> so the metrics are more like cache misses per block instead of cache miss=
es
> per transaction.
>
> The magic pixie dust I mentioned earlier involves a bunch of subtle
> implementation details to keep cache coherence down which should get the
> number of cache misses per transaction down under one, at which point it
> probably isn't a bottleneck any more. There is an implementation in the
> works here:
As UTXO/STXO/TXO sets are all enormously larger than L1/L2 cache, it's
impossible to get CPU cache misses below one for update operations. The clo=
sest
thing to an exception is MMR's, which due to their insertion-ordering could
have good cache locality for appends, in the sense that the mountain tips
required to recalculate the MMR tip digest will already be in cache from the
previous append. But that's not sufficient, as you also need to modify old
TXO's further back in the tree to mark them as spent - that data is going t=
o be
far larger than L1/L2 cache.
> https://github.com/bramcohen/MerkleSet
>=20
> This implementation isn't finished yet! I'm almost there, and I'm
> definitely feeling time pressure now. I've spent quite a lot of time on
> this, mostly because of a bunch of technical reworkings which proved
> necessary. This is the last time I ever write a database for kicks. But
> this implementation is good on all important dimensions, including:
>=20
> Lazy root calculation
> Few l1 and l2 cache misses
> Small proofs of inclusion/exclusion
Have you looked at the pruning system that my proofchains work implements?
--=20
https://petertodd.org 'peter'[:-1]@petertodd.org
--opJtzjQTFsWo+cga
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature
-----BEGIN PGP SIGNATURE-----
iQEcBAEBCAAGBQJXYe59AAoJEGOZARBE6K+yHDkH/0F8SnYloFxwHvdHe3jtkbX3
o8PTfb1d8xmn2ZX3HMIJ4rMWL7KA2rcOpCLfx976ZpnlmngJzM58j49jtKaXT1wV
0FfNRpgL4xScDOa5abNU/4tzaXRNiQQkPGqXp/NFxmngYHHHK06NuI2l7BjrwGeZ
BZWAQRQ4Qd4PWdy1/TaUyY5Zelb4tpF0HRnsg8JH2P6MScp7rsrKePO4ehUSWEQy
6RJZlKsc7GGwTpdXxjihHpUUdTYjy/C8fgeRJZrqDz0lIMOgoZ0YyODL8N5PcWPH
7EQoblIWyhY6eyhv3PzeyeAbKHd+Zc5HgTNU+VsXxhn2ft6Nk3xVmuR2Ely41rI=
=on7S
-----END PGP SIGNATURE-----
--opJtzjQTFsWo+cga--
|