1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
|
Return-Path: <pete@petertodd.org>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
[172.17.192.35])
by mail.linuxfoundation.org (Postfix) with ESMTPS id 4006A957
for <bitcoin-dev@lists.linuxfoundation.org>;
Thu, 16 Jun 2016 03:26:20 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from outmail149082.authsmtp.co.uk (outmail149082.authsmtp.co.uk
[62.13.149.82])
by smtp1.linuxfoundation.org (Postfix) with ESMTP id F2DC4100
for <bitcoin-dev@lists.linuxfoundation.org>;
Thu, 16 Jun 2016 03:26:18 +0000 (UTC)
Received: from mail-c232.authsmtp.com (mail-c232.authsmtp.com [62.13.128.232])
by punt20.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5G3QGZ0067346;
Thu, 16 Jun 2016 04:26:16 +0100 (BST)
Received: from petertodd.org (ec2-52-5-185-120.compute-1.amazonaws.com
[52.5.185.120]) (authenticated bits=0)
by mail.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5G3QF86086084
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
Thu, 16 Jun 2016 04:26:15 +0100 (BST)
Received: from [127.0.0.1] (localhost [127.0.0.1])
by petertodd.org (Postfix) with ESMTPSA id 7313240092;
Thu, 16 Jun 2016 03:24:18 +0000 (UTC)
Received: by localhost (Postfix, from userid 1000)
id 44D8920738; Wed, 15 Jun 2016 23:26:12 -0400 (EDT)
Date: Wed, 15 Jun 2016 23:26:12 -0400
From: Peter Todd <pete@petertodd.org>
To: Bram Cohen <bram@bittorrent.com>
Message-ID: <20160616032612.GA7792@fedora-21-dvm>
References: <CA+KqGkosGrWQ2Lpg3Ptc+UyidK7H4_HLQ_QWx2DOAFRmLkv4zQ@mail.gmail.com>
<20160616001040.GA5026@fedora-21-dvm>
<CA+KqGkqAHcU2PzEX6OmorXRBQ22eF_QBYYqUDS1v_ZvevhLCuQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
protocol="application/pgp-signature"; boundary="GvXjxJ+pjyke8COw"
Content-Disposition: inline
In-Reply-To: <CA+KqGkqAHcU2PzEX6OmorXRBQ22eF_QBYYqUDS1v_ZvevhLCuQ@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Server-Quench: 159850b5-3372-11e6-829e-00151795d556
X-AuthReport-Spam: If SPAM / abuse - report it at:
http://www.authsmtp.com/abuse
X-AuthRoute: OCd2Yg0TA1ZNQRgX IjsJECJaVQIpKltL GxAVKBZePFsRUQkR
aAdMdwQUEkAaAgsB AmAbW1BeUlt7WGU7 bghPaBtcak9QXgdq
T0pMXVMcUQAfAn13 DhgeWh9yfwEIeXh4 YEYsX3VSVEF6J0Ng
QU1TF3AHZDJmdWgd WRVFdwNVdQJNdxoR b1V5GhFYa3VsNCMk
FAgyOXU9MCtqYAFY WAIJIBoOW0sGBXY1 QRxKGDIyG1EMRiN7 NRUgJVMHdAAA
X-Authentic-SMTP: 61633532353630.1037:706
X-AuthFastPath: 0 (Was 255)
X-AuthSMTP-Origin: 52.5.185.120/25
X-AuthVirus-Status: No virus detected - but ensure you scan with your own
anti-virus system.
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW
autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
smtp1.linux-foundation.org
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Merkle trees and mountain ranges
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Jun 2016 03:26:20 -0000
--GvXjxJ+pjyke8COw
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On Wed, Jun 15, 2016 at 06:16:26PM -0700, Bram Cohen wrote:
> On Wed, Jun 15, 2016 at 5:10 PM, Peter Todd <pete@petertodd.org> wrote:
>=20
> > On Tue, Jun 14, 2016 at 05:14:23PM -0700, Bram Cohen via bitcoin-dev wr=
ote:
> > >
> > > Peter proposes that there should be both UTXO and STXO commitments, a=
nd
> >
> > No, that's incorrect - I'm only proposing TXO commitments, not UTXO nor
> > STXO
> > commitments.
> >
>=20
> What do you mean by TXO commitments? If you mean that it only records
> insertions rather than deletions, then that can do many of the same proofs
> but has no way of proving that something is currently in the UTXO set,
> which is functionality I'd like to provide.
I think you need to re-read my original post on TXO commitments, specifical=
ly
where I say:
# TXO Commitments
A merkle tree committing to the state of __all transaction outputs, bot=
h spent
and unspent__, we can provide a method of compactly proving the current=
state of
an output.
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012715.html
> When I say 'merkle tree' what I mean is a patricia trie. What I assume is
> meant by a merkle mountain range is a series of patricia tries of
> decreasing size each of which is an addition to the previous one, and
> they're periodically consolidated into larger tries so the old ones can go
> away. This data structure has the nice property that it's both in sorted
> order and has less than one cache miss per operation because the
> consolidation operations can be batched and done linearly. There are a
> number of different things you could be describing if I misunderstood.
Nope, MMR's are completely unlike what you just described.
> > I'm not proposing STXO commitments precisely because the set of _spent_
> > transactions grows without bound.
>=20
>=20
> I'm worried that once there's real transaction fees everyone might stop
> consolidating dust and the set of unspent transactions might grow without
> bound as well, but that's a topic for another day.
Ok, but then if you're concerned about that risk, why introduce a data
structure - the STXO set - that's _guaranteed_ to grow without bound?
> > > Now I'm going to go out on a limb. My thesis is that usage of a mount=
ain
> > > range is unnecessary, and that a merkle tree in the raw can be made
> > > serviceable by sprinkling magic pixie dust on the performance problem.
> >
> > It'd help if you specified exactly what type of merkle tree you're talk=
ing
> > about here; remember that the certificate transparency RFC appears to h=
ave
> > reinvented merkle mountain ranges, and they call them "merkle trees".
> > Bitcoin
> > meanwhile uses a so-called "merkle tree" that's broken, and Zcash uses a
> > partially filled fixed-sized perfect tree.
> >
>=20
> What I'm making is a patricia trie. Its byte level definition is very
> similar to the one in your MMR codebase.
Which codebase exactly? I have both a insertion-ordered list (MMR) and a
key:value mapping (referred to as a "merbinner tree" in the codebase) in the
proofchains codebase. They're very different data structures.
> Each level of the tree has a single metadata byte and followed by two
> hashes. The hashes are truncated by one byte and the hash function is a
> non-padding variant of sha256 (right now it's just using regular sha256,
> but that's a nice optimization which allows everything to fit in a single
> block).
>=20
> The possible metadata values are: TERM0, TERM1, TERMBOTH, ONLY0, ONLY1,
> MIDDLE. They mean:
>=20
> TERM0, TERM1: There is a single thing in the tree on the specified side.
> The thing hashed on that side is that thing verbatim. The other side has
> more than one thing and the hash of it is the root of everything below.
>=20
> TERMBOTH: There are exactly two things below and they're included inline.
> Note that two things is a special case, if there are more you sometimes
> have ONLY0 or ONLY1.
>=20
> ONLY0, ONLY1: There are more than two things below and they're all on the
> same side. This makes proofs of inclusion and exclusion simpler, and makes
> some implementation details easier, for example there's always something =
at
> every level with perfect memory positioning. It doesn't cause much extra
> memory usage because of the TERMBOTH exception for exactly two things.
>=20
> MIDDLE: There two or more things on both sides.
>=20
> There's also a SINGLETON special case for a tree which contains only one
> thing, and an EMPTY special value for tree which doesn't contain anything.
>=20
> The main differences to your patricia trie are the non-padding sha256 and
> that each level doesn't hash in a record of its depth and the usage of
> ONLY0 and ONLY1.
I'm rather confused, as the above sounds nothing like what I've implemented,
which only has leaf nodes, inner nodes, and the special empty node singleto=
n,
for both the MMR and merbinner trees.
> > I'm having a hard time understanding this paragraph; could you explain
> > what you
> > think is happening when things are "merged into larger hills"?
> >
>=20
> I'm talking about the recalculation of mountain tips, assuming we're on t=
he
> same page about what 'MMR' means.
Yeah, we're definitely not...
In MMR's append operations never need to modify mountain contents.
> > As UTXO/STXO/TXO sets are all enormously larger than L1/L2 cache, it's
> > impossible to get CPU cache misses below one for update operations. The
> > closest
> > thing to an exception is MMR's, which due to their insertion-ordering c=
ould
> > have good cache locality for appends, in the sense that the mountain ti=
ps
> > required to recalculate the MMR tip digest will already be in cache from
> > the
> > previous append. But that's not sufficient, as you also need to modify =
old
> > TXO's further back in the tree to mark them as spent - that data is goi=
ng
> > to be
> > far larger than L1/L2 cache.
> >
>=20
> This makes me think we're talking about subtly different things for MMRs.
> The ones I described above have sub-1 cache miss per update due to the
> amortized merging together of old mountains.
Again, see above.
> Technically even a patricia trie utxo commitment can have sub-1 cache
> misses per update if some of the updates in a single block are close to
> each other in memory. I think I can get practical Bitcoin updates down to=
a
> little bit less than one l2 cache miss per update, but not a lot less.
I'm very confused as to why you think that's possible. When you say "practi=
cal
Bitcoin updates", what exactly is the data structure you're proposing to
update? How is it indexed?
--=20
https://petertodd.org 'peter'[:-1]@petertodd.org
--GvXjxJ+pjyke8COw
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature
-----BEGIN PGP SIGNATURE-----
iQEcBAEBCAAGBQJXYhxRAAoJEGOZARBE6K+y/TgH/id+TvgsceX/8+w+sneS0nfq
K/l9zNcizl6UwLwKYYih7iHQ3zq/9xS9/caI1W7RvGv1kbJtbce9RHSM06IfWzy3
hpgNbRge1kh5umzqxalhKx7KzcAtFHQU2gi/wBJFBrQ+zEn4FyDJov1hFOONupEE
vHRlrksXDblfAEaRBovzKfIYQlf8j3vgHg/UPPIP95cUaC5rRimrBKQ7jbC5676X
LwxFsZGeITp0jmPasgiTPq931YbCSzB7UYHDC8ppffTfL2B9k9zmfGGxnIDXOufB
yKelvGYr78oIAKVQVUlihrAmJJ01TwExRERfUYntSv+3/gSDjVOw4nH6i0cIt+M=
=80le
-----END PGP SIGNATURE-----
--GvXjxJ+pjyke8COw--
|