3f/4604fef2a53a84e7c3c10aab47474ad070d17b


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485

Return-Path: <bram@bittorrent.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 46950B7C
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 24 Feb 2017 22:20:22 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-io0-f174.google.com (mail-io0-f174.google.com
	[209.85.223.174])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id C098E18C
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 24 Feb 2017 22:20:20 +0000 (UTC)
Received: by mail-io0-f174.google.com with SMTP id l66so2808289ioi.1
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 24 Feb 2017 14:20:20 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=bittorrent-com.20150623.gappssmtp.com; s=20150623;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
	:cc; bh=apRFdcYHU8VoMC5fRKoY9Bz/aL0S1caImkffHdneXmA=;
	b=P9HiKMVKvbUe6MDFOTyEJbLClhDC1tO6sVNKzQQ2UaHH5Nfl3SBQKijhUrpMn80gtn
	rRNH8baBwhrCp0o8Pf4fMSgJD9JPA1xGtzfOU0K+koZQC2o4vaiURUfYyGmFKV1ZRLGT
	9Xoxuhd8dfs2DqNA0dq2LFkSw/wt+rcXcwWhdXypBWjcdv8xi+ETmSaYXX7+a6O/n9RK
	57EMqpxL6elA1+EsFPD1BuwBnc+101DjX0azXQ24stNC7dz0qtuRKoFayS62AbYbohAu
	QxRKlCz4FHAG0ev6Vz4hP4UqBABB5A7b5VXWw4m5jyeC1Dm8tc/yeqsPvLH3zv2aXBm/
	mKZw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to:cc;
	bh=apRFdcYHU8VoMC5fRKoY9Bz/aL0S1caImkffHdneXmA=;
	b=sp0TSSLXhaNVeYqLaE3wWBA3AZQ7oZeUEEChkL6A9w6OyopjwnsMZhD20J4CZ81C1R
	qjNOoD1ADF1qWwrcyBFhUY+ohJ/iNw5/mLlO69yPWpu89ngtaoGJnC5ZcSd+gWZFo7Uz
	hQLDbVLI7us+E8Q95ssAdFa8QH+dDjwyVs8OJWG05OxaEbeeCF8Gn6pIX8AJIQ9YsT5S
	WSCr2C5IY+8DI2MdpYNB9VPTdjt4XQcFYxWmwaCgmdkvLO/9UofgboLAI+MchzmQft1c
	PsL02gV8CeiUb0y1sE/+axFpIHVXMzmimhUa1tdFxw8/xDqexNgfWRIX1bMWSCZEG0GX
	BXHg==
X-Gm-Message-State: AMke39mDzUATkS+F3vrMNT8POEs6i7k54A1sTqNQwB4OcEJViw6IRe+JPNsk4osENr+JUHfNKuNOyhwZcpkt6Gwu
X-Received: by 10.107.26.205 with SMTP id a196mr4651415ioa.214.1487974820033; 
	Fri, 24 Feb 2017 14:20:20 -0800 (PST)
MIME-Version: 1.0
Received: by 10.36.73.150 with HTTP; Fri, 24 Feb 2017 14:20:19 -0800 (PST)
In-Reply-To: <20170224043613.GA32502@savin.petertodd.org>
References: <CAAcC9ys5sUxVfOjogFiF3gzk51D_L=QQkOYevTH=qbh_RkA3Hw@mail.gmail.com>
	<CA+KqGkrUneGe4yORi=JAAWzoO0UftMUuJm3S-__W5sBh-+T1vQ@mail.gmail.com>
	<20170223235105.GA28497@savin.petertodd.org>
	<CA+KqGkowxEZeAFYa2JJchBDtRkg1p3YZNocivzu3fAtgRLDRBQ@mail.gmail.com>
	<20170224010943.GA29218@savin.petertodd.org>
	<CA+KqGkrOK76S3ffPJmpqYcBwtSeKESqN16yZsrwzDR6JZZmwFA@mail.gmail.com>
	<20170224025811.GA31911@savin.petertodd.org>
	<CA+KqGkq7gavAnAk-tcA+gxL2sWpv3ENhEmHrQHaPdyAsKrLjGg@mail.gmail.com>
	<20170224031531.GA32118@savin.petertodd.org>
	<CA+KqGkrfhg3GnbWwvKXHQ2NWuCnfzYyTPUxRhzYMuDBiNQR4eA@mail.gmail.com>
	<20170224043613.GA32502@savin.petertodd.org>
From: Bram Cohen <bram@bittorrent.com>
Date: Fri, 24 Feb 2017 14:20:19 -0800
Message-ID: <CA+KqGkpi4GvgU-K6vt-U5ZN4AkpjZ0rruzddoJS4-V0TcnyqUQ@mail.gmail.com>
To: Peter Todd <pete@petertodd.org>
Content-Type: multipart/alternative; boundary=001a113fd05695bf9205494e22bc
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,HTML_MESSAGE,RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] A Better MMR Definition
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 24 Feb 2017 22:20:22 -0000

--001a113fd05695bf9205494e22bc
Content-Type: text/plain; charset=UTF-8

So your idea is to cluster entries by entry time because newer things are
more likely to leave and updating multiple things near each other is
cheaper?

That can be done with my tool. Instead of using hashes for the values being
stored, you use position entries. The first entry gets a value of all
zeros, the next one a one followed by all zeros, then the next two
correspond to the first two with the second bit flipped to one, then the
next four the first four with the third bit flipped to one, etc. It
probably performs a little bit better to do it two bits at a time instead
of one so that the entries are 00, 01, 10, 11, 0001, 0010, 0011, 0101,
0110, 0111, 1001, etc. If you were to really use this you'd probably want
to to add some optimizations to use the fact that the terminals fit in 64
bits instead of 256, but it mostly works unchanged, and gets whatever
benefits there are to this clustering plus the high performance
implementation tricks I've built which I keep complaining that nobody's
giving feedback on.

I'm not sold on this being a win: The empirical access patterns are
unknown, it requires an extra cache miss per lookup to find the entry
number, it may be that everything is optimized well enough without it for
there to be no meaningful gains, and it's a bunch of extra complexity. What
should be done is that a plain vanilla UTXO set solution is optimized as
well as it can be first, and then the insertion ordering trick is tried as
an optimization to see if it's an improvement. Without that baseline
there's no meaningful basis for comparison, and I'm quite confident that a
naive implementation which just allocates individual nodes will
underperform the thing I've come up with, even without adding optimizations
related to fitting in 64 bits.

On Thu, Feb 23, 2017 at 8:36 PM, Peter Todd <pete@petertodd.org> wrote:

> On Thu, Feb 23, 2017 at 07:32:43PM -0800, Bram Cohen wrote:
> > On Thu, Feb 23, 2017 at 7:15 PM, Peter Todd <pete@petertodd.org> wrote:
> >
> > >
> > > Glad we're on the same page with regard to what's possible in TXO
> > > commitments.
> > >
> > > Secondly, am I correct in saying your UTXO commitments scheme requires
> > > random
> > > access? While you describe it as a "merkle set", obviously to be
> merkelized
> > > it'll have to have an ordering of some kind. What do you propose that
> > > ordering
> > > to be?
> > >
> >
> > The ordering is by the bits in the hash. Technically it's a Patricia
> Trie.
> > I'm using 'merkle tree' to refer to basically anything with a hash root.
>
> The hash of what? The values in the set?
>
> > > Maybe more specifically, what exact values do you propose to be in the
> set?
> > >
> > >
> > That is unspecified in the implementation, it just takes a 256 bit value
> > which is presumably a hash of something. The intention is to nail down a
> > simple format and demonstrate good performance and leave those semantics
> to
> > a higher layer. The simplest thing would be to hash together the txid and
> > output number.
>
> Ok, so let's assume the values in the set are the unspent outpoints.
>
> Since we're ordering by the hash of the values in the set, outpoints will
> be
> distributed uniformly in the set, and thus the access pattern of data in
> the
> set is uniform.
>
> Now let's fast-forward 10 years. For the sake of argument, assume that for
> every 1 UTXO in the set that corresponds to funds in someone's wallet that
> are
> likely to be spent, there are 2^12 = 4096 UTXO's that have been permanently
> lost (and/or created in spam attacks) and thus will never be spent.
>
> Since lost UTXO's are *also* uniformly distributed, if I'm processing a new
> block that spends 2^12 = 4096 UTXO's, on average for each UTXO spent, I'll
> have to update log2(4096) = 12 more digests than I would have had those
> "dead"
> UTXO's not existed.
>
> Concretely, imagine our UTXO set had just 8 values in it, and we were
> updating
> two of them:
>
>                #
>               / \
>              /   \
>             /     \
>            /       \
>           /         \
>          #           #
>         / \         / \
>        /   \       /   \
>       #     .     .     #
>      / \   / \   / \   / \
>     .   X .   . .   . X   .
>
> To mark two coins as spent, we've had to update 5 inner nodes.
>
>
> Now let's look at what happens in an insertion-ordered TXO commitment
> scheme.
> For sake of argument, let's assume the best possible case, where every UTXO
> spent in that same block was recently created. Since the UTXO's are
> recently
> created, chances are almost every single one of those "dead" UTXO's will
> have
> been created in the past. Thus, since this is an insertion-ordered data
> structure, those UTXO's exist in an older part of the data structure that
> our
> new block doesn't need to modify at all.
>
> Concretely, again let's imagine a TXO commitment with 8 values in it, and
> two
> of them being spent:
>
>                #
>               / \
>              /   \
>             /     \
>            /       \
>           /         \
>          .           #
>         / \         / \
>        /   \       /   \
>       .     .     .     #
>      / \   / \   / \   / \
>     .   . .   . .   . X   X
>
> To mark two coins as spent, we've only had to update 3 inner nodes; while
> our
> tree is higher with those lost coins, those extra inner nodes are amortised
> across all the coins we have to update.
>
>
> The situation gets even better when we look at the *new* UTXO's that our
> block
> creates. Suppose our UTXO set has size n. To mark a single coin as spent,
> we
> have to update log2(n) inner nodes. We do get to amortise this a bit at
> the top
> levels in the tree, but even if we assume the amortisation is totally free,
> we're updating at least log2(n) - log2(m) inner nodes "under" the amortised
> nodes at the top of the tree for *each* new node.
>
> Meanwhile with an insertion-ordered TXO commitment, each new UTXO added to
> the
> data set goes in the same place - the end. So almost none of the existing
> data
> needs to be touched to add the new UTXOs. Equally, the hashing required
> for the
> new UTXO's can be done in an incremental fashion that's very L1/L2 cache
> friendly.
>
>
> tl;dr: Precisely because access patterns in TXO commitments are *not*
> uniform,
> I think we'll find that from a L1/L2/etc cache perspective alone, TXO
> commitments will result in better performance than UTXO commitments.
>
>
> Now it is true that Bitcoin's current design means we'll need a map of
> confirmed outpoints to TXO insertion order indexes. But it's not
> particularly
> hard to add that "metadata" to transactions on the P2P layer in the same
> way
> that segwit added witnesses to transactions without modifying how txids
> were
> calculated; if you only connect to peers who provide you with TXO index
> information in blocks and transactions, you don't need to keep that map
> yourself.
>
> Finally, note how this makes transactions *smaller* in many circumstances:
> it's
> just a 8-byte max index rather than a 40 byte outpoint.
>
> --
> https://petertodd.org 'peter'[:-1]@petertodd.org
>

--001a113fd05695bf9205494e22bc
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">So your idea is to cluster entries by entry time because n=
ewer things are more likely to leave and updating multiple things near each=
 other is cheaper?<div><br></div><div>That can be done with my tool. Instea=
d of using hashes for the values being stored, you use position entries. Th=
e first entry gets a value of all zeros, the next one a one followed by all=
 zeros, then the next two correspond to the first two with the second bit f=
lipped to one, then the next four the first four with the third bit flipped=
 to one, etc. It probably performs a little bit better to do it two bits at=
 a time instead of one so that the entries are 00, 01, 10, 11, 0001, 0010, =
0011, 0101, 0110, 0111, 1001, etc. If you were to really use this you&#39;d=
 probably want to to add some optimizations to use the fact that the termin=
als fit in 64 bits instead of 256, but it mostly works unchanged, and gets =
whatever benefits there are to this clustering plus the high performance im=
plementation tricks I&#39;ve built which I keep complaining that nobody&#39=
;s giving feedback on.</div><div><br></div><div>I&#39;m not sold on this be=
ing a win: The empirical access patterns are unknown, it requires an extra =
cache miss per lookup to find the entry number, it may be that everything i=
s optimized well enough without it for there to be no meaningful gains, and=
 it&#39;s a bunch of extra complexity. What should be done is that a plain =
vanilla UTXO set solution is optimized as well as it can be first, and then=
 the insertion ordering trick is tried as an optimization to see if it&#39;=
s an improvement. Without that baseline there&#39;s no meaningful basis for=
 comparison, and I&#39;m quite confident that a naive implementation which =
just allocates individual nodes will underperform the thing I&#39;ve come u=
p with, even without adding optimizations related to fitting in 64 bits.</d=
iv></div><div class=3D"gmail_extra"><br><div class=3D"gmail_quote">On Thu, =
Feb 23, 2017 at 8:36 PM, Peter Todd <span dir=3D"ltr">&lt;<a href=3D"mailto=
:pete@petertodd.org" target=3D"_blank">pete@petertodd.org</a>&gt;</span> wr=
ote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border=
-left:1px #ccc solid;padding-left:1ex"><span class=3D"">On Thu, Feb 23, 201=
7 at 07:32:43PM -0800, Bram Cohen wrote:<br>
&gt; On Thu, Feb 23, 2017 at 7:15 PM, Peter Todd &lt;<a href=3D"mailto:pete=
@petertodd.org">pete@petertodd.org</a>&gt; wrote:<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt; Glad we&#39;re on the same page with regard to what&#39;s possibl=
e in TXO<br>
&gt; &gt; commitments.<br>
&gt; &gt;<br>
&gt; &gt; Secondly, am I correct in saying your UTXO commitments scheme req=
uires<br>
&gt; &gt; random<br>
&gt; &gt; access? While you describe it as a &quot;merkle set&quot;, obviou=
sly to be merkelized<br>
&gt; &gt; it&#39;ll have to have an ordering of some kind. What do you prop=
ose that<br>
&gt; &gt; ordering<br>
&gt; &gt; to be?<br>
&gt; &gt;<br>
&gt;<br>
&gt; The ordering is by the bits in the hash. Technically it&#39;s a Patric=
ia Trie.<br>
&gt; I&#39;m using &#39;merkle tree&#39; to refer to basically anything wit=
h a hash root.<br>
<br>
</span>The hash of what? The values in the set?<br>
<span class=3D""><br>
&gt; &gt; Maybe more specifically, what exact values do you propose to be i=
n the set?<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; That is unspecified in the implementation, it just takes a 256 bit val=
ue<br>
&gt; which is presumably a hash of something. The intention is to nail down=
 a<br>
&gt; simple format and demonstrate good performance and leave those semanti=
cs to<br>
&gt; a higher layer. The simplest thing would be to hash together the txid =
and<br>
&gt; output number.<br>
<br>
</span>Ok, so let&#39;s assume the values in the set are the unspent outpoi=
nts.<br>
<br>
Since we&#39;re ordering by the hash of the values in the set, outpoints wi=
ll be<br>
distributed uniformly in the set, and thus the access pattern of data in th=
e<br>
set is uniform.<br>
<br>
Now let&#39;s fast-forward 10 years. For the sake of argument, assume that =
for<br>
every 1 UTXO in the set that corresponds to funds in someone&#39;s wallet t=
hat are<br>
likely to be spent, there are 2^12 =3D 4096 UTXO&#39;s that have been perma=
nently<br>
lost (and/or created in spam attacks) and thus will never be spent.<br>
<br>
Since lost UTXO&#39;s are *also* uniformly distributed, if I&#39;m processi=
ng a new<br>
block that spends 2^12 =3D 4096 UTXO&#39;s, on average for each UTXO spent,=
 I&#39;ll<br>
have to update log2(4096) =3D 12 more digests than I would have had those &=
quot;dead&quot;<br>
UTXO&#39;s not existed.<br>
<br>
Concretely, imagine our UTXO set had just 8 values in it, and we were updat=
ing<br>
two of them:<br>
<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0#<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 / \<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /=C2=A0 =C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=A0 =C2=A0 =C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0#=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0#<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 / \=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/ \<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=A0 =C2=A0\=C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=
=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 #=C2=A0 =C2=A0 =C2=A0.=C2=A0 =C2=A0 =C2=A0.=C2=A0 =C2=
=A0 =C2=A0#<br>
=C2=A0 =C2=A0 =C2=A0/ \=C2=A0 =C2=A0/ \=C2=A0 =C2=A0/ \=C2=A0 =C2=A0/ \<br>
=C2=A0 =C2=A0 .=C2=A0 =C2=A0X .=C2=A0 =C2=A0. .=C2=A0 =C2=A0. X=C2=A0 =C2=
=A0.<br>
<br>
To mark two coins as spent, we&#39;ve had to update 5 inner nodes.<br>
<br>
<br>
Now let&#39;s look at what happens in an insertion-ordered TXO commitment s=
cheme.<br>
For sake of argument, let&#39;s assume the best possible case, where every =
UTXO<br>
spent in that same block was recently created. Since the UTXO&#39;s are rec=
ently<br>
created, chances are almost every single one of those &quot;dead&quot; UTXO=
&#39;s will have<br>
been created in the past. Thus, since this is an insertion-ordered data<br>
structure, those UTXO&#39;s exist in an older part of the data structure th=
at our<br>
new block doesn&#39;t need to modify at all.<br>
<br>
Concretely, again let&#39;s imagine a TXO commitment with 8 values in it, a=
nd two<br>
of them being spent:<br>
<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0#<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 / \<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /=C2=A0 =C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=A0 =C2=A0 =C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0.=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0#<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 / \=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/ \<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=A0 =C2=A0\=C2=A0 =C2=A0 =C2=A0 =C2=A0/=C2=
=A0 =C2=A0\<br>
=C2=A0 =C2=A0 =C2=A0 .=C2=A0 =C2=A0 =C2=A0.=C2=A0 =C2=A0 =C2=A0.=C2=A0 =C2=
=A0 =C2=A0#<br>
=C2=A0 =C2=A0 =C2=A0/ \=C2=A0 =C2=A0/ \=C2=A0 =C2=A0/ \=C2=A0 =C2=A0/ \<br>
=C2=A0 =C2=A0 .=C2=A0 =C2=A0. .=C2=A0 =C2=A0. .=C2=A0 =C2=A0. X=C2=A0 =C2=
=A0X<br>
<br>
To mark two coins as spent, we&#39;ve only had to update 3 inner nodes; whi=
le our<br>
tree is higher with those lost coins, those extra inner nodes are amortised=
<br>
across all the coins we have to update.<br>
<br>
<br>
The situation gets even better when we look at the *new* UTXO&#39;s that ou=
r block<br>
creates. Suppose our UTXO set has size n. To mark a single coin as spent, w=
e<br>
have to update log2(n) inner nodes. We do get to amortise this a bit at the=
 top<br>
levels in the tree, but even if we assume the amortisation is totally free,=
<br>
we&#39;re updating at least log2(n) - log2(m) inner nodes &quot;under&quot;=
 the amortised<br>
nodes at the top of the tree for *each* new node.<br>
<br>
Meanwhile with an insertion-ordered TXO commitment, each new UTXO added to =
the<br>
data set goes in the same place - the end. So almost none of the existing d=
ata<br>
needs to be touched to add the new UTXOs. Equally, the hashing required for=
 the<br>
new UTXO&#39;s can be done in an incremental fashion that&#39;s very L1/L2 =
cache<br>
friendly.<br>
<br>
<br>
tl;dr: Precisely because access patterns in TXO commitments are *not* unifo=
rm,<br>
I think we&#39;ll find that from a L1/L2/etc cache perspective alone, TXO<b=
r>
commitments will result in better performance than UTXO commitments.<br>
<br>
<br>
Now it is true that Bitcoin&#39;s current design means we&#39;ll need a map=
 of<br>
confirmed outpoints to TXO insertion order indexes. But it&#39;s not partic=
ularly<br>
hard to add that &quot;metadata&quot; to transactions on the P2P layer in t=
he same way<br>
that segwit added witnesses to transactions without modifying how txids wer=
e<br>
calculated; if you only connect to peers who provide you with TXO index<br>
information in blocks and transactions, you don&#39;t need to keep that map=
<br>
yourself.<br>
<br>
Finally, note how this makes transactions *smaller* in many circumstances: =
it&#39;s<br>
just a 8-byte max index rather than a 40 byte outpoint.<br>
<div class=3D"HOEnZb"><div class=3D"h5"><br>
--<br>
<a href=3D"https://petertodd.org" rel=3D"noreferrer" target=3D"_blank">http=
s://petertodd.org</a> &#39;peter&#39;[:-1]@<a href=3D"http://petertodd.org"=
 rel=3D"noreferrer" target=3D"_blank">petertodd.org</a><br>
</div></div></blockquote></div><br></div>

--001a113fd05695bf9205494e22bc--