author    | Bram Cohen <bram@bittorrent.com> | 2017-02-24 14:20:19 -0800
committer | bitcoindev <bitcoindev@gnusha.org> | 2017-02-24 22:20:22 +0000
commit    | eb049d723c552d5fc69a226b75e6468276f9d998 (patch)
tree      | e38fc7e9b3b750637352097f2065437468ed740a /3f/4604fef2a53a84e7c3c10aab47474ad070d17b
parent    | 9c631aa447f44f341a1cc70c4a75560890ab428e (diff)
Re: [bitcoin-dev] A Better MMR Definition
Diffstat (limited to '3f/4604fef2a53a84e7c3c10aab47474ad070d17b')
-rw-r--r-- | 3f/4604fef2a53a84e7c3c10aab47474ad070d17b | 485 |
1 file changed, 485 insertions, 0 deletions
diff --git a/3f/4604fef2a53a84e7c3c10aab47474ad070d17b b/3f/4604fef2a53a84e7c3c10aab47474ad070d17b
new file mode 100644
index 000000000..fe7e0570e

From: Bram Cohen <bram@bittorrent.com>
Date: Fri, 24 Feb 2017 14:20:19 -0800
To: Peter Todd <pete@petertodd.org>
Message-ID: <CA+KqGkpi4GvgU-K6vt-U5ZN4AkpjZ0rruzddoJS4-V0TcnyqUQ@mail.gmail.com>
In-Reply-To: <20170224043613.GA32502@savin.petertodd.org>
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] A Better MMR Definition

So your idea is to cluster entries by entry time because newer things are
more likely to leave and updating multiple things near each other is
cheaper?

That can be done with my tool. Instead of using hashes for the values being
stored, you use position entries. The first entry gets a value of all
zeros, the next one a one followed by all zeros, then the next two
correspond to the first two with the second bit flipped to one, then the
next four the first four with the third bit flipped to one, etc. It
probably performs a little bit better to do it two bits at a time instead
of one so that the entries are 00, 01, 10, 11, 0001, 0010, 0011, 0101,
0110, 0111, 1001, etc. If you were to really use this you'd probably want
to add some optimizations to use the fact that the terminals fit in 64
bits instead of 256, but it mostly works unchanged, and gets whatever
benefits there are to this clustering plus the high-performance
implementation tricks I've built, which I keep complaining that nobody's
giving feedback on.

I'm not sold on this being a win: the empirical access patterns are
unknown, it requires an extra cache miss per lookup to find the entry
number, it may be that everything is optimized well enough without it for
there to be no meaningful gains, and it's a bunch of extra complexity. What
should be done is that a plain vanilla UTXO set solution is optimized as
well as it can be first, and then the insertion-ordering trick is tried as
an optimization to see if it's an improvement. Without that baseline
there's no meaningful basis for comparison, and I'm quite confident that a
naive implementation which just allocates individual nodes will
underperform the thing I've come up with, even without adding optimizations
related to fitting in 64 bits.
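Concretely, the one-bit-at-a-time assignment described above can be sketched
in a few lines of Python (a rough illustration only, not taken from the
actual merkle set code; position_key and the 4-bit display width are made up
for the example):

    # Hypothetical helper: key for the n-th inserted entry. Entry 0 is all
    # zeros, entry 1 is a one followed by zeros, and each later block of
    # 2^k entries repeats the earlier keys with the (k+1)-th bit from the
    # top set.
    def position_key(n, key_bits=256):
        key = 0
        bit = 0
        while n:
            if n & 1:
                key |= 1 << (key_bits - 1 - bit)   # set the bit-th bit from the top
            n >>= 1
            bit += 1
        return key

    # First eight keys, shown on 4 bits for readability:
    print([format(position_key(i, 4), '04b') for i in range(8)])
    # -> ['0000', '1000', '0100', '1100', '0010', '1010', '0110', '1110']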
On Thu, Feb 23, 2017 at 8:36 PM, Peter Todd <pete@petertodd.org> wrote:

> On Thu, Feb 23, 2017 at 07:32:43PM -0800, Bram Cohen wrote:
> > On Thu, Feb 23, 2017 at 7:15 PM, Peter Todd <pete@petertodd.org> wrote:
> >
> > > Glad we're on the same page with regard to what's possible in TXO
> > > commitments.
> > >
> > > Secondly, am I correct in saying your UTXO commitments scheme requires
> > > random access? While you describe it as a "merkle set", obviously to be
> > > merkelized it'll have to have an ordering of some kind. What do you
> > > propose that ordering to be?
> > >
> >
> > The ordering is by the bits in the hash. Technically it's a Patricia
> > Trie. I'm using 'merkle tree' to refer to basically anything with a hash
> > root.
>
> The hash of what? The values in the set?
>
> > > Maybe more specifically, what exact values do you propose to be in the
> > > set?
> > >
> >
> > That is unspecified in the implementation, it just takes a 256 bit value
> > which is presumably a hash of something. The intention is to nail down a
> > simple format and demonstrate good performance and leave those semantics
> > to a higher layer. The simplest thing would be to hash together the txid
> > and output number.
>
> Ok, so let's assume the values in the set are the unspent outpoints.
>
> Since we're ordering by the hash of the values in the set, outpoints will
> be distributed uniformly in the set, and thus the access pattern of data
> in the set is uniform.
>
> Now let's fast-forward 10 years. For the sake of argument, assume that for
> every 1 UTXO in the set that corresponds to funds in someone's wallet that
> are likely to be spent, there are 2^12 = 4096 UTXO's that have been
> permanently lost (and/or created in spam attacks) and thus will never be
> spent.
>
> Since lost UTXO's are *also* uniformly distributed, if I'm processing a
> new block that spends 2^12 = 4096 UTXO's, on average for each UTXO spent,
> I'll have to update log2(4096) = 12 more digests than I would have had
> those "dead" UTXO's not existed.
>
> Concretely, imagine our UTXO set had just 8 values in it, and we were
> updating two of them:
>
>                #
>               / \
>              /   \
>             /     \
>            /       \
>           /         \
>          #           #
>         / \         / \
>        /   \       /   \
>       #     .     .     #
>      / \   / \   / \   / \
>     .   X .   . .   . X   .
>
> To mark two coins as spent, we've had to update 5 inner nodes.
>
> Now let's look at what happens in an insertion-ordered TXO commitment
> scheme. For sake of argument, let's assume the best possible case, where
> every UTXO spent in that same block was recently created. Since the UTXO's
> are recently created, chances are almost every single one of those "dead"
> UTXO's will have been created in the past. Thus, since this is an
> insertion-ordered data structure, those UTXO's exist in an older part of
> the data structure that our new block doesn't need to modify at all.
>
> Concretely, again let's imagine a TXO commitment with 8 values in it, and
> two of them being spent:
>
>                #
>               / \
>              /   \
>             /     \
>            /       \
>           /         \
>          .           #
>         / \         / \
>        /   \       /   \
>       .     .     .     #
>      / \   / \   / \   / \
>     .   . .   . .   . X   X
>
> To mark two coins as spent, we've only had to update 3 inner nodes; while
> our tree is higher with those lost coins, those extra inner nodes are
> amortised across all the coins we have to update.
>
> The situation gets even better when we look at the *new* UTXO's that our
> block creates. Suppose our UTXO set has size n. To mark a single coin as
> spent, we have to update log2(n) inner nodes. We do get to amortise this a
> bit at the top levels in the tree, but even if we assume the amortisation
> is totally free, we're updating at least log2(n) - log2(m) inner nodes
> "under" the amortised nodes at the top of the tree for *each* new node.
>
> Meanwhile with an insertion-ordered TXO commitment, each new UTXO added to
> the data set goes in the same place - the end. So almost none of the
> existing data needs to be touched to add the new UTXOs. Equally, the
> hashing required for the new UTXO's can be done in an incremental fashion
> that's very L1/L2 cache friendly.
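(The two diagrams above can be checked with a small Python sketch, purely
illustrative, using ordinary array-heap indexing rather than either
implementation's actual layout: it counts the distinct inner nodes on the
paths from the spent leaves up to the root.)

    def inner_nodes_touched(num_leaves, spent_positions):
        """Count distinct inner nodes on the paths from spent leaves to the root."""
        touched = set()
        for pos in spent_positions:
            node = num_leaves + pos        # leaves occupy slots [num_leaves, 2*num_leaves)
            while node > 1:
                node //= 2                 # step up to the parent
                touched.add(node)
        return len(touched)

    # Hash-ordered set: the spent leaves land far apart (positions 1 and 6).
    print(inner_nodes_touched(8, [1, 6]))   # -> 5, as in the first diagram
    # Insertion-ordered commitment: the spent leaves sit together at the end.
    print(inner_nodes_touched(8, [6, 7]))   # -> 3, as in the second diagram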
> tl;dr: Precisely because access patterns in TXO commitments are *not*
> uniform, I think we'll find that from an L1/L2/etc cache perspective
> alone, TXO commitments will result in better performance than UTXO
> commitments.
>
> Now it is true that Bitcoin's current design means we'll need a map of
> confirmed outpoints to TXO insertion order indexes. But it's not
> particularly hard to add that "metadata" to transactions on the P2P layer
> in the same way that segwit added witnesses to transactions without
> modifying how txids were calculated; if you only connect to peers who
> provide you with TXO index information in blocks and transactions, you
> don't need to keep that map yourself.
>
> Finally, note how this makes transactions *smaller* in many circumstances:
> it's just an 8-byte max index rather than a 40-byte outpoint.
>
> --
> https://petertodd.org 'peter'[:-1]@petertodd.org
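A rough way to see the "new entries only touch the end" point in code, as a
toy Python sketch rather than the actual proposed commitment structure: an
MMR-style peak list where each append merges at most log2(n) right-edge
digests and hands back the insertion index a transaction could cite instead
of a full outpoint.

    import hashlib

    def h(*parts):
        return hashlib.sha256(b''.join(parts)).digest()

    class InsertionOrderedAccumulator:
        def __init__(self):
            self.peaks = []   # (height, digest) pairs, strictly decreasing heights
            self.count = 0    # total TXOs ever appended

        def append(self, leaf):
            index = self.count            # the small integer a tx could reference
            node = (0, leaf)
            # Merge equal-height peaks, like binary carry propagation; only
            # the right-hand edge of the structure is ever rewritten.
            while self.peaks and self.peaks[-1][0] == node[0]:
                height, left = self.peaks.pop()
                node = (height + 1, h(left, node[1]))
            self.peaks.append(node)
            self.count += 1
            return index

    acc = InsertionOrderedAccumulator()
    for i in range(6):
        acc.append(h(b'txo-%d' % i))
    print(acc.count, [height for height, _ in acc.peaks])   # -> 6 [2, 1]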