diff options
author | Peter Todd <pete@petertodd.org> | 2016-06-17 00:34:35 -0400 |
---|---|---|
committer | bitcoindev <bitcoindev@gnusha.org> | 2016-06-17 04:34:43 +0000 |
commit | 490eca7ffa7da4626a3729c2e714d96ba2ee5a14 (patch) | |
tree | 87ade4ec504416a4d86827ef2139480526bd1c03 | |
parent | c81d0794122576abb1875dd410bf499e9e363bc8 (diff) | |
download | pi-bitcoindev-490eca7ffa7da4626a3729c2e714d96ba2ee5a14.tar.gz pi-bitcoindev-490eca7ffa7da4626a3729c2e714d96ba2ee5a14.zip |
Re: [bitcoin-dev] Merkle trees and mountain ranges
-rw-r--r-- | d1/5ce84d761db9ad8cddc04ca48d7db2a15303ae | 306 |
1 files changed, 306 insertions, 0 deletions
diff --git a/d1/5ce84d761db9ad8cddc04ca48d7db2a15303ae b/d1/5ce84d761db9ad8cddc04ca48d7db2a15303ae new file mode 100644 index 000000000..8f14c7843 --- /dev/null +++ b/d1/5ce84d761db9ad8cddc04ca48d7db2a15303ae @@ -0,0 +1,306 @@ +Return-Path: <pete@petertodd.org> +Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org + [172.17.192.35]) + by mail.linuxfoundation.org (Postfix) with ESMTPS id 2742E7AA + for <bitcoin-dev@lists.linuxfoundation.org>; + Fri, 17 Jun 2016 04:34:43 +0000 (UTC) +X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 +Received: from outmail148114.authsmtp.net (outmail148114.authsmtp.net + [62.13.148.114]) + by smtp1.linuxfoundation.org (Postfix) with ESMTP id 16C2B170 + for <bitcoin-dev@lists.linuxfoundation.org>; + Fri, 17 Jun 2016 04:34:41 +0000 (UTC) +Received: from mail-c247.authsmtp.com (mail-c247.authsmtp.com [62.13.128.247]) + by punt20.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5H4YdSm072714; + Fri, 17 Jun 2016 05:34:39 +0100 (BST) +Received: from petertodd.org (ec2-52-5-185-120.compute-1.amazonaws.com + [52.5.185.120]) (authenticated bits=0) + by mail.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5H4YaOu039543 + (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); + Fri, 17 Jun 2016 05:34:37 +0100 (BST) +Received: from [127.0.0.1] (localhost [127.0.0.1]) + by petertodd.org (Postfix) with ESMTPSA id AB44840110; + Fri, 17 Jun 2016 04:32:38 +0000 (UTC) +Received: by localhost (Postfix, from userid 1000) + id 111FA20738; Fri, 17 Jun 2016 00:34:35 -0400 (EDT) +Date: Fri, 17 Jun 2016 00:34:35 -0400 +From: Peter Todd <pete@petertodd.org> +To: Bram Cohen <bram@bittorrent.com> +Message-ID: <20160617043435.GA12800@fedora-21-dvm> +References: <CA+KqGkosGrWQ2Lpg3Ptc+UyidK7H4_HLQ_QWx2DOAFRmLkv4zQ@mail.gmail.com> + <20160616001040.GA5026@fedora-21-dvm> + <CA+KqGkqAHcU2PzEX6OmorXRBQ22eF_QBYYqUDS1v_ZvevhLCuQ@mail.gmail.com> + <20160616032612.GA7792@fedora-21-dvm> + <CA+KqGkqA2WnvBEck3kv6p2no-9wzNVCTNA-MGw=Jg=gMGfrxUQ@mail.gmail.com> +MIME-Version: 1.0 +Content-Type: multipart/signed; micalg=pgp-sha256; + protocol="application/pgp-signature"; boundary="lrZ03NoBR/3+SXJZ" +Content-Disposition: inline +In-Reply-To: <CA+KqGkqA2WnvBEck3kv6p2no-9wzNVCTNA-MGw=Jg=gMGfrxUQ@mail.gmail.com> +User-Agent: Mutt/1.5.23 (2014-03-12) +X-Server-Quench: cc9f5dec-3444-11e6-bcde-0015176ca198 +X-AuthReport-Spam: If SPAM / abuse - report it at: + http://www.authsmtp.com/abuse +X-AuthRoute: OCd2Yg0TA1ZNQRgX IjsJECJaVQIpKltL GxAVKBZePFsRUQkR + aAdMdwUUEkAaAgsB AmAbW1FeU1l7WmQ7 bghPaBtcak9QXgdq + T0pMXVMcUQAQBXVQ eVseURB3cwYIenxx ZEYsDSNSCkEuIBVg + QUpQEXAHZDJmdTJM BBVFdwNVdQJNeEwU a1l3GhFYa3VsNCMk + FAgyOXU9MCtqYAFY WAIJIBoOW0sGBXY1 QRxKGDIyG1EMRiN7 NRUgJVMHdAAA +X-Authentic-SMTP: 61633532353630.1038:706 +X-AuthFastPath: 0 (Was 255) +X-AuthSMTP-Origin: 52.5.185.120/25 +X-AuthVirus-Status: No virus detected - but ensure you scan with your own + anti-virus system. +X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW + autolearn=ham version=3.3.1 +X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on + smtp1.linux-foundation.org +Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org> +Subject: Re: [bitcoin-dev] Merkle trees and mountain ranges +X-BeenThere: bitcoin-dev@lists.linuxfoundation.org +X-Mailman-Version: 2.1.12 +Precedence: list +List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org> +List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, + <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe> +List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/> +List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org> +List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help> +List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, + <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe> +X-List-Received-Date: Fri, 17 Jun 2016 04:34:43 -0000 + + +--lrZ03NoBR/3+SXJZ +Content-Type: text/plain; charset=us-ascii +Content-Disposition: inline +Content-Transfer-Encoding: quoted-printable + +On Thu, Jun 16, 2016 at 02:07:26AM -0700, Bram Cohen wrote: +> On Wed, Jun 15, 2016 at 8:26 PM, Peter Todd <pete@petertodd.org> wrote: +> Okay, clearly my assumptions about the parts of that post I didn't read +> carefully were way off. I'll have to look through it carefully to be able +> to make coherent apples to apples comparisons. + +Thanks! + +> > I'm worried that once there's real transaction fees everyone might stop +> > > consolidating dust and the set of unspent transactions might grow wit= +hout +> > > bound as well, but that's a topic for another day. +> > +> > Ok, but then if you're concerned about that risk, why introduce a data +> > structure - the STXO set - that's _guaranteed_ to grow without bound? +> > +>=20 +> I'm not proposing STXO set commitments either. My point was that there +> should be incentives for collecting dust. That has nothing to do with this +> thread though and should be discussed separately (also I don't feel like +> discussing it because I don't have a good proposal). + +Ah, yeah, I misunderstood you there; as expected absolutely no-one is propo= +sing +STXO set commitments. :) + +> > > The main differences to your patricia trie are the non-padding sha256= + and +> > > that each level doesn't hash in a record of its depth and the usage of +> > > ONLY0 and ONLY1. +> > +> > I'm rather confused, as the above sounds nothing like what I've +> > implemented, +> > which only has leaf nodes, inner nodes, and the special empty node +> > singleton, +> > for both the MMR and merbinner trees. +> > +>=20 +> It's quite a bit like merbinner trees. I've basically taken the leaf nodes +> and smushed them into the inner nodes above them, thus saving a hashing +> operation and some memory. They're both binary radix trees. + +Ah, I see what you mean now. + +So above you said that in merbinner trees each node "hash[es] in a record of +its depth" That's actually incorrect: each node commits to the prefix that = +all +keys below that level start with, not just the depth. + +This means that in merbinner trees, cases where multiple keys share parts of +the same prefix are handled efficiently, without introducing extra levels +unnecessarily; there's no need for the ONLY0/1 nodes as the children of an +inner node will always be on different sides. + +When keys are randomly distributed, this isn't a big deal; OTOH against +attackers who are choosing keys, e.g. by grinding hashes, merbinner trees +always have maximum depths in proportion to log2(n) of the actual number of +items in the tree. Grinding is particularly annoying to deal with due to the +birthday attack: creating a ground prefix 64 bits long only takes 32 bits w= +orth +of work. + + +In my deterministic expressions work one of the ideas I've been tossing aro= +und +is rather than always using hash digests directly for when you need to comm= +it +to some data, we could instead extend the idea of a digest to that of a +"commitment", where a commitment is simply some short, but variable-sized, +string that uniquely maps to a given set of data. Secondly, commitments do +*not* always guarantee that the original data can't be recovered from the +commitment itself. + +By allowing commitments to be variable sized - say 0 to ~64 bytes - we get a +number of advantages: + +1) Data shorter than the length of a digest (32 bytes) can be included in t= +he +commitment itself, improving efficiency. + +2) Data a little longer than a digest can have hashing delayed, to better f= +ill +up blocks. + +In particular, case #2 handles your leaf node optimizations generically, +without special cases and additional complexity. It'd also be a better way = +to +do the ONLY0/1 cases, as if the "nothing on this side" symbol is a single b= +yte, +each additional colliding level would simply extend the commitment without +hashing. In short, you'd have nearly the same level of optimization even if= + at +the cryptography level your tree consists of only leaves, inner nodes, and = +nil. + +Another advantage of variable sized commitments is that it can help make cl= +ear +to users when it's possible to brute force the message behind the commitmen= +t. +For instance, digest from a hashed four byte integer can be trivially rever= +sed +by just trying all combinations. Equally, if that integer is concatenated w= +ith +a 32 byte digest that the attacker knows, the value of the integer can be b= +rute +forced. + +> > Technically even a patricia trie utxo commitment can have sub-1 cache +> > > misses per update if some of the updates in a single block are close = +to +> > > each other in memory. I think I can get practical Bitcoin updates down +> > to a +> > > little bit less than one l2 cache miss per update, but not a lot less. +> > +> > I'm very confused as to why you think that's possible. When you say +> > "practical +> > Bitcoin updates", what exactly is the data structure you're proposing to +> > update? How is it indexed? +>=20 +>=20 +> My calculations are: a Bitcoin block contains about 2000 updates. The l2 +> cache is about 256 kilobytes, and if an update is about 32 bytes times two +> for the parents, grandparents, etc. then an l2 cache can contain about 40= +00 +> values. If the current utxo size is about 2000 * 4000 =3D 8,000,000 in si= +ze +> then about half the pages which contain a transaction will contain a seco= +nd +> one. I think the utxo set is currently about an order of magnitude greater +> than that, so the number of such collisions will be fairly mall, hence my +> 'less than one but not a lot less' comment. + +Your estimate of updates requiring 32 bytes of data is *way* off. + +Each inner node updated on the path to a leaf node will itself require 32 b= +ytes +of data to be fetched - the digest of the sibling. As of block 416,628, the= +re +are 39,167,128 unspent txouts, giving us a tree about 25 levels deep. + +So if I want to update a single leaf, I need to read: + + 25 nodes * 32 bytes/node =3D 800 bytes + +of data. Naively, that'd mean our 2,000 updates needs to read 1.6MB from RA= +M, +which is 6.4x bigger than the L2 cache - it's just not going to fit. + +Taking into account the fact that this is a batched update improves things a +little bit. For a node at level i with random access patterns and N accesses +total our amortised cost is 1/(1 + N/2^i) Summing that over 2,000 leaf upda= +tes +and 25 levels gives us ~29,000 total updates, 0.9MB, which is still a lot +larger than L2 cache. + +While this might fit in L3 cache - usually on the order of megabytes - this= + is +a rather optimistic scenario anyway: we're assuming no other cache pressure= + and +100% hit rate. + +Anyway hashing is pretty slow. The very fast BLAKE2 is about 3 cycles/byte +(SHA256 is about 15 cycles/byte) so hashing that same data would take around +200 cycles, and probably quite a bit more in practice due to overheads from= + our +short message lengths; fetching a cache line from DRAM only takes about 1,0= +00 +cycles. I'd guess that once other overheads are taken into account, even if= + you +could eliminate L2/L3 cache-misses it wouldn't be much of an improvement. + +> As for how it's indexed, at a crypto definition level it's just a binary +> radix tree. In terms of how it's indexed in memory, that involves some +> optimizations to avoid cache misses. Memory is allocated into blocks of +> about the size of an 12 cache (or maybe an l1 cache, it will require some +> testing and optimization). Blocks are either branch blocks, which keep +> everything in fixed positions, or leaf blocks, which contain fixed size +> entries for nodes plus indexes within the same leaf block of their +> children. Branch blocks can have many children which can be either branch +> blocks or leaf blocks, but typically are either all branch blocks or all +> leaf blocks. Branch blocks always have exactly one parent. Leaf blocks +> always have all their inputs come from a single branch block, but there c= +an +> be multiple ones of those. When a branch block overflows it first tries to +> put stuff into the last leaf block it used, and if there's no more room it +> allocates a new one. It's fairly common for branches to have just a few +> leaf children, but they also could have a lot, depending on whether the +> base 2 log of the number of things currently in the set modulo the number +> levels in a branch is a small number. +>=20 +> Usually when an update is done it consists of first checking the +> appropriate output of the root block (it's jumped to directly to avoid +> unnecessary memory lookups. If there's nothing there the algorithm will +> walk back until it finds something.) That leads directly to (usually) +> another branch whose output is jumped to directly again. At Bitcoin utxo +> set sizes that will usually lead to a leaf block, which is then walked do= +wn +> manually to find the actual terminal node, which is then updated, and the +> parent, grandparent, etc. is then marked invalid until something which was +> already marked invalid is hit, and it exits. Calculation of hash values is +> done lazily. + +I think it's safe to say that given our working set is significantly larger +than the L2/L3 cache available, none of the above optimizations are likely = +to +help much. Better to just keep the codebase simple and use standard techniq= +ues. + +--=20 +https://petertodd.org 'peter'[:-1]@petertodd.org + +--lrZ03NoBR/3+SXJZ +Content-Type: application/pgp-signature; name="signature.asc" +Content-Description: Digital signature + +-----BEGIN PGP SIGNATURE----- + +iQEcBAEBCAAGBQJXY33YAAoJEGOZARBE6K+yRpMH/iu6jy2HeYSemU3z0hjnRf/x +h6C7fEGFM0QvQ5xYB6VL+FnRxNMmiG94rLDlUxTlrQ6U3OvFGPZhOYVI1hyUgzFW +98kB1RfWI233sdu/XKAp93zPKTJ7qT/GQpk2MQG58/gJmtKFDuWV40FYJlHI+62s +nhzEKoq9HLtCNcXUmfAWj/lGaHMhGQlLVs7O1v26lo7Ez8KO1fxFyrtg97LtVzFz +4qTqG3Lk4ic3snBMq48EVLtKMa+jO0TliZhUshEvtgA/T6FprEv1dpLDPyAs44W5 +8D1T4Uki84SYRnGq1IMfXtran9H61ez+XxYU54f0k3yaT5BPD4v3B7zHlrlgyO4= +=EcTA +-----END PGP SIGNATURE----- + +--lrZ03NoBR/3+SXJZ-- + |