Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id A2CE3956 for ; Thu, 16 Jun 2016 00:10:48 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from outmail148114.authsmtp.net (outmail148114.authsmtp.net [62.13.148.114]) by smtp1.linuxfoundation.org (Postfix) with ESMTP id 9BCC4FB for ; Thu, 16 Jun 2016 00:10:47 +0000 (UTC) Received: from mail-c247.authsmtp.com (mail-c247.authsmtp.com [62.13.128.247]) by punt20.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5G0AjZf072975; Thu, 16 Jun 2016 01:10:45 +0100 (BST) Received: from petertodd.org (ec2-52-5-185-120.compute-1.amazonaws.com [52.5.185.120]) (authenticated bits=0) by mail.authsmtp.com (8.14.2/8.14.2/) with ESMTP id u5G0Afpm029478 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Jun 2016 01:10:42 +0100 (BST) Received: from [127.0.0.1] (localhost [127.0.0.1]) by petertodd.org (Postfix) with ESMTPSA id 6653F40092; Thu, 16 Jun 2016 00:08:45 +0000 (UTC) Received: by localhost (Postfix, from userid 1000) id 7A77520738; Wed, 15 Jun 2016 20:10:40 -0400 (EDT) Date: Wed, 15 Jun 2016 20:10:40 -0400 From: Peter Todd To: Bram Cohen , Bitcoin Protocol Discussion Message-ID: <20160616001040.GA5026@fedora-21-dvm> References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="opJtzjQTFsWo+cga" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Server-Quench: c4091184-3356-11e6-bcde-0015176ca198 X-AuthReport-Spam: If SPAM / abuse - report it at: http://www.authsmtp.com/abuse X-AuthRoute: OCd2Yg0TA1ZNQRgX IjsJECJaVQIpKltL GxAVKBZePFsRUQkR aAdMdwQUEkAaAgsB AmAbW1VeUV17XWA7 bghPaBtcak9QXgdq T0pMXVMcUQAfAW1X RkMeUBB2cA0IeXZ3 Y0AsDXRbVUV7fUJg QU1RE3AHZDJmdTJM BBVFdwNVdQJNeEwU a1l3GhFYa3VsNCMk FAgyOXU9MCtqYAFY WAIJIBoOW0sGBXY1 QRxKGDIyG1EMRiN7 NRUgJVMHdAAA X-Authentic-SMTP: 61633532353630.1038:706 X-AuthFastPath: 0 (Was 255) X-AuthSMTP-Origin: 52.5.185.120/25 X-AuthVirus-Status: No virus detected - but ensure you scan with your own anti-virus system. X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: Re: [bitcoin-dev] Merkle trees and mountain ranges X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jun 2016 00:10:48 -0000 --opJtzjQTFsWo+cga Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jun 14, 2016 at 05:14:23PM -0700, Bram Cohen via bitcoin-dev wrote: > This is in response to Peter Todd's proposal for Merkle Mountain Range > commitments in blocks: >=20 > https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012715.h= tml >=20 > I'm in strong agreement that there's a compelling need to put UTXO > commitments in blocks, and that the big barrier to getting it done is > performance, particularly latency. But I have strong disagreements (or > perhaps the right word is skepticism) about the details. >=20 > Peter proposes that there should be both UTXO and STXO commitments, and No, that's incorrect - I'm only proposing TXO commitments, not UTXO nor STXO commitments. > they should be based on Merkle Mountain Ranges based on Patricia Tries. My > first big disagreement is about the need for STXO commitments. I think > they're unnecessary and a performance problem. The STXO set is much larger > than the utxo set and requires much more memory and horespower to maintai= n. Again, I'm not proposing STXO commitments precisely because the set of _spe= nt_ transactions grows without bound. TXO commitments with committed sums of remaining unspent TXO's and with pruning of old history are special in this regard, because once spent the data associated with spent transactions can = be discarded completely, and at the same time, data associated with old history can be pruned with responsibility for keeping it resting on the shoulders of those owning those coins. > Most if not all of its functionality can in practice be done using the ut= xo > set. Almost anything accepting proofs of inclusion and exclusion will have > a complete history of block headers, so to prove inclusion in the stxo set > you can use a utxo proof of inclusion in the past and a proof of exclusion > for the most recent block. In the case of a txo which has never been > included at all, it's generally possible to show that an ancestor of the > txo in question was at one point included but that an incompatible > descendant of it (or the ancestor itself) is part of the current utxo set. > Generating these sorts of proofs efficiently can for some applications > require a complete STXO set, but that can done with a non-merkle set, > getting the vastly better performance of an ordinary non-cryptographic > hashtable. TXO commitments allows you to do all of this without requiring miners to ha= ve unbounded storage to create new blocks. > The fundamental approach to handling the latency problem is to have the > utxo commitments trail a bit. Computing utxo commitments takes a certain > amount of time, too much to hold up block propagation but still hopefully > vastly less than the average amount of time between blocks. Trailing by a > single block is probably a bad idea because you sometimes get blocks back > to back, but you never get blocks back to back to back to back. Having the > utxo set be trailing by a fixed amount - five blocks is probably excessive > - would do a perfectly good job of keeping latency from every becoming an > issue. Smaller commitments for the utxos added and removed in each block > alone could be added without any significant performance penalty. That way > all blocks would have sufficient commitments for a completely up to date > proofs of inclusion and exclusion. This is not a controversial approach. Agreed - regardless of approach adding latency to commitment calculations of all kinds is something I think we all agree can work in principle, although obviously it should be a last resort technique when optimization fails. > Now I'm going to go out on a limb. My thesis is that usage of a mountain > range is unnecessary, and that a merkle tree in the raw can be made > serviceable by sprinkling magic pixie dust on the performance problem. It'd help if you specified exactly what type of merkle tree you're talking about here; remember that the certificate transparency RFC appears to have reinvented merkle mountain ranges, and they call them "merkle trees". Bitc= oin meanwhile uses a so-called "merkle tree" that's broken, and Zcash uses a partially filled fixed-sized perfect tree. > There are two causes of performance problems for merkle trees: hashing > operations and memory cache misses. For hashing functions, the difference > between a mountain range and a straight merkle tree is roughly that in a > mountain range there's one operation for each new update times the number > of times that thing will get merged into larger hills. If there are fewer > levels of hills the number of operations is less but the expense of proof > of inclusion will be larger. For raw merkle trees the number of operations > per thing added is the log base 2 of the number of levels in the tree, > minus the log base 2 of the number of things added at once since you can = do > lazy evaluation. For practical Bitcoin there are (very roughly) a million > things stored, or 20 levels, and there are (even more roughly) about a > thousand things stored per block, so each thing forces about 20 - 10 =3D = 10 > operations. If you follow the fairly reasonable guideline of mountain ran= ge > hills go up by factors of four, you instead have 20/2 =3D 10 operations p= er > thing added amortized. Depending on details this comparison can go either > way but it's roughly a wash and the complexity of a mountain range is > clearly not worth it at least from the point of view of CPU costs. I'm having a hard time understanding this paragraph; could you explain what= you think is happening when things are "merged into larger hills"? > But CPU costs aren't the main performance problem in merkle trees. The > biggest issues is cache misses, specifically l1 and l2 cache misses. These > tend to take a long time to do, resulting in the CPU spending most of its > time sitting around doing nothing. A naive tree implementation is pretty > much the worst thing you can possibly build from a cache miss standpoint, > and its performance will be completely unacceptable. Mountain ranges do a > fabulous job of fixing this problem, because all their updates are merges > so the metrics are more like cache misses per block instead of cache miss= es > per transaction. > > The magic pixie dust I mentioned earlier involves a bunch of subtle > implementation details to keep cache coherence down which should get the > number of cache misses per transaction down under one, at which point it > probably isn't a bottleneck any more. There is an implementation in the > works here: As UTXO/STXO/TXO sets are all enormously larger than L1/L2 cache, it's impossible to get CPU cache misses below one for update operations. The clo= sest thing to an exception is MMR's, which due to their insertion-ordering could have good cache locality for appends, in the sense that the mountain tips required to recalculate the MMR tip digest will already be in cache from the previous append. But that's not sufficient, as you also need to modify old TXO's further back in the tree to mark them as spent - that data is going t= o be far larger than L1/L2 cache. > https://github.com/bramcohen/MerkleSet >=20 > This implementation isn't finished yet! I'm almost there, and I'm > definitely feeling time pressure now. I've spent quite a lot of time on > this, mostly because of a bunch of technical reworkings which proved > necessary. This is the last time I ever write a database for kicks. But > this implementation is good on all important dimensions, including: >=20 > Lazy root calculation > Few l1 and l2 cache misses > Small proofs of inclusion/exclusion Have you looked at the pruning system that my proofchains work implements? --=20 https://petertodd.org 'peter'[:-1]@petertodd.org --opJtzjQTFsWo+cga Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature -----BEGIN PGP SIGNATURE----- iQEcBAEBCAAGBQJXYe59AAoJEGOZARBE6K+yHDkH/0F8SnYloFxwHvdHe3jtkbX3 o8PTfb1d8xmn2ZX3HMIJ4rMWL7KA2rcOpCLfx976ZpnlmngJzM58j49jtKaXT1wV 0FfNRpgL4xScDOa5abNU/4tzaXRNiQQkPGqXp/NFxmngYHHHK06NuI2l7BjrwGeZ BZWAQRQ4Qd4PWdy1/TaUyY5Zelb4tpF0HRnsg8JH2P6MScp7rsrKePO4ehUSWEQy 6RJZlKsc7GGwTpdXxjihHpUUdTYjy/C8fgeRJZrqDz0lIMOgoZ0YyODL8N5PcWPH 7EQoblIWyhY6eyhv3PzeyeAbKHd+Zc5HgTNU+VsXxhn2ft6Nk3xVmuR2Ely41rI= =on7S -----END PGP SIGNATURE----- --opJtzjQTFsWo+cga--