summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorTomas <tomas@tomasvdw.nl>2017-04-07 02:17:47 +0200
committerbitcoindev <bitcoindev@gnusha.org>2017-04-07 00:17:50 +0000
commit49345f48f3f8ec141f75e2a9076d92c3e39b0dd1 (patch)
tree6cad5beff5ffd5b72be81623cfffefea74054fc3
parentc5f61d259767f26a274c62391f4c0269d2eb5d6d (diff)
downloadpi-bitcoindev-49345f48f3f8ec141f75e2a9076d92c3e39b0dd1.tar.gz
pi-bitcoindev-49345f48f3f8ec141f75e2a9076d92c3e39b0dd1.zip
Re: [bitcoin-dev] Using a storage engine without UTXO-index
-rw-r--r--ea/288ad81668f83c885497b2c7dc39649c0d001b201
1 files changed, 201 insertions, 0 deletions
diff --git a/ea/288ad81668f83c885497b2c7dc39649c0d001b b/ea/288ad81668f83c885497b2c7dc39649c0d001b
new file mode 100644
index 000000000..7d598475c
--- /dev/null
+++ b/ea/288ad81668f83c885497b2c7dc39649c0d001b
@@ -0,0 +1,201 @@
+Return-Path: <tomas@tomasvdw.nl>
+Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
+ [172.17.192.35])
+ by mail.linuxfoundation.org (Postfix) with ESMTPS id 71FDC6C
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Fri, 7 Apr 2017 00:17:50 +0000 (UTC)
+X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
+Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com
+ [66.111.4.25])
+ by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 6C07426A
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Fri, 7 Apr 2017 00:17:49 +0000 (UTC)
+Received: from compute2.internal (compute2.nyi.internal [10.202.2.42])
+ by mailout.nyi.internal (Postfix) with ESMTP id B39C520931;
+ Thu, 6 Apr 2017 20:17:47 -0400 (EDT)
+Received: from web3 ([10.202.2.213])
+ by compute2.internal (MEProxy); Thu, 06 Apr 2017 20:17:47 -0400
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
+ messagingengine.com; h=content-transfer-encoding:content-type
+ :date:from:in-reply-to:message-id:mime-version:references
+ :subject:to:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; bh=u9fTn6
+ KB0kgxcxJj8xQR2TUydIzJm89lLSn8o+/+t0g=; b=jbC1H4Pai5IriGs/sNhC6q
+ ivxpzVTxmTo79KKG5bIArlQxa8YlSEqte0IqAkC8OedAWv0j7XG3lDQikqM9E1ZA
+ H806SusVkJp/Fpk3NA2RN7HuGTv2vETDHP0GFVf+AHaN15v8UmO5ShURko5aiyKs
+ K8+ZZC8eJ2yezy2m5ILUiIaF07QB2c4Bld9Xvm4SzIFxpI5Rkf8ICWWJvrqXIoCy
+ MLkPhMXDsqnPO2X0uv+6MOsQHHPOVbz/AfaM976nTGWR5UY18IRuA/9M6FJIUe5X
+ LKUG5hGZHsLjUAUe+tYRqZhpXpVjgRqbXUn5oA9Q2D/4BjWrUW2Fmj5Hs0Ftg1DQ
+ ==
+X-ME-Sender: <xms:q9rmWObVbyEO9T3VeXOG1a7g0ErL8qxg2I6QnBJhdZqXHiCwwWZe1Q>
+Received: by mailuser.nyi.internal (Postfix, from userid 99)
+ id 868AD9EC32; Thu, 6 Apr 2017 20:17:47 -0400 (EDT)
+Message-Id: <1491524267.715789.936916864.1156D8CB@webmail.messagingengine.com>
+From: Tomas <tomas@tomasvdw.nl>
+To: Eric Voskuil <eric@voskuil.org>,
+ Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>,
+ Libbitcoin Development <libbitcoin@lists.dyne.org>
+MIME-Version: 1.0
+Content-Transfer-Encoding: 7bit
+Content-Type: text/plain; charset="utf-8"
+X-Mailer: MessagingEngine.com Webmail Interface - ajax-8e6aa83c
+References: <1491516747.3791700.936828232.69F82904@webmail.messagingengine.com>
+ <eebc06a4-5ab8-46a8-2f50-a472cb57a775@voskuil.org>
+Date: Fri, 07 Apr 2017 02:17:47 +0200
+In-Reply-To: <eebc06a4-5ab8-46a8-2f50-a472cb57a775@voskuil.org>
+X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
+ DKIM_VALID, RCVD_IN_DNSWL_LOW, URIBL_RHS_DOB autolearn=ham version=3.3.1
+X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
+ smtp1.linux-foundation.org
+X-Mailman-Approved-At: Fri, 07 Apr 2017 00:19:26 +0000
+Subject: Re: [bitcoin-dev] Using a storage engine without UTXO-index
+X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
+X-Mailman-Version: 2.1.12
+Precedence: list
+List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
+List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
+ <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
+List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
+List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
+List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
+List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
+ <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
+X-List-Received-Date: Fri, 07 Apr 2017 00:17:50 -0000
+
+Hi Eric,
+
+Thanks, but I get the impression that the similarity is rather
+superficial.
+
+To address your points:
+
+> (1) higher than necessary storage space requirement due to storing the
+> indexing data required for correlate the spends, and
+
+Hmm. No. Spends are simply scanned in the spend-tree (full tree,
+prunable, fully 5.6gb), or caught by the spend-index (bit index,
+non-prunable, fully 180mb). Neither impose significant storage
+requirements.
+
+> 2) higher than necessary validation complexity and cost in terms of
+> computing the spent-ness (including spender height) of an output.
+>
+> With the exception of de-linking (not deleted) in the case of reorgs, the
+> entire store is append only, implemented in a small set of memory
+> mapped file
+
+I guess this is the key difference. As the spend-tree stores the spend
+information in a tree structure, no reorgs are required, and the
+resulting code is actually much less complex.
+
+Bitcrust simply scans the tree. Although earlier designs used a
+skip-list, it turns out that accompanied by a spent-index lagging a few
+blocks behind, raw scanning is faster then anything even though it needs
+to scan ~5 blocks times ~4000 inputs before reaching the first
+spent-index, the actual scan is highly cache efficient and little more
+then a "REP SCASQ", reaching sub-microsecond per input on each core
+*including* the lookup in the spend index.
+
+ > I don't follow this part, maybe you could clarify. A spends index
+> grows with the size of the spend set (forever) as it cannot be pruned,
+> which certainly exceeds the size of the UTXO set (unless nothing is
+> spent). The advantage is that you don't have to keep rewriting the
+> store when you use a spends set (because the store can be append only).
+
+My point is, that the spend tree grows per *input* of a transaction
+instead of per *output* of a transaction, because this is what is
+scanned on order validation.
+
+The spend tree can be pruned because the spend index (~200mb) catches
+early spends.
+
+Disregarding the baseload script validation, the peak load order
+validation of bitcrust is more negatively effected by a transaction with
+many inputs than by a transaction of many outputs.
+
+I encourage you to check out the results at https://bitcrust.org
+
+Regards,
+Tomas
+
+On Fri, Apr 7, 2017, at 01:38, Eric Voskuil wrote:
+> -----BEGIN PGP SIGNED MESSAGE-----
+> Hash: SHA256
+>
+> On 04/06/2017 03:12 PM, Tomas via bitcoin-dev wrote:
+>
+> Hi Tomas,
+>
+> > I have been working on a bitcoin implementation that uses a
+> > different approach to indexing for verifying the order of
+> > transactions. Instead of using an index of unspent outputs, double
+> > spends are verified by using a spend-tree where spends are scanned
+> > against spent outputs instead of unspent outputs.
+>
+> This is the approach that genjix used in libbitcoin version2. With the
+> exception of de-linking (not deleted) in the case of reorgs, the
+> entire store is append only, implemented in a small set of memory
+> mapped files. The downsides to the approach are:
+>
+> (1) higher than necessary storage space requirement due to storing the
+> indexing data required for correlate the spends, and
+>
+> (2) higher than necessary validation complexity and cost in terms of
+> computing the spent-ness (including spender height) of an output.
+>
+> His implementation used a hash table, so performance-wise it did quite
+> well and would theoretically outperform a tree, O(1) vs. O(log2(N)).
+>
+> > This allows for much better concurrency, as not only blocks, but
+> > also individual inputs can be verified fully in parallel.
+>
+> I was successful in parallelizing input validation (across the inputs
+> of an unconfirmed tx and across the set of all inputs in a block)
+> using the v2 store. However, it is not the case that the spends
+> approach is necessary for concurrency.
+>
+> To resolve the above two problems the version3 store does not use a
+> spends table/index. Nor does it store any table of UTXOs. Yet
+> validation is highly parallelized. Instead of additional indexes it
+> uses the tx hash table, augmented with 32 bits per output for spender
+> height. So there is a O(1) cost of finding the tx and a O(N) cost of
+> finding the spender height where N is the number of outputs in the tx.
+> But because the number of outputs in a tx is bounded (by block size)
+> this is constant time in the number of transactions.
+>
+> This works out much faster than the spends table, and without the
+> storage cost or complexity disadvantages. It also scales with
+> available hardware, as the memory mapped files become in-memory hash
+> tables. For low memory machines we found it was important to implement
+> an opaque UTXO cache to limit paging, but for higher end systems zero
+> cache is optimal.
+>
+> > I am sharing this not only to ask for your feedback, but also to
+> > call for a clear separation of protocol and implementations: As
+> > this solution, reversing the costs of outputs and inputs, seems to
+> > have excellent performance characteristics (as shown in the test
+> > results), updates to the protocol addressing the UTXO growth, might
+> > not be worth considering *protocol improvements* and it might be
+> > best to address these concerns as implementation details.
+>
+> I don't follow this part, maybe you could clarify. A spends index
+> grows with the size of the spend set (forever) as it cannot be pruned,
+> which certainly exceeds the size of the UTXO set (unless nothing is
+> spent). The advantage is that you don't have to keep rewriting the
+> store when you use a spends set (because the store can be append only).
+>
+> Feel free to message me if you'd like to discuss in more detail, or to
+> continue on the libbitcoin mailing list (copied).
+>
+> e
+> -----BEGIN PGP SIGNATURE-----
+> Version: GnuPG v2.0.22 (GNU/Linux)
+>
+> iQEcBAEBCAAGBQJY5tFpAAoJEDzYwH8LXOFOcMgH/2mw5iOvUYNwvZ2z0KKTSUOA
+> Pd8d5mKoWvd94QxhQ+RyTbkEkMhHl75+zcBgRsfUTtZlBIe/Z0+OgVIN6ibEw+WD
+> w7k3HqgQi9gLgydEelxTAX+z3dJ24n4kCCdKAmZbBuK+Yr/7AViugbEqYemKepku
+> pRWZZS74MUvrYesc0xPn4Ao3DTzMjjY0K2mkuqV8jlwdfZjlAQX9pTx+iSCuMhkd
+> HJ8w7s8QnjVnUeOlLe29mZwaFJPyOTLJMqgDE6s2sXacAy5QQbVCatygvDQ8A/wC
+> ktBnKPFb2lGX3bGKu/KwABegBy/hyec+NP0wFR+0MVivCwTK1+SjeHu5MNOSVlM=
+> =tfVj
+> -----END PGP SIGNATURE-----
+