summaryrefslogtreecommitdiff
path: root/ea/288ad81668f83c885497b2c7dc39649c0d001b
blob: 7d598475c0194371f45c6f3148206dde80483b5a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
Return-Path: <tomas@tomasvdw.nl>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 71FDC6C
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  7 Apr 2017 00:17:50 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com
	[66.111.4.25])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 6C07426A
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  7 Apr 2017 00:17:49 +0000 (UTC)
Received: from compute2.internal (compute2.nyi.internal [10.202.2.42])
	by mailout.nyi.internal (Postfix) with ESMTP id B39C520931;
	Thu,  6 Apr 2017 20:17:47 -0400 (EDT)
Received: from web3 ([10.202.2.213])
	by compute2.internal (MEProxy); Thu, 06 Apr 2017 20:17:47 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=content-transfer-encoding:content-type
	:date:from:in-reply-to:message-id:mime-version:references
	:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; bh=u9fTn6
	KB0kgxcxJj8xQR2TUydIzJm89lLSn8o+/+t0g=; b=jbC1H4Pai5IriGs/sNhC6q
	ivxpzVTxmTo79KKG5bIArlQxa8YlSEqte0IqAkC8OedAWv0j7XG3lDQikqM9E1ZA
	H806SusVkJp/Fpk3NA2RN7HuGTv2vETDHP0GFVf+AHaN15v8UmO5ShURko5aiyKs
	K8+ZZC8eJ2yezy2m5ILUiIaF07QB2c4Bld9Xvm4SzIFxpI5Rkf8ICWWJvrqXIoCy
	MLkPhMXDsqnPO2X0uv+6MOsQHHPOVbz/AfaM976nTGWR5UY18IRuA/9M6FJIUe5X
	LKUG5hGZHsLjUAUe+tYRqZhpXpVjgRqbXUn5oA9Q2D/4BjWrUW2Fmj5Hs0Ftg1DQ
	==
X-ME-Sender: <xms:q9rmWObVbyEO9T3VeXOG1a7g0ErL8qxg2I6QnBJhdZqXHiCwwWZe1Q>
Received: by mailuser.nyi.internal (Postfix, from userid 99)
	id 868AD9EC32; Thu,  6 Apr 2017 20:17:47 -0400 (EDT)
Message-Id: <1491524267.715789.936916864.1156D8CB@webmail.messagingengine.com>
From: Tomas <tomas@tomasvdw.nl>
To: Eric Voskuil <eric@voskuil.org>,
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>,
	Libbitcoin Development <libbitcoin@lists.dyne.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
X-Mailer: MessagingEngine.com Webmail Interface - ajax-8e6aa83c
References: <1491516747.3791700.936828232.69F82904@webmail.messagingengine.com>
	<eebc06a4-5ab8-46a8-2f50-a472cb57a775@voskuil.org>
Date: Fri, 07 Apr 2017 02:17:47 +0200
In-Reply-To: <eebc06a4-5ab8-46a8-2f50-a472cb57a775@voskuil.org>
X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, RCVD_IN_DNSWL_LOW, URIBL_RHS_DOB autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Fri, 07 Apr 2017 00:19:26 +0000
Subject: Re: [bitcoin-dev] Using a storage engine without UTXO-index
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 07 Apr 2017 00:17:50 -0000

Hi Eric,

Thanks, but I get the impression that the similarity is rather
superficial.  

To address your points:

> (1) higher than necessary storage space requirement due to storing the
> indexing data required for correlate the spends, and

Hmm. No. Spends are simply scanned in the spend-tree (full tree,
prunable, fully 5.6gb), or caught by the spend-index (bit index,
non-prunable, fully 180mb). Neither impose significant storage
requirements.

> 2) higher than necessary validation complexity and cost in terms of
> computing the spent-ness (including spender height) of an output.
>
> With the exception of de-linking (not deleted) in the case of reorgs, the
> entire store is append only, implemented in a small set of memory
> mapped file

I guess this is the key difference. As the spend-tree stores the spend
information in a tree structure, no reorgs are required, and the
resulting code is actually much less complex.

Bitcrust simply scans the tree. Although earlier designs used a
skip-list, it turns out that accompanied by a spent-index lagging a few
blocks behind, raw scanning is faster then anything even though it needs
to scan ~5 blocks times ~4000 inputs before reaching the first
spent-index,  the actual scan is highly cache efficient and little more
then a "REP SCASQ", reaching sub-microsecond per input on each core
*including* the lookup in the spend index.

 > I don't follow this part, maybe you could clarify. A spends index
> grows with the size of the spend set (forever) as it cannot be pruned,
> which certainly exceeds the size of the UTXO set (unless nothing is
> spent). The advantage is that you don't have to keep rewriting the
> store when you use a spends set (because the store can be append only).

My point is, that the spend tree grows per *input* of a transaction
instead of per *output* of a transaction, because this is what is
scanned on order validation.

The spend tree can be pruned because the spend index (~200mb) catches
early spends.

Disregarding the baseload script validation, the peak load order
validation of bitcrust is more negatively effected by a transaction with
many inputs than by a transaction of many outputs.

I encourage you to check out the results at https://bitcrust.org

Regards,
Tomas

On Fri, Apr 7, 2017, at 01:38, Eric Voskuil wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> On 04/06/2017 03:12 PM, Tomas via bitcoin-dev wrote:
> 
> Hi Tomas,
> 
> > I have been working on a bitcoin implementation that uses a
> > different approach to indexing for verifying the order of
> > transactions. Instead of using an index of unspent outputs, double
> > spends are verified by using a spend-tree where spends are scanned
> > against spent outputs instead of unspent outputs.
> 
> This is the approach that genjix used in libbitcoin version2. With the
> exception of de-linking (not deleted) in the case of reorgs, the
> entire store is append only, implemented in a small set of memory
> mapped files. The downsides to the approach are:
> 
> (1) higher than necessary storage space requirement due to storing the
> indexing data required for correlate the spends, and
> 
> (2) higher than necessary validation complexity and cost in terms of
> computing the spent-ness (including spender height) of an output.
> 
> His implementation used a hash table, so performance-wise it did quite
> well and would theoretically outperform a tree, O(1) vs. O(log2(N)).
> 
> > This allows for much better concurrency, as not only blocks, but
> > also individual inputs can be verified fully in parallel.
> 
> I was successful in parallelizing input validation (across the inputs
> of an unconfirmed tx and across the set of all inputs in a block)
> using the v2 store. However, it is not the case that the spends
> approach is necessary for concurrency.
> 
> To resolve the above two problems the version3 store does not use a
> spends table/index. Nor does it store any table of UTXOs. Yet
> validation is highly parallelized. Instead of additional indexes it
> uses the tx hash table, augmented with 32 bits per output for spender
> height. So there is a O(1) cost of finding the tx and a O(N) cost of
> finding the spender height where N is the number of outputs in the tx.
> But because the number of outputs in a tx is bounded (by block size)
> this is constant time in the number of transactions.
> 
> This works out much faster than the spends table, and without the
> storage cost or complexity disadvantages. It also scales with
> available hardware, as the memory mapped files become in-memory hash
> tables. For low memory machines we found it was important to implement
> an opaque UTXO cache to limit paging, but for higher end systems zero
> cache is optimal.
> 
> > I am sharing this not only to ask for your feedback, but also to
> > call for a clear separation of protocol and implementations: As
> > this solution, reversing the costs of outputs and inputs, seems to
> > have excellent performance characteristics (as shown in the test
> > results), updates to the protocol addressing the UTXO growth, might
> > not be worth considering *protocol improvements* and it might be
> > best to address these concerns as implementation details.
> 
> I don't follow this part, maybe you could clarify. A spends index
> grows with the size of the spend set (forever) as it cannot be pruned,
> which certainly exceeds the size of the UTXO set (unless nothing is
> spent). The advantage is that you don't have to keep rewriting the
> store when you use a spends set (because the store can be append only).
> 
> Feel free to message me if you'd like to discuss in more detail, or to
> continue on the libbitcoin mailing list (copied).
> 
> e
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.22 (GNU/Linux)
> 
> iQEcBAEBCAAGBQJY5tFpAAoJEDzYwH8LXOFOcMgH/2mw5iOvUYNwvZ2z0KKTSUOA
> Pd8d5mKoWvd94QxhQ+RyTbkEkMhHl75+zcBgRsfUTtZlBIe/Z0+OgVIN6ibEw+WD
> w7k3HqgQi9gLgydEelxTAX+z3dJ24n4kCCdKAmZbBuK+Yr/7AViugbEqYemKepku
> pRWZZS74MUvrYesc0xPn4Ao3DTzMjjY0K2mkuqV8jlwdfZjlAQX9pTx+iSCuMhkd
> HJ8w7s8QnjVnUeOlLe29mZwaFJPyOTLJMqgDE6s2sXacAy5QQbVCatygvDQ8A/wC
> ktBnKPFb2lGX3bGKu/KwABegBy/hyec+NP0wFR+0MVivCwTK1+SjeHu5MNOSVlM=
> =tfVj
> -----END PGP SIGNATURE-----