summaryrefslogtreecommitdiff
path: root/8d/3c0a4ad3f46ec9b35dcefc3c60037fe8c0f1c3
blob: 06879f4ff3d3768e590afb9253cb9935299bf712 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
Return-Path: <david.vorick@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id E306A723
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon, 17 Apr 2017 06:54:52 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-wr0-f170.google.com (mail-wr0-f170.google.com
	[209.85.128.170])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 74041A7
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon, 17 Apr 2017 06:54:51 +0000 (UTC)
Received: by mail-wr0-f170.google.com with SMTP id l28so78622288wre.0
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Sun, 16 Apr 2017 23:54:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:from:date:message-id:subject:to;
	bh=4Hwk5+o3gJDKBBWlKbYuNkKXlxNYtA65H3u8VfTVad4=;
	b=El4xbPn08z+JC6szX1J3E6DoggQ8Wo66fD/7mVqc5rhEb0E7htt9tJVVrHM4bIQD1d
	Ebq9QwcyZzCe7iKMRJIsaNp01r+WW5nEQeBY+VjSTq/8JfIgaSFegVKZ9mebdqxeSN4U
	wXlt4SVJ81kHyfjqmJRTpENzofBFbCArci7dmMwUw4CscPvKaS9Nxrfy+cipiy6iXXfd
	G/skYRqj7/A8polpmmBG+GN6WsDdJ7wdFKr0yBmWBt1Byumada3+sqoM2chOmKu6tNsE
	5d39Fbu6usQKmlhu6Q3Z8Ok9UXS6kvzs1HcY9cvqPLF5vNhP5WDc9Mu6TaNdFIu5TtSp
	CTqw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:from:date:message-id:subject:to;
	bh=4Hwk5+o3gJDKBBWlKbYuNkKXlxNYtA65H3u8VfTVad4=;
	b=Ej0FHUzOPMbgecOCc970Aj7G3L17NCad4olXov5OdS0P0GMQzXCfJRCPCYGcsvxvLT
	VPHsurki4KhF5HHcrWXL/4MFzXyl6BQMbTQhSbkWs6FpN1VvysOdbZvEr4UHLxTKvepK
	dEOucOQVvGGeIa1spPFCaLQa7v3vmPDtu/EsW33Nni6Bu0xU/6kT36urz/pniRqgvlMR
	YzOvYEXi+Y7NJ3AW4bF9s4s+3aSiVm/hmcoeUMc1i7/X1X845ilYEfN0tbCoGB7C9eo6
	LNnjjxe0zUBtxy5CNMZ+yUYsnnuUg42wsJmb8oYgGJV7jciI/9esN86btRxgkY78ZQU7
	DIQA==
X-Gm-Message-State: AN3rC/5e+MPpfcguMLjpe7gHbcrIMj/5HkIA1pNvROSqz1ydE+mLkPxZ
	Dxd6p280sBKQXQo6x3F4OTv8wnDOBA==
X-Received: by 10.223.164.6 with SMTP id d6mr15646014wra.144.1492412089694;
	Sun, 16 Apr 2017 23:54:49 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.45.19 with HTTP; Sun, 16 Apr 2017 23:54:49 -0700 (PDT)
From: David Vorick <david.vorick@gmail.com>
Date: Mon, 17 Apr 2017 02:54:49 -0400
Message-ID: <CAFVRnypbQQ-vsSLqv48cYaqTCty4R1DmFRqfAvxe4mAqyQNXxQ@mail.gmail.com>
To: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: multipart/alternative; boundary=f403045f138877004e054d574459
X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE,
	RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Subject: [bitcoin-dev] Small Nodes: A Better Alternative to Pruned Nodes
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Apr 2017 06:54:53 -0000

--f403045f138877004e054d574459
Content-Type: text/plain; charset=UTF-8

*Rationale:*

A node that stores the full blockchain (I will use the term archival node)
requires over 100GB of disk space, which I believe is one of the most
significant barriers to more people running full nodes. And I believe the
ecosystem would benefit substantially if more users were running full nodes.

The best alternative today to storing the full blockchain is to run a
pruned node, which keeps only the UTXO set and throws away already verified
blocks. The operator of the pruned node is able to enjoy the full security
benefits of a full node, but is essentially leeching the network, as they
performed a large download likely without contributing anything back.

This puts more pressure on the archival nodes, as the archival nodes need
to pick up the slack and help new nodes bootstrap to the network. As the
pressure on archival nodes grows, fewer people will be able to actually run
archival nodes, and the situation will degrade. The situation would likely
become problematic quickly if bitcoin-core were to ship with the defaults
set to a pruned node.

Even further, the people most likely to care about saving 100GB of disk
space are also the people least likely to care about some extra bandwidth
usage. For datacenter nodes, and for nodes doing lots of bandwidth, the
bandwidth is usually the biggest cost of running the node. For home users
however, as long as they stay under their bandwidth cap, the bandwidth is
actually free. Ideally, new nodes would be able to bootstrap from nodes
that do not have to pay for their bandwidth, instead of needing to rely on
a decreasing percentage of heavy-duty archival nodes.

I have (perhaps incorrectly) identified disk space consumption as the most
significant factor in your average user choosing to run a pruned node or a
lite client instead of a full node. The average user is not typically too
worried about bandwidth, and is also not typically too worried about
initial blockchain download time. But the 100GB hit to your disk space can
be a huge psychological factor, especially if your hard drive only has
500GB available in the first place, and 250+ GB is already consumed by
other files you have.

I believe that improving the disk usage situation would greatly benefit
decentralization, especially if it could be done without putting pressure
on archival nodes.

*Small Nodes Proposal:*

I propose an alternative to the pruned node that does not put undue
pressure on archival nodes, and would be acceptable and non-risky to ship
as a default in bitcoin-core. For lack of a better name, I'll call this new
type of node a 'small node'. The intention is that bitcoin-core would
eventually ship 'small nodes' by default, such that the expected amount of
disk consumption drops from today's 100+ GB to less than 30 GB.

My alternative proposal has the following properties:

+ Full nodes only need to store ~20% of the blockchain
+ With very high probability, a new node will be able to recover the entire
blockchain by connecting to 6 random small node peers.
+ An attacker that can eliminate a chosen+ 95% of the full nodes running
today will be unable to prevent new nodes from downloading the full
blockchain, even if the attacker is also able to eliminate all archival
nodes. (assuming all nodes today were small nodes instead of archival nodes)

Method:

A small node will pick an index [5, 256). This index is that node's
permanent index. When storing a block, instead of storing the full block,
the node will use Reed-Solomon coding to erasure code the block using a
5-of-256 scheme. The result will be 256 pieces that are 20% of the size of
the block each. The node picks the piece that corresponds to its index, and
stores that instead. (Indexes 0-4 are reserved for archival nodes -
explained later)

The node is now storing a fragment of every block. Alone, this fragment
cannot be used to recover any piece of the blockchain. However, when paired
with any 5 unique fragments (fragments of the same index will not be
unique), the full block can be recovered.

Nodes can optionally store more than 1 fragment each. At 5 fragments, the
node becomes a full archival node, and the chosen indexes should be 0-4.
This is advantageous for the archival node as the encoded data for the
first 5 indexes will actually be identical to the block itself - there is
no computational overhead for selecting the first indexes. There is also no
need to choose random indexes, because the full block can be recovered no
matter which indexes are chosen.

When connecting to new peers, the indexes of each peer needs to be known.
Once peers totaling 5 unique indexes are discovered, blockchain download
can begin. Connecting to just 5 small node peers provides a >95% chance of
getting 5 uniques, with exponentially improving odds of success as you
connect to more peers. Connecting to a single archive node guarantees that
any gaps can be filled.

A good encoder should be able to turn a block into a 5-of-256 piece set in
under 10 milliseconds using a single core on a standard consumer desktop.
This should not slow down initial blockchain download substantially, though
the overhead is more than a rounding error.

*DoS Prevention:*

A malicious node may provide garbage data instead of the actual piece.
Given just the garbage data and 4 other correct pieces, it is impossible
(best I know anyway) to tell which piece is the garbage piece.

One option in this case would be to seek out an archival node that could
verify the correctness of the pieces, and identify the malicious node.

Another option would be to have the small nodes store a cryptographic
checksum of each piece. Obtaining the cryptographic checksum for all 256
pieces would incur a nontrivial amount of hashing (post segwit, as much as
100MB of extra hashing per block), and would require an additional ~4kb of
storage per block. The hashing overhead here may be prohibitive.

Another solution would be to find additional pieces and brute-force
combinations of 5 until a working combination was discovered. Though this
sounds nasty, it should take less than five seconds of computation to find
the working combination given 5 correct pieces and 2 incorrect pieces. This
computation only needs to be performed once to identify the malicious peers.

I also believe that alternative erasure coding schemes exist which actually
are able to identify the bad pieces given sufficient good pieces, however I
don't know if they have the same computational performance as the best
Reed-Solomon coding implementations.

*Deployment:*

Small nodes are completely useless unless the critical mass of 5 pieces can
be obtained. The first version that supports small node block downloads
should default everyone to an archival node (meaning indexes 0-4 are used)

Once there are enough small-node-enabled archive nodes, the default can be
switched so that nodes only have a single index by default. In the first
few days, when there are only a few small nodes, the previously-deployed
archival nodes can help fill in the gaps, and the small nodes can be useful
for blockchain download right away.

----------------------------------

This represents a non-trivial amount of code, but I believe that the result
would be a non-trivial increase in the percentage of users running full
nodes, and a healthier overall network.

--f403045f138877004e054d574459
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div><div><b>Rationale:</b><br></div><div><br>A node that =
stores the full blockchain (I will use the term archival node) requires ove=
r 100GB of disk space, which I believe is one of the most significant barri=
ers to more people running full nodes. And I believe the ecosystem would be=
nefit substantially if more users were running full nodes.<br><br></div>The=
 best alternative today to storing the full blockchain is to run a pruned n=
ode, which keeps only the UTXO set and throws away already verified blocks.=
 The operator of the pruned node is able to enjoy the full security benefit=
s of a full node, but is essentially leeching the network, as they performe=
d a large download likely without contributing anything back.<br><br></div>=
<div>This puts more pressure on the archival nodes, as the archival nodes n=
eed to pick up the slack and help new nodes bootstrap to the network. As th=
e pressure on archival nodes grows, fewer people will be able to actually r=
un archival nodes, and the situation will degrade. The situation would like=
ly become problematic quickly if bitcoin-core were to ship with the default=
s set to a pruned node.<br><br></div><div>Even further, the people most lik=
ely to care about saving 100GB of disk space are also the people least like=
ly to care about some extra bandwidth usage. For datacenter nodes, and for =
nodes doing lots of bandwidth, the bandwidth is usually the biggest cost of=
 running the node. For home users however, as long as they stay under their=
 bandwidth cap, the bandwidth is actually free. Ideally, new nodes would be=
 able to bootstrap from nodes that do not have to pay for their bandwidth, =
instead of needing to rely on a decreasing percentage of heavy-duty archiva=
l nodes.<br><br></div><div>I have (perhaps incorrectly) identified disk spa=
ce consumption as the most significant factor in your average user choosing=
 to run a pruned node or a lite client instead of a full node. The average =
user is not typically too worried about bandwidth, and is also not typicall=
y too worried about initial blockchain download time. But the 100GB hit to =
your disk space can be a huge psychological factor, especially if your hard=
 drive only has 500GB available in the first place, and 250+ GB is already =
consumed by other files you have.<br><br></div><div>I believe that improvin=
g the disk usage situation would greatly benefit decentralization, especial=
ly if it could be done without putting pressure on archival nodes.<br></div=
><div><br></div><div><b>Small Nodes Proposal:</b><br><br></div><div>I propo=
se an alternative to the pruned node that does not put undue pressure on ar=
chival nodes, and would be acceptable and non-risky to ship as a default in=
 bitcoin-core. For lack of a better name, I&#39;ll call this new type of no=
de a &#39;small node&#39;. The intention is that bitcoin-core would eventua=
lly ship &#39;small nodes&#39; by default, such that the expected amount of=
 disk consumption drops from today&#39;s 100+ GB to less than 30 GB.<br><br=
></div><div>My alternative proposal has the following properties:<br><br></=
div><div>+ Full nodes only need to store ~20% of the blockchain<br></div><d=
iv>+ With very high probability, a new node will be able to recover the ent=
ire blockchain by connecting to 6 random small node peers.<br></div><div>+ =
An attacker that can eliminate a chosen+ 95% of the full nodes running toda=
y will be unable to prevent new nodes from downloading the full blockchain,=
 even if the attacker is also able to eliminate all archival nodes. (assumi=
ng all nodes today were small nodes instead of archival nodes)<br><br></div=
><div>Method:<br><br></div><div>A small node will pick an index [5, 256). T=
his index is that node&#39;s permanent index. When storing a block, instead=
 of storing the full block, the node will use Reed-Solomon coding to erasur=
e code the block using a 5-of-256 scheme. The result will be 256 pieces tha=
t are 20% of the size of the block each. The node picks the piece that corr=
esponds to its index, and stores that instead. (Indexes 0-4 are reserved fo=
r archival nodes - explained later)<br><br></div><div>The node is now stori=
ng a fragment of every block. Alone, this fragment cannot be used to recove=
r any piece of the blockchain. However, when paired with any 5 unique fragm=
ents (fragments of the same index will not be unique), the full block can b=
e recovered.<br><br></div><div>Nodes can optionally store more than 1 fragm=
ent each. At 5 fragments, the node becomes a full archival node, and the ch=
osen indexes should be 0-4. This is advantageous for the archival node as t=
he encoded data for the first 5 indexes will actually be identical to the b=
lock itself - there is no computational overhead for selecting the first in=
dexes. There is also no need to choose random indexes, because the full blo=
ck can be recovered no matter which indexes are chosen.<br><br></div><div>W=
hen connecting to new peers, the indexes of each peer needs to be known. On=
ce peers totaling 5 unique indexes are discovered, blockchain download can =
begin. Connecting to just 5 small node peers provides a &gt;95% chance of g=
etting 5 uniques, with exponentially improving odds of success as you conne=
ct to more peers. Connecting to a single archive node guarantees that any g=
aps can be filled.<br><br></div><div>A good encoder should be able to turn =
a block into a 5-of-256 piece set in under 10 milliseconds using a single c=
ore on a standard consumer desktop. This should not slow down initial block=
chain download substantially, though the overhead is more than a rounding e=
rror.<br></div><div><br></div><div><b>DoS Prevention:</b><br><br></div><div=
>A malicious node may provide garbage data instead of the actual piece. Giv=
en just the garbage data and 4 other correct pieces, it is impossible (best=
 I know anyway) to tell which piece is the garbage piece.<br><br></div><div=
>One option in this case would be to seek out an archival node that could v=
erify the correctness of the pieces, and identify the malicious node.<br><b=
r></div><div>Another option would be to have the small nodes store a crypto=
graphic checksum of each piece. Obtaining the cryptographic checksum for al=
l 256 pieces would incur a nontrivial amount of hashing (post segwit, as mu=
ch as 100MB of extra hashing per block), and would require an additional ~4=
kb of storage per block. The hashing overhead here may be prohibitive.<br><=
br></div><div>Another solution would be to find additional pieces and brute=
-force combinations of 5 until a working combination was discovered. Though=
 this sounds nasty, it should take less than five seconds of computation to=
 find the working combination given 5 correct pieces and 2 incorrect pieces=
. This computation only needs to be performed once to identify the maliciou=
s peers.<br><br></div><div>I also believe that alternative erasure coding s=
chemes exist which actually are able to identify the bad pieces given suffi=
cient good pieces, however I don&#39;t know if they have the same computati=
onal performance as the best Reed-Solomon coding implementations.<br></div>=
<br><div><b>Deployment:</b><br><br></div><div>Small nodes are completely us=
eless unless the critical mass of 5 pieces can be obtained. The first versi=
on that supports small node block downloads should default everyone to an a=
rchival node (meaning indexes 0-4 are used)<br><br></div><div>Once there ar=
e enough small-node-enabled archive nodes, the default can be switched so t=
hat nodes only have a single index by default. In the first few days, when =
there are only a few small nodes, the previously-deployed archival nodes ca=
n help fill in the gaps, and the small nodes can be useful for blockchain d=
ownload right away.<br><br>----------------------------------<br><br></div>=
<div>This represents a non-trivial amount of code, but I believe that the r=
esult would be a non-trivial increase in the percentage of users running fu=
ll nodes, and a healthier overall network.<br></div></div>

--f403045f138877004e054d574459--