summaryrefslogtreecommitdiff
path: root/99/791a3788f8c43b55b74f8160671f7416eecfe7
blob: 36584b85b14ba97f3280d07b55cc2edcaa6f43c4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
Received: from sog-mx-1.v43.ch3.sourceforge.com ([172.29.43.191]
	helo=mx.sourceforge.net)
	by sfs-ml-3.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <tier.nolan@gmail.com>) id 1YsEqQ-0006ZQ-Hl
	for bitcoin-development@lists.sourceforge.net;
	Tue, 12 May 2015 18:23:54 +0000
Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of gmail.com
	designates 209.85.220.170 as permitted sender)
	client-ip=209.85.220.170; envelope-from=tier.nolan@gmail.com;
	helo=mail-qk0-f170.google.com; 
Received: from mail-qk0-f170.google.com ([209.85.220.170])
	by sog-mx-1.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128)
	(Exim 4.76) id 1YsEqP-0006hL-IA
	for bitcoin-development@lists.sourceforge.net;
	Tue, 12 May 2015 18:23:54 +0000
Received: by qkgy4 with SMTP id y4so11829430qkg.2
	for <bitcoin-development@lists.sourceforge.net>;
	Tue, 12 May 2015 11:23:48 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.55.15.15 with SMTP id z15mr3586886qkg.21.1431455028145; Tue,
	12 May 2015 11:23:48 -0700 (PDT)
Received: by 10.140.85.241 with HTTP; Tue, 12 May 2015 11:23:48 -0700 (PDT)
In-Reply-To: <20150512171640.GA32606@savin.petertodd.org>
References: <CANJO25J1WRHtfQLVXUB2s_sjj39pTPWmixAcXNJ3t-5os8RPmQ@mail.gmail.com>
	<CANJO25JTtfmfsOQYOzJeksJn3CoKE3W8iLGsRko-_xd4XhB3ZA@mail.gmail.com>
	<CAJHLa0O5OxaX5g3u=dnCY6Lz_gK3QZgQEPNcWNVRD4JziwAmvg@mail.gmail.com>
	<20150512171640.GA32606@savin.petertodd.org>
Date: Tue, 12 May 2015 19:23:48 +0100
Message-ID: <CAE-z3OV3VdSoiTSfASwYHr1CjZSqio303sqGq_1Y9yaYgov2sw@mail.gmail.com>
From: Tier Nolan <tier.nolan@gmail.com>
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Content-Type: multipart/alternative; boundary=001a11475dda772c460515e6980e
X-Spam-Score: 2.5 (++)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
	sender-domain
	0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	(tier.nolan[at]gmail.com)
	-0.0 SPF_PASS               SPF: sender matches SPF record
	1.2 MISSING_HEADERS        Missing To: header
	1.0 HTML_MESSAGE           BODY: HTML included in message
	-0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from
	author's domain
	0.1 DKIM_SIGNED            Message has a DKIM or DK signature,
	not necessarily valid
	-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
	1.9 MALFORMED_FREEMAIL Bad headers on message from free email service
X-Headers-End: 1YsEqP-0006hL-IA
Subject: Re: [Bitcoin-development] Proposed additional options for pruned
	nodes
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Tue, 12 May 2015 18:23:54 -0000

--001a11475dda772c460515e6980e
Content-Type: text/plain; charset=UTF-8

On Tue, May 12, 2015 at 6:16 PM, Peter Todd <pete@petertodd.org> wrote:

>
> Lots of people are tossing around ideas for partial archival nodes that
> would store a subset of blocks, such that collectively the whole
> blockchain would be available even if no one node had the entire chain.
>

A compact way to describe which blocks are stored helps to mitigate against
fingerprint attacks.

It also means that a node could compactly indicate which blocks it stores
with service bits.

The node could pick two numbers

W = window = a power of 2
P = position = random value less than W

The node would store all blocks with a height of P mod W.  The block hash
could be used too.

This has the nice feature that the node can throw away half of its data and
still represent what is stored.

W_new = W * 2
P_new = (random_bool()) ? P + W/2 : P;

Half of the stored blocks would match P_new mod W_new and the other half
could be deleted.  This means that the store would use up between 50% and
100% of the allocated size.

Another benefit is that it increases the probability that at least someone
has every block.

If N nodes each store 1% of the blocks, then the odds of a block being
stored is pow(0.99, N).  For 1000 nodes, that gives odds of 1 in 23,164
that a block will be missing.  That means that around 13 out of 300,000
blocks would be missing.  There would likely be more nodes than that, and
also storage nodes, so it is not a major risk.

If everyone is storing 1% of blocks, then they would set W to 128.  As long
as all of the 128 buckets is covered by some nodes, then all blocks are
stored.  With 1000 nodes, that gives odds of 0.6% that at least one bucket
will be missed.  That is better than around 13 blocks being missing.

Nodes could inform peers of their W and P parameters on connection.  The
version message could be amended or a "getparams" message of some kind
could be added.

W could be encoded with 4 bits and P could be encoded with 16 bits, for 20
in total.  W = 1 << bits[19:16] and P = bits[14:0].  That gives a maximum W
of 32768, which is likely to many bits for P.

Initial download would be harder, since new nodes would have to connect to
at least 100 different nodes.  They could download from random nodes, and
just download the ones they are missing from storage nodes.  Even storage
nodes could have a range of W values.

--001a11475dda772c460515e6980e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div class=3D"gmail_extra"><div class=3D"gmail_quote">On T=
ue, May 12, 2015 at 6:16 PM, Peter Todd <span dir=3D"ltr">&lt;<a href=3D"ma=
ilto:pete@petertodd.org" target=3D"_blank">pete@petertodd.org</a>&gt;</span=
> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0=
.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class=
=3D""><br>
</span>Lots of people are tossing around ideas for partial archival nodes t=
hat<br>
would store a subset of blocks, such that collectively the whole<br>
blockchain would be available even if no one node had the entire chain.<br>=
</blockquote><div><br></div><div>A compact way to describe which blocks are=
 stored helps to mitigate against fingerprint attacks.<br><br></div></div>I=
t also means that a node could compactly indicate which blocks it stores wi=
th service bits.<br><br></div><div class=3D"gmail_extra">The node could pic=
k two numbers<br><br></div><div class=3D"gmail_extra">W =3D window =3D a po=
wer of 2<br></div><div class=3D"gmail_extra">P =3D position =3D random valu=
e less than W<br><br></div><div class=3D"gmail_extra">The node would store =
all blocks with a height of P mod W.=C2=A0 The block hash could be used too=
.<br><br></div><div class=3D"gmail_extra">This has the nice feature that th=
e node can throw away half of its data and still represent what is stored.<=
br><br></div><div class=3D"gmail_extra">W_new =3D W * 2<br></div><div class=
=3D"gmail_extra">P_new =3D (random_bool()) ? P + W/2 : P;<br><br></div><div=
 class=3D"gmail_extra">Half of the stored blocks would match P_new mod W_ne=
w and the other half could be deleted.=C2=A0 This means that the store woul=
d use up between 50% and 100% of the allocated size.<br><br></div><div clas=
s=3D"gmail_extra">Another benefit is that it increases the probability that=
 at least someone has every block.<br><br></div><div class=3D"gmail_extra">=
If N nodes each store 1% of the blocks, then the odds of a block being stor=
ed is pow(0.99, N).=C2=A0 For 1000 nodes, that gives odds of 1 in 23,164 th=
at a block will be missing.=C2=A0 That means that around 13 out of 300,000 =
blocks would be missing.=C2=A0 There would likely be more nodes than that, =
and also storage nodes, so it is not a major risk.<br><br></div><div class=
=3D"gmail_extra">If everyone is storing 1% of blocks, then they would set W=
 to 128.=C2=A0 As long as all of the 128 buckets is covered by some nodes, =
then all blocks are stored.=C2=A0 With 1000 nodes, that gives odds of 0.6% =
that at least one bucket will be missed.=C2=A0 That is better than around 1=
3 blocks being missing.<br><br></div><div class=3D"gmail_extra">Nodes could=
 inform peers of their W and P parameters on connection.=C2=A0 The version =
message could be amended or a &quot;getparams&quot; message of some kind co=
uld be added.<br><br></div><div class=3D"gmail_extra">W could be encoded wi=
th 4 bits and P could be encoded with 16 bits, for 20 in total.=C2=A0 W =3D=
 1 &lt;&lt; bits[19:16] and P =3D bits[14:0].=C2=A0 That gives a maximum W =
of 32768, which is likely to many bits for P.<br><br></div><div class=3D"gm=
ail_extra">Initial download would be harder, since new nodes would have to =
connect to at least 100 different nodes.=C2=A0 They could download from ran=
dom nodes, and just download the ones they are missing from storage nodes.=
=C2=A0 Even storage nodes could have a range of W values.<br></div></div>

--001a11475dda772c460515e6980e--