path: root/17/372f5f81f719224a3f1684b71d2d3ae7971b49
blob: 5bd43f518b87008ba9bf8fd35c2288e0431a3515 (plain)
Return-Path: <conner@lightning.engineering>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 8E366D00
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 24 May 2018 01:04:47 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-ua0-f173.google.com (mail-ua0-f173.google.com
	[209.85.217.173])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 8BD59180
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 24 May 2018 01:04:46 +0000 (UTC)
Received: by mail-ua0-f173.google.com with SMTP id b25-v6so16067257uak.3
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Wed, 23 May 2018 18:04:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=lightning-engineering.20150623.gappssmtp.com; s=20150623;
	h=mime-version:references:in-reply-to:from:date:message-id:subject:to
	:cc; bh=uR1/NR6x6swPHU1N1JiAjUctDbWuf+3vn+WVNocYG9w=;
	b=Me70NCjpLsLNvZJAsy4NxgVgMp3WGuIXpuZ4E5io2Yd5QitXcXcFtT4BzOQzPSilXD
	MCWX98KCQv6AAvhC1d4n+Nq4EjQNPjkbrhGM/DV5TTnaXXVuVC12oWooWDTZqqlQ6bPi
	8oPt4y73M+7iT5m7LQ39J3K8rK9z44VurQ3jT5Aw6U6PHw71+IiT2XdEr8seGIbf2oxh
	NuaAKEVmb5GjdexfXDcJc2BSgV3H7RuXQtTVOys27ce2RJnXusLBYPN+D6cS1gL2T/kn
	LAc6sghx0VIMAjMAddoDF1uIbvgSKEU4y8hevVQQiiVWmFRNj4s909vkrCKvxX6rNkuJ
	izow==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:references:in-reply-to:from:date
	:message-id:subject:to:cc;
	bh=uR1/NR6x6swPHU1N1JiAjUctDbWuf+3vn+WVNocYG9w=;
	b=FbNoxA9j6UJtGzOzPWqKqz+wDZLKcjqfeNuDDjOQae8e63HGSDgpRWmrSontCViiKQ
	9ACXXXNpV8bPqC6gQ0RYFEhjtoY1tvF1hp6uLEEc1DM9L6/75XBYMDjaj6qPCqq7xY5+
	TSzs7sJd8F5IUKhzsGh0WnzL7j5DXpv7zWUsRGMCg3/L/pSJqCscKyg3f4k1stZOmjzc
	tjBuVUI9usclGAubmPitQ9SqB121A7C4Sbk5/JmaBDLelv7rR8n8ADCbua9lMmej+xQe
	6klNTywGFi6HWSpyP3KHflY9zFycjWEJKeUvTpPZEqZiMJSeYkcUCQnswGNMKY0qGtqH
	CqSQ==
X-Gm-Message-State: ALKqPwfR+CdSWCCMdVnzwDgHXwaAj9mqFSFE2U6gjhAoJdbn1TYlxW04
	/J+Fe4v/bgwJXMiUPVp1I8dT6YnTCjRnepzlMmO+eg==
X-Google-Smtp-Source: AB8JxZrCxE70GvSZBqriJj3q0M+CHwG20R2Vh8YysEAOqQapBJW6YEH6I6zhludmHuF+KmRDnfESf6tSiEvciE66Mog=
X-Received: by 2002:ab0:30d5:: with SMTP id
	c21-v6mr3566808uam.69.1527123885604; 
	Wed, 23 May 2018 18:04:45 -0700 (PDT)
MIME-Version: 1.0
References: <d43c6082-1b2c-c95b-5144-99ad0021ea6c@mattcorallo.com>
	<CAAS2fgRF-MhOvpFY6c_qAPzNMo3GQ28RExdSbOV6Q6Oy2iWn1A@mail.gmail.com>
	<22d375c7-a032-8691-98dc-0e6ee87a4b08@mattcorallo.com>
	<CAAS2fgR3QRHeHEjjOS1ckEkL-h7=Na56G12hYW9Bmy9WEMduvg@mail.gmail.com>
	<CADZtCShLmH_k-UssNWahUNHgHvWQQ1y638LwaOfnJEipwjbiYg@mail.gmail.com>
	<CAAS2fgQLCN_cuZ-3QPjCLfYOtHfEk=SenTn5=y9LfGzJxLPR3Q@mail.gmail.com>
	<CADZtCSjYr6VMBVQ=rx44SgRWcFSXhVXUZJB=rHMh4X78Z2eY1A@mail.gmail.com>
	<CAO3Pvs9K3n=OzVQ06XGQvzNC+Aqp9S60kWM9VRPA8hWTJ3u9BQ@mail.gmail.com>
	<c23a5346-9f99-44f0-abbf-d7e7979bf1d8@gmail.com>
	<CAO3Pvs_MA4TtgCCu1NgCBjK2bZRN+rKnGQJN6m4yTrViBXRiPA@mail.gmail.com>
	<CAD3i26BibcaMdbQv-j+Egz_1y0GuhzepBp5ATNpj=Qv8hi1TVA@mail.gmail.com>
	<CADZtCShAYpbN=4qNoX5c8yd1j08+mEZzG8gZwcHrj2suY0mb9w@mail.gmail.com>
	<CADZtCShYnM3A949H18V2+BArA-K9J+cDkd=rX8xRn0+0js5CwA@mail.gmail.com>
	<CAAS2fgTXS5Tains7dfe_Rc9JxR6M=NuFW9UtieRELm+6N2uNog@mail.gmail.com>
In-Reply-To: <CAAS2fgTXS5Tains7dfe_Rc9JxR6M=NuFW9UtieRELm+6N2uNog@mail.gmail.com>
From: Conner Fromknecht <conner@lightning.engineering>
Date: Wed, 23 May 2018 18:04:34 -0700
Message-ID: <CAFfwr8F+ghYb2HYEgC7Lh7Z-ytNE7EABr6cxiVXYhWLk-TPO7A@mail.gmail.com>
To: Gregory Maxwell <greg@xiph.org>, 
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: multipart/alternative; boundary="000000000000bad74f056ce93cbc"
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, HTML_MESSAGE, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Thu, 24 May 2018 01:12:46 +0000
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Thu, 24 May 2018 01:04:47 -0000

--000000000000bad74f056ce93cbc
Content-Type: text/plain; charset="UTF-8"

Hi all,

Jimpo, thanks for looking into those stats! I had always imagined that there
would be more significant savings from having all filters in one bundle, as
opposed to separate ones. These results are interesting, to say the least, and
definitely offer us some flexibility in options for filter sharding.
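
As a rough illustration of why the split and combined sizes can end up so
close, here is a minimal sketch that counts the bits a Golomb-Rice coded set
would need. It is not the BIP 158 reference code: the parameter P = 20, the
use of SHA-256 in place of the keyed SipHash, and the item names are all
assumptions made only to keep the example self-contained.

```python
import hashlib

P = 20  # assumed Golomb-Rice parameter; false-positive rate is roughly 2**-P


def hash_to_range(item, f):
    # Stand-in for the proposal's keyed SipHash: SHA-256 is used here only
    # so the sketch has no external dependencies.
    h = int.from_bytes(hashlib.sha256(item).digest()[:8], "big")
    return (h * f) >> 64  # map the 64-bit hash uniformly into [0, f)


def gcs_size_bits(items, p=P):
    """Bits needed to Golomb-Rice code the sorted hash deltas of `items`."""
    n = len(items)
    if n == 0:
        return 0
    f = n << p  # hash range is N * 2**P
    values = sorted(hash_to_range(i, f) for i in items)
    bits = 0
    prev = 0
    for v in values:
        delta = v - prev
        # Unary quotient (q ones plus a terminating zero) and a p-bit remainder.
        bits += (delta >> p) + 1 + p
        prev = v
    return bits


# Hypothetical element sets standing in for one block's filter contents.
outputs = {b"output-script-%d" % i for i in range(2000)}
inputs = {b"input-outpoint-%d" % i for i in range(2000)}
txids = {b"txid-%d" % i for i in range(1000)}

split_bits = sum(gcs_size_bits(s) for s in (outputs, inputs, txids))
combined_bits = gcs_size_bits(outputs | inputs | txids)
ratio = split_bits / combined_bits
print(f"split/combined = {ratio:.3f}")
```

In this sketch each element costs between P + 1 and P + 2 bits regardless of
how many elements share the filter, so the ratio stays close to 1. Fixed
per-filter overhead, which the sketch ignores, would matter more for very
small blocks.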

So far, the bulk of this discussion has centered around bandwidth. I am
concerned, however, that splitting up the filters is at odds with the other
goal of the proposal in offering improved privacy.

Allowing clients to choose individual filter sets trivially exposes the type
of data that client is interested in. This alone might be enough to
fingerprint the function of a peer and shrink the anonymity set that its
behavior could otherwise hide in.

Furthermore, if a match is encountered and a block is requested, full nodes
have more targeted insight into what caused that particular match. For
example, if a client only requests output-script filters, a node could infer
that the client received funds in a particular block.
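
To make the anonymity-set point concrete, here is a toy sketch of what a
serving node could tabulate from filter requests alone. The peer names and
filter-type labels are hypothetical, and this is only an illustration of the
grouping effect, not a description of any existing node behavior.

```python
from collections import defaultdict


def anonymity_sets(requests):
    """Group peers by the exact set of filter types they asked for."""
    groups = defaultdict(set)
    for peer, filter_types in requests.items():
        groups[frozenset(filter_types)].add(peer)
    return dict(groups)


# Hypothetical peers: with one combined filter, every request looks the same.
combined = {"peer_a": ["basic"], "peer_b": ["basic"], "peer_c": ["basic"]}

# With split filters, each peer's choice hints at what it is doing.
split = {
    "peer_a": ["output_scripts"],
    "peer_b": ["output_scripts", "input_outpoints"],
    "peer_c": ["txids"],
}

combined_sets = anonymity_sets(combined)
split_sets = anonymity_sets(split)
print(len(combined_sets), len(split_sets))
```

With the combined filter all three peers fall into a single group; with split
filters each lands in its own group, so the request pattern alone tells them
apart.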

This is above and beyond the additional complexity of now syncing,
validating, and managing five or six distinct
header/filter-header/filter/block chains.

I agree that saving on bandwidth is an important goal, but bandwidth and
privacy are seemingly always at odds. Strictly comparing the bandwidth
requirements of a system that heavily weighs privacy against existing ones
that don't, e.g. BIP37, is a losing battle IMO.

I'm not fundamentally opposed to splitting the filters; I certainly see the
arguments for flexibility. However, I also want to ensure we are considering
the second-order effects that fall out of optimizing for one metric when
others exist.

Cheers,
Conner
On Wed, May 23, 2018 at 10:29 Gregory Maxwell via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Any chance you could add a graph of input-scripts (instead of input
> outpoints)?
>
> On Wed, May 23, 2018 at 7:38 AM, Jim Posen via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org> wrote:
> > So I checked filter sizes (as a proportion of block size) for each of the
> > sub-filters. The graph is attached.
> >
> > As interpretation, the first ~120,000 blocks are so small that the
> > Golomb-Rice coding can't compress the filters that well, which is why the
> > filter sizes are so high in proportion to the block size. The input
> > filter is the exception: the coinbase input is skipped, so many of those
> > filters have 0 elements. But after block 120,000 or so, the filter
> > compression converges pretty quickly to near the optimal value. The
> > encouraging thing here is that if you look at the ratio of the combined
> > size of the separated filters vs the size of a filter containing all of
> > them (currently known as the basic filter), they are pretty much the same
> > size. The mean of the ratio between them after block 150,000 is 99.4%.
> > So basically, not much compression efficiency is lost by separating the
> > basic filter into sub-filters.
> >
> > On Tue, May 22, 2018 at 5:42 PM, Jim Posen <jim.posen@gmail.com> wrote:
> >>>
> >>> My suggestion was to advertise a bitfield for each filter type the
> >>> node serves, where the bitfield indicates what elements are part of
> >>> the filters. This essentially removes the notion of decided filter
> >>> types and instead leaves the decision to full-nodes.
> >>
> >>
> >> I think it makes more sense to construct entirely separate filters for
> >> the different types of elements and allow clients to download only the
> >> ones they care about. If there are enough elements per filter, the
> >> compression ratio shouldn't be much worse by splitting them up. This
> >> prevents the exponential blowup in the number of filters that you
> >> mention, Johan, and it works nicely with service bits for advertising
> >> different filter types independently.
> >>
> >> So if we created three separate filter types, one for output scripts,
> >> one for input outpoints, and one for TXIDs, each signaled with a
> >> separate service bit, are people good with that? Or do you think there
> >> shouldn't be a TXID filter at all, Matt? I didn't include the option of
> >> a prev output script filter or rolling that into the block output
> >> script filter because it changes the security model (cannot be proven
> >> to be correct/incorrect succinctly).
> >>
> >> Then there's the question of whether to separate or combine the headers.
> >> I'd lean towards keeping them separate because it's simpler that way.
> >
> >
> >
> > _______________________________________________
> > bitcoin-dev mailing list
> > bitcoin-dev@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
> >

--000000000000bad74f056ce93cbc--