


Return-Path: <jim.posen@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 40959E62
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 17 May 2018 21:27:18 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-qt0-f178.google.com (mail-qt0-f178.google.com
	[209.85.216.178])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 2F06A6CF
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 17 May 2018 21:27:17 +0000 (UTC)
Received: by mail-qt0-f178.google.com with SMTP id q13-v6so7762368qtp.4
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 17 May 2018 14:27:17 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
	:cc; bh=UxQSIuFgpj/R+hKZvfJ+9oiH4zEsct+mAiVfSurRSW8=;
	b=KqZRCvheo2KCniJDSqNkJ7giq7Ym7kC9ZNx/oOdKVljwe+S6kjUr+HDBDy+tt7E9Im
	q7rJaPpjQc6pklr3ngen9vSCFVSgIfgd2f0Xa//BWOKCuuHC9xCpiVS0p1a4iFf8dKd8
	6RS2uM58QHadJxdjM4y8SaKb68Q+ico3eJMlstwzLZTVvFVVp+3+ZgqnIl83KZ9pPeTW
	aGAsL7VnO17gjoBw/+iCRrc15QGChz13IrSnIoA4nkn6HTuYY5yyF/zOyVMujnoJovnx
	HDC6y8Yig7R3sZyT+KKWn/tAwxfJhfpgNTei65Q6wea6lOfvNGDqK8sjFY6wOcwxlUua
	PjwQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to:cc;
	bh=UxQSIuFgpj/R+hKZvfJ+9oiH4zEsct+mAiVfSurRSW8=;
	b=Dd63AIeJ7mv61dvT/BgcXGnsY6UbUkLqXY1KyMp1g3hFOAXWgbumktt0nLT/DFBjfC
	cEzWfkxYWii4nbMldZyrclFoZNJfYgLqLIy8jKWBkx1r75hEBbICUkrZmdq4/Ck6KOJR
	0m0dbWm9ziCjMQiKVe3Dyt/Gh1z1WU3j+7TOqeOWf0bf4+nE1+BxNQkClDJTkQ+H/VWl
	Bf8V2QrfIB9fB4weC5WV42AZxWEAKUYOF1qwcUdhe3y9Exw63wUaekuLQnyH3cvd50GG
	fmVxwpeHJlsEF0GyDI9oJkIHSMuTC2doz5dIgoDJNSCwfMqAw3oOPqSTX41tt+M+5MbH
	gsuA==
X-Gm-Message-State: ALKqPwe8DqtDQQwvHXteyJMWnyC8ieT9wUNCGLK+E9jJgEyp/IyNCM/h
	Qm+wSMaMnxVnu0tv24wqMw1dog3+bzJQb0gxSPA=
X-Google-Smtp-Source: AB8JxZrTEXsQ07zoqkLOD0+TGiDTRDasH3fzgU42oCM9tpp2v7+tdUzOZYrZPNyuCwX+P0zMz9YDAScu77GafuUkFag=
X-Received: by 2002:ac8:31ca:: with SMTP id
	i10-v6mr6767180qte.166.1526592436127; 
	Thu, 17 May 2018 14:27:16 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.200.50.92 with HTTP; Thu, 17 May 2018 14:27:15 -0700 (PDT)
In-Reply-To: <CAAS2fgQLCN_cuZ-3QPjCLfYOtHfEk=SenTn5=y9LfGzJxLPR3Q@mail.gmail.com>
References: <d43c6082-1b2c-c95b-5144-99ad0021ea6c@mattcorallo.com>
	<CAAS2fgRF-MhOvpFY6c_qAPzNMo3GQ28RExdSbOV6Q6Oy2iWn1A@mail.gmail.com>
	<22d375c7-a032-8691-98dc-0e6ee87a4b08@mattcorallo.com>
	<CAAS2fgR3QRHeHEjjOS1ckEkL-h7=Na56G12hYW9Bmy9WEMduvg@mail.gmail.com>
	<CADZtCShLmH_k-UssNWahUNHgHvWQQ1y638LwaOfnJEipwjbiYg@mail.gmail.com>
	<CAAS2fgQLCN_cuZ-3QPjCLfYOtHfEk=SenTn5=y9LfGzJxLPR3Q@mail.gmail.com>
From: Jim Posen <jim.posen@gmail.com>
Date: Thu, 17 May 2018 14:27:15 -0700
Message-ID: <CADZtCSjYr6VMBVQ=rx44SgRWcFSXhVXUZJB=rHMh4X78Z2eY1A@mail.gmail.com>
To: Gregory Maxwell <greg@xiph.org>
Content-Type: multipart/alternative; boundary="000000000000df4b8e056c6d7fd5"
X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_LOW
	autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Thu, 17 May 2018 21:44:23 +0000
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Thu, 17 May 2018 21:27:18 -0000

--000000000000df4b8e056c6d7fd5
Content-Type: text/plain; charset="UTF-8"

>
> It isn't a question of 'some lite clients' -- I am aware of no
> implementation of these kinds of measures in any cryptocurrency ever.
>

Doesn't mean there can't or shouldn't be a first. :-)


> The same kind of comparison to the block could have been done with
> BIP37 filtering, but no one has implemented that. (similarly, the
> whitepaper suggests doing that for all network rules when a
> disagreement has been seen, though that isn't practical for all
> network rules it could be done for many of them-- but again no
> implementation or AFAIK any interest in implementing that)


Correct me if I'm wrong, but I don't think it's true that the same could be
done for BIP 37. With BIP 37, one would have to download every partial
block from every peer to determine if there is a difference between them.
With BIP 157, you only download a 32-byte filter header from every peer
(because filters are deterministic), and using that commitment you can
determine whether there's a conflict requiring further interrogation. The
difference in overhead makes checking for conflicts with BIP 157 practical,
whereas it's not as practical with BIP 37.
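
As a rough sketch of what I mean (the function names here are made up for
illustration, and the header construction is my reading of the BIP 157
draft: double-SHA256 of the filter hash concatenated with the previous
header, starting from 32 zero bytes):

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used throughout Bitcoin."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def filter_header(filter_bytes: bytes, prev_header: bytes) -> bytes:
    """Commit to this block's filter and to every filter before it."""
    return dsha256(dsha256(filter_bytes) + prev_header)


def header_chain(filters, genesis_prev=bytes(32)):
    """Fold a list of serialized filters into the tip filter header."""
    header = genesis_prev
    for f in filters:
        header = filter_header(f, header)
    return header


def conflicting_peers(tip_headers: dict) -> bool:
    """True if the peers do not all report the same 32-byte tip header."""
    return len(set(tip_headers.values())) > 1


# Toy data: the byte strings stand in for real serialized filters.
honest = header_chain([b"filter-0", b"filter-1"])
lying = header_chain([b"filter-0", b"tampered"])

print(conflicting_peers({"peer_a": honest, "peer_b": honest}))  # False
print(conflicting_peers({"peer_a": honest, "peer_b": lying}))   # True
```

Only when the headers differ does the client need to fetch the filters
(and possibly the block) to find out which peer is lying.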


> Sure, but at what cost?   And "additional" while nice doesn't
> necessarily translate into a meaningful increase in delivered security
> for any particular application.
>
> I think we might be speaking too generally here.
>

Sure. The security model that BIP 157 now allows is that a light client with
*at least one honest peer serving filters* can get the correct information
about the chain. No, this does not protect against total eclipse attacks,
but I think it's a much stronger security guarantee than requiring all
peers or even a majority of peers to be honest. In a decentralized network
that stores money, I think there's a big difference between those security
models.


> But in exchange the filters for a given FP rate would be probably
> about half the current size (actual measurements would be needed
> because the figure depends on much scriptpubkey reuse there is, it
> probably could be anywhere between 1/3 and 2/3rd).
>

This does not seem right. Let's assume txids are removed because they are
not relevant to this particular point. The difference as I understand it is
whether to include in the filter serialized outpoints for inputs or
serialized prev scriptPubkeys for inputs. When hashed these are the same
size, and there's an equal number of them (one per input in a block). So
the only savings come from deduping the prev scriptPubkeys with each other
and with the scriptPubkeys in the block's outputs. So it comes down
entirely to how much address reuse there is on the chain.
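
To make the counting concrete, here is a toy sketch (made-up block data
and hypothetical helper names, not measurements) that counts the distinct
elements each variant would put in the filter:

```python
def elements_with_outpoints(output_scripts, inputs):
    """Filter contents if inputs are represented by their outpoints.

    `inputs` is a list of (outpoint, prev_script) pairs. Outpoints are
    unique within a valid block, so nothing here deduplicates.
    """
    return set(output_scripts) | {outpoint for outpoint, _ in inputs}


def elements_with_prev_scripts(output_scripts, inputs):
    """Filter contents if inputs are represented by prev scriptPubkeys."""
    return set(output_scripts) | {script for _, script in inputs}


# Toy block: three outputs and three inputs. Two inputs spend from the
# same script "D", and one spends from "A", which is also paid in this block.
outputs = ["A", "B", "C"]
inputs = [(("tx1", 0), "A"), (("tx2", 1), "D"), (("tx3", 0), "D")]

print(len(elements_with_outpoints(outputs, inputs)))     # 6
print(len(elements_with_prev_scripts(outputs, inputs)))  # 4
```

With no script reuse at all, both counts would be 6, so the size
difference is purely a function of reuse.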


> Monitoring inputs by scriptPubkey vs input-txid also has a massive
> advantage for parallel filtering:  You can usually known your pubkeys
> well in advance, but if you have to change what you're watching block
>  N+1 for based on the txids that paid you in N you can't filter them
> in parallel.
>

Yes, I'll grant that this is a benefit of your suggestion.


> I think Peter missed Matt's point that you can monitor for a specific
> transaction's confirmation by monitoring for any of the outpoints that
> transaction contains. Because the txid commits to the outpoints there
> shouldn't be any case where the txid is knowable but (an) outpoint is
> not.  Removal of the txid and monitoring for any one of the outputs
> should be a strict reduction in the false positive rate for a given
> filter size (the filter will contain strictly fewer elements and the
> client will match for the same (or usually, fewer) number).
>
> I _think_ dropping txids as matt suggests is an obvious win that costs
> nothing.  Replacing inputs with scripts as I suggested has some
> trade-offs.
>

I may have interpreted this differently. Wallets need a way to know when
the transactions they send get confirmed (for obvious usability reasons and
also for automatic fee-bumping). One way is to match the spent outpoints
against the filter, which I think of as the standard. Another would be to
match the txid of the spending transaction against the filter, which only
works if the transaction is not malleable. Another would be to match the
change output script against the filter, assuming the wallet does not reuse
change addresses and that the spending transaction does in fact have a
change output.

Now let's say these pieces of data (txids, output scripts, and spent
outpoints) are in three separate filters that a wallet can download
separately or choose not to download. The spent outpoint method is the most
reliable and has no caveats. It also allows for theft detection as Peter
notes, which is a very nice property indeed. If the wallet uses the txid
matching though, the txid filter would be smaller because there are fewer
txids per block than inputs. So there could be some bandwidth savings to
that approach. The change output watching is probably the nicest in some
ways because the client needs the output filter anyway. If the transaction
has no change output with a unique script, the client could watch for any
of the other outputs on the spending tx, but may get more false positives
depending on the degree of address reuse.
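
A small sketch of the three checks, with plain Python sets standing in
for the three filters (real filters are probabilistic, so a match only
means "fetch the block and look"; all names below are hypothetical):

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Outpoint = Tuple[str, int]


@dataclass
class SentTx:
    txid: str                       # txid as the wallet broadcast it
    spent_outpoints: List[Outpoint]
    change_script: Optional[str]    # None if there is no change output


@dataclass
class BlockFilters:
    outpoints: Set[Outpoint]
    txids: Set[str]
    scripts: Set[str]


def match_by_outpoint(f: BlockFilters, tx: SentTx) -> bool:
    return any(op in f.outpoints for op in tx.spent_outpoints)


def match_by_txid(f: BlockFilters, tx: SentTx) -> bool:
    return tx.txid in f.txids


def match_by_change(f: BlockFilters, tx: SentTx) -> bool:
    return tx.change_script is not None and tx.change_script in f.scripts


# The wallet sent txid "aa", but a malleated copy with txid "bb" was mined.
sent = SentTx(txid="aa", spent_outpoints=[("prev", 0)], change_script="change")
block = BlockFilters(outpoints={("prev", 0)}, txids={"bb"},
                     scripts={"change", "payee"})

print(match_by_outpoint(block, sent))  # True
print(match_by_txid(block, sent))      # False
print(match_by_change(block, sent))    # True
```

The malleated case is where txid matching falls down, while the outpoint
and change-script checks still fire.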


--000000000000df4b8e056c6d7fd5--