MIME-Version: 1.0
References: <6D57649F-0236-4FBA-8376-4815F5F39E8A@gmail.com>
In-Reply-To: <6D57649F-0236-4FBA-8376-4815F5F39E8A@gmail.com>
From: Jim Posen <jim.posen@gmail.com>
Date: Mon, 4 Feb 2019 12:18:08 -0800
Message-ID: <CADZtCSgKu1LvjePNPT=0C0UYQvb47Ca0YN+B_AfgVNTpcOno4w@mail.gmail.com>
To: Tamas Blummer <tamas.blummer@gmail.com>, 
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: multipart/alternative; boundary="0000000000009ecea1058117314f"
Cc: Jim Posen <jimpo@coinbase.com>
Subject: Re: [bitcoin-dev] Interrogating a BIP157 server,
	BIP158 change proposal
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>

--0000000000009ecea1058117314f
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Please see the thread "BIP 158 Flexibility and Filter Size" from 2018
regarding the decision to remove outpoints from the filter [1].

Thanks for bringing this up though, because more discussion is needed on
the client protocol given that clients cannot reliably determine the
integrity of a block filter in a bandwidth-efficient manner (due to the
inclusion of input scripts).

I see three possibilities:
1) Introduce a new P2P message to retrieve all prev-outputs for a given
block (essentially the undo data in Core), and verify the scripts against
the block by executing them. While this permits some forms of input script
malleability (and thus cannot discriminate between all valid and invalid
filters), it restricts what an attacker can do. This was proposed by Laolu
AFAIK, and I believe this is how btcd is proceeding.
2) Clients track multiple possible filter header chains and essentially
consider the union of their matches: if any filter received for a
particular block header matches, the client downloads the block. The client
can ban a peer if it (a) ever returns a filter omitting some data that is
observed in the downloaded block, (b) repeatedly serves filters that trigger
false-positive block downloads where such a number of false positives is
statistically unlikely, or (c) repeatedly serves filters that are
significantly larger than the expected size (essentially padding the actual
filters with garbage to waste bandwidth). I have not done the analysis yet,
but we should be able to come up with some fairly simple banning heuristics
using Chernoff bounds. The main downsides are that the client logic to track
multiple possible filter chains and filters per block is more complex, and
that bandwidth increases if connected to a malicious server. I first heard
about this idea from David Harding.
3) Rush straight to committing the filters into the chain (via witness
reserved value or coinbase OP_RETURN) and give up on the pre-softfork BIP
157 P2P mode.
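
To make the false-positive ban rule in option 2 a bit more concrete, here is
a minimal sketch of a Chernoff-style test in Python. It is not taken from
BIP 157/158 or from any implementation: the function names, the `threshold`
default, and the client-estimated per-block false-positive probability
`fp_rate` are all assumptions for illustration, and it assumes false
positives are independent across blocks. It uses the standard
Chernoff-Hoeffding bound P(X >= k) <= exp(-n * D(k/n || p)) for n independent
Bernoulli(p) trials.

```python
import math


def kl_bernoulli(a: float, p: float) -> float:
    """KL divergence D(a || p) between Bernoulli(a) and Bernoulli(p), in nats."""
    total = 0.0
    if a > 0:
        total += a * math.log(a / p)
    if a < 1:
        total += (1 - a) * math.log((1 - a) / (1 - p))
    return total


def chernoff_upper_tail(n_blocks: int, fp_rate: float, observed: int) -> float:
    """Upper bound on P(at least `observed` false positives in `n_blocks` blocks).

    `fp_rate` is the per-block false-positive probability the client expects
    from an honest filter (hypothetical input, estimated by the client).
    """
    if n_blocks <= 0 or not 0 < fp_rate < 1 or not 0 <= observed <= n_blocks:
        raise ValueError("invalid parameters")
    frac = observed / n_blocks
    if frac <= fp_rate:
        return 1.0  # at or below expectation: the bound gives no evidence
    return math.exp(-n_blocks * kl_bernoulli(frac, fp_rate))


def should_ban(n_blocks: int, fp_rate: float, observed: int,
               threshold: float = 1e-9) -> bool:
    """Ban once the observed false-positive count is implausible for an honest peer."""
    return chernoff_upper_tail(n_blocks, fp_rate, observed) < threshold
```

With these assumed numbers, `should_ban(1000, 0.001, 50)` flags the peer,
while `should_ban(1000, 0.001, 2)` does not. For BIP 158 basic filters
(M = 784931) a client watching w scripts would expect `fp_rate` to be roughly
w/M per block.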

I'm in favor of option #2 despite the downsides since it requires the
smallest number of changes and is supported by the BIP 157 P2P protocol as
currently written (though the recommended client protocol in the BIP needs
to be updated to account for this). Another benefit is that it removes some
synchronicity assumptions under which a peer with the correct filters that
keeps timing out is assumed to be dishonest, while the dishonest peer is
assumed to be OK because it is responsive.
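
A rough sketch of the client-side logic option #2 implies, in Python. This
is only an illustration under simplifying assumptions: each served filter is
modelled as a plain Python set of items (a real BIP 158 filter is a
Golomb-coded set queried probabilistically), and the names
`should_download` and `peers_omitting_data` are hypothetical. Note also that
with the current filter contents a client can only check the items it can
actually see in the downloaded block.

```python
from typing import Dict, Set


def should_download(filters_by_peer: Dict[str, Set[bytes]],
                    watched: Set[bytes]) -> bool:
    """Union-of-matches rule: download the block if ANY candidate filter
    for this block header matches one of the wallet's watched items."""
    return any(f & watched for f in filters_by_peer.values())


def peers_omitting_data(filters_by_peer: Dict[str, Set[bytes]],
                        block_items: Set[bytes]) -> Set[str]:
    """After downloading the block, return the peers whose filter left out
    an item that is observed in the block. A filter has no false negatives,
    so any omission proves the filter was not built honestly."""
    return {peer for peer, f in filters_by_peer.items()
            if not block_items <= f}
```

A client along these lines would keep one candidate filter per distinct
filter header it has been offered for a block, and ban the peers returned by
`peers_omitting_data`.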

If anyone has other ideas, I'd love to hear them.

-jimpo

[1]
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-June/016057.html


On Mon, Feb 4, 2019 at 10:53 AM Tamas Blummer via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> TLDR: a change to BIP158 would allow deciding which filter chain is
> correct with lower bandwidth use
>
> Assume there is a BIP157 client that learned a filter header chain earlier
> and is now offered an alternate reality by a newly connected BIP157 server.
>
> The client notices the alternate reality by routinely asking for filter
> chain checkpoints after connecting to a new BIP157 server. A divergence at
> a checkpoint means that the server disagrees with the client's history at
> or before the first diverging checkpoint. The client would then request the
> filter headers between the last matching and first divergent checkpoint,
> quickly figure out which block's filter is the first that does not match
> its previous assumption, and request that filter from the server.
>
> The client downloads the corresponding block, checks that its header fits
> the PoW-secured best header chain, recalculates the merkle root of its
> transaction list to know that it is complete, and queries the filter to see
> whether every output script of every transaction is contained in it. If
> not, the server is lying: the case is closed and the server is
> disconnected.
>
> Having all output scripts in the filter does not, however, guarantee that
> the filter is correct, since it might omit input scripts. Input scripts are
> not part of the downloaded block but are in some blocks before it.
> Checking those is out of reach for a lightweight client with the tools
> given by the current BIP.
>
> A remedy here would be another filter chain on created and spent
> outpoints, as is currently implemented by Murmel. The outpoint filter chain
> must offer a match for every spent output of the block with the divergent
> filter; otherwise the interrogated server is lying, since a PoW-secured
> block cannot spend coins out of nowhere. Doing this check would already
> force the client to download the outpoint filter history up to the point of
> divergence. Then the client would have to download and PoW-check every
> block that shows a match in outpoints until it finds that one of the
> spent outputs has a script that was not in the server's filter, in which
> case the server is lying. If everything checks out, then the previous
> assumption on filter history was incorrect and should be replaced by the
> history offered by the interrogated server.
>
> As you see, the interrogation works with this added filter but is highly
> inefficient. A really light client should not be forced to download lots of
> blocks just to uncover a lying filter server. This would actually be an
> easy DoS on light BIP157 clients.
>
> A better solution is a change to BIP158 such that the only filter contains
> created scripts and spent outpoints. It appears to me that this would serve
> both wallets and the interrogation of filter servers well:
>
> Wallets would recognize payments to their addresses by the filter, as
> output scripts are included. Spends from the wallet would also be
> recognized, as a wallet already knows the outpoints of its previously
> received coins and can query the filters for them.
>
> Interrogation of a filter server also simplifies, since the filter of the
> block can be checked entirely against the contents of the same block. The
> decision on filter correctness does not require more bandwidth than the
> download of a block at the mismatching checkpoint. The client could be
> forced to download at most 1/1000th of the blockchain in addition to the
> filter header history.
>
> Therefore I suggest changing BIP158 to have a base filter, defined as:
>
> A basic filter MUST contain exactly the following items for each
> transaction in a block:
>         • Spent outpoints
>         • The scriptPubKey of each output, aside from all OP_RETURN output
>           scripts.
>
> Tamas Blummer
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
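
For reference, the interrogation described in the quoted proposal could be
sketched roughly as follows in Python. This is an illustration only, not
code from Murmel or any BIP: the `Tx` type, the function names, the outpoint
serialization (txid followed by a 4-byte little-endian index), the skipping
of empty scripts, and the modelling of a filter as an exact set of items are
all assumptions. With a real Golomb-coded filter the client would instead
rebuild the filter from the block and compare it with the one served.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple

OP_RETURN = 0x6A  # first opcode of an OP_RETURN output script


@dataclass
class Tx:
    spent_outpoints: List[Tuple[bytes, int]]  # (txid, vout); empty for coinbase
    output_scripts: List[bytes]               # scriptPubKey of each output


def first_divergence(ours: Sequence[bytes],
                     theirs: Sequence[bytes]) -> Optional[int]:
    """Index of the first filter header (between two checkpoints) on which
    the server disagrees with our stored history, or None if they agree."""
    for i, (a, b) in enumerate(zip(ours, theirs)):
        if a != b:
            return i
    return None


def expected_items(block: List[Tx]) -> Set[bytes]:
    """Items the proposed filter must contain, computed from the block alone."""
    items: Set[bytes] = set()
    for tx in block:
        for txid, vout in tx.spent_outpoints:
            items.add(txid + vout.to_bytes(4, "little"))
        for script in tx.output_scripts:
            # Skip OP_RETURN outputs; empty scripts are skipped by assumption.
            if script and script[0] != OP_RETURN:
                items.add(script)
    return items


def filter_is_honest(served_items: Set[bytes], block: List[Tx]) -> bool:
    """The proposal says 'exactly', so the served set must equal ours."""
    return served_items == expected_items(block)
```

The point of the sketch is that `expected_items` needs nothing beyond the
block at the first divergent filter header, which is why a single block
download settles the dispute under the proposed filter definition.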


--0000000000009ecea1058117314f--