
Received: from sog-mx-2.v43.ch3.sourceforge.com ([172.29.43.192]
	helo=mx.sourceforge.net)
	by sfs-ml-2.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <mh.in.england@gmail.com>) id 1YPA6m-0006oI-K6
	for bitcoin-development@lists.sourceforge.net;
	Sat, 21 Feb 2015 13:28:36 +0000
Received-SPF: pass (sog-mx-2.v43.ch3.sourceforge.com: domain of gmail.com
	designates 74.125.82.182 as permitted sender)
	client-ip=74.125.82.182; envelope-from=mh.in.england@gmail.com;
	helo=mail-we0-f182.google.com; 
Received: from mail-we0-f182.google.com ([74.125.82.182])
	by sog-mx-2.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128)
	(Exim 4.76) id 1YPA6j-000643-SB
	for bitcoin-development@lists.sourceforge.net;
	Sat, 21 Feb 2015 13:28:36 +0000
Received: by wevm14 with SMTP id m14so10200772wev.8
	for <bitcoin-development@lists.sourceforge.net>;
	Sat, 21 Feb 2015 05:28:27 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.194.93.134 with SMTP id cu6mr4514559wjb.79.1424525307877;
	Sat, 21 Feb 2015 05:28:27 -0800 (PST)
Sender: mh.in.england@gmail.com
Received: by 10.194.188.11 with HTTP; Sat, 21 Feb 2015 05:28:27 -0800 (PST)
In-Reply-To: <CALqxMTETmkF3j0YpfMLYhLGwwd7Nw7Qu3kR80D3pjTn_g5+Xxw@mail.gmail.com>
References: <CALqxMTE2doZjbsUxd-e09+euiG6bt_J=_BwKY_Ni3MNK6BiW1Q@mail.gmail.com>
	<CANEZrP32M-hSU-a1DA5aTQXsx-6425sTeKW-m-cSUuXCYf+zuQ@mail.gmail.com>
	<CALqxMTFNdtUup5MB2Dc_AmQ827sM-t5yx7WQubbfOEd_bO_Ong@mail.gmail.com>
	<CANEZrP0cOY5Wt_mvBSdGGmi4NfZi04SQ7d6GLpnRxmqvXNArGA@mail.gmail.com>
	<CALqxMTE1OANaMAvqrcOLuKtYd_jmqYp5GcB4CX77S8+fR05=jg@mail.gmail.com>
	<CAAS2fgSsXDTzxS29_SZvy1_Tie8=EGDhUjGkyGTXbc=47ta20w@mail.gmail.com>
	<CANEZrP2XoVL6sWxA5KpsGsNxXi-hwdVN=BqXJfn17N-W0_SHEg@mail.gmail.com>
	<CALqxMTETmkF3j0YpfMLYhLGwwd7Nw7Qu3kR80D3pjTn_g5+Xxw@mail.gmail.com>
Date: Sat, 21 Feb 2015 14:28:27 +0100
X-Google-Sender-Auth: oC0HUha2RqjwrGnlYTO3lyDlz2s
Message-ID: <CANEZrP0nAmhe_jPh5GYD1gX1FLop6zsw+MyXsYizHBR=enfT9g@mail.gmail.com>
From: Mike Hearn <mike@plan99.net>
To: Adam Back <adam@cypherspace.org>
Content-Type: multipart/alternative; boundary=047d7bb7092cf31f66050f9924e4
X-Spam-Score: -0.5 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
	sender-domain
	0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	(mh.in.england[at]gmail.com)
	-0.0 SPF_PASS               SPF: sender matches SPF record
	1.0 HTML_MESSAGE           BODY: HTML included in message
	0.1 DKIM_SIGNED            Message has a DKIM or DK signature,
	not necessarily valid
	-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
X-Headers-End: 1YPA6j-000643-SB
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] bloom filtering, privacy
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Sat, 21 Feb 2015 13:28:36 -0000

--047d7bb7092cf31f66050f9924e4
Content-Type: text/plain; charset=UTF-8

Let's put the UTXO commitments/anti-fraud proofs to one side for a moment.
I would like to see them happen one day, but they aren't critical to these
protocols and are just proving to be a distraction.


> Then they make fresh random connections to different nodes and request
> download of the respective individual transactions from the full node.
>

...

> About privacy the node can make different random connections to
> different nodes to fetch addresses ..... The full node cant
> correlate the addresses as belonging to the same person by correlating
> the download requests for them, because they are made via different
> nodes.


Apologies for the wall of text, but I don't think this will work or solve
any real problem. And I must justify such a strong statement clearly.

*First: technical issues*

When you download the per-block Bloom filter and test against it, what you
get back is a set of script elements (addresses, keys, OP_RETURN tags,
etc.). But then in the next step you are saying that you connect to random
peers and request individual transactions. We don't know which transactions
those are at this point. All we know is a set of addresses that possibly
matched. So I think what you mean is "wallets connect to random peers and
request transactions in block N that match a given set of addresses".

This is what Bloom filtering already does, of course. Doing the test
against the per-block filter first doesn't seem to buy us much because with
thousands of transactions per block, even a very tiny false-positive (FP)
rate will still trigger a match on every single block.
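
To put rough numbers on that, here is a minimal sketch. It assumes each
membership test is independent with the same false-positive rate; the rate
and the number of tests per block are illustrative guesses, not
measurements, and `p_any_false_positive` is a made-up helper name.

```python
def p_any_false_positive(fp_rate: float, tests_per_block: int) -> float:
    """Probability that at least one test in a block is a false positive.

    Assumes the tests are independent and share the same FP rate.
    """
    return 1.0 - (1.0 - fp_rate) ** tests_per_block


# Assumed figures: 2,000 transactions with about 4 script elements each,
# tested at a 0.1% false-positive rate.
print(p_any_false_positive(0.001, 2000 * 4))
```

With these assumed figures the result is very close to 1, i.e. almost every
block would register a match.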

The second problem I see is that we can't do this in parallel because of
the following edge case: a wallet contains key K and someone sends it money
using an OP_CHECKSIG (pay-to-pubkey) output. The input which spends this
output does not contain any predictable data, thus we do not know what to
look for in the following blocks to detect a spend of it until we have seen
the first transaction and know its hash.

In practice this means we must either scan through the chain in sequence
and update our matching criteria if we see such an output (this is what the
Bloom filtering protocol already does server-side), or we must constrain
the user such that output scripts always force repetition of predictable
data - this is what mostly happens today due to pay-to-address outputs, but
not always, and correctness is more important than completeness.
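
A toy model of the sequential scan, to show why ordering matters. This is
only a sketch: `Outpoint`, `Tx` and `scan` are hypothetical names, blocks
are plain lists of transactions, and script matching is reduced to set
intersection. It is not how any real wallet or node implements it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outpoint:
    txid: str
    index: int


@dataclass
class Tx:
    txid: str
    inputs: list   # Outpoints spent by this transaction
    outputs: list  # one set of script data elements per output


def scan(blocks, watched_elements):
    """Scan blocks in chain order and return the txids relevant to the wallet."""
    watched_outpoints = set()
    relevant = []
    for block in blocks:
        for tx in block:
            # A spend is only recognised if we already know the outpoint.
            hit = any(inp in watched_outpoints for inp in tx.inputs)
            for i, elements in enumerate(tx.outputs):
                if watched_elements & elements:
                    hit = True
                    # Update the matching criteria: watch for a later spend.
                    watched_outpoints.add(Outpoint(tx.txid, i))
            if hit:
                relevant.append(tx.txid)
    return relevant


pay = Tx("a", [], [{"K"}])                    # pays to key K
spend = Tx("b", [Outpoint("a", 0)], [{"X"}])  # spends it, no predictable data
print(scan([[pay], [spend]], {"K"}))  # in order: both found
print(scan([[spend], [pay]], {"K"}))  # out of order: the spend is missed
```

The second call stands in for scanning blocks independently of each other:
the spend is invisible unless the paying transaction was seen first.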

If we can't do it in parallel then we must suffer a node round-trip for
every single block we traverse, because we can't request long runs of
blocks with a single command. That latency will kill performance dead. It's
a non-starter.
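
As back-of-the-envelope arithmetic (the round-trip time and the number of
blocks are assumptions chosen for illustration, and the helper name is
invented), one round-trip per block adds up quickly:

```python
def sequential_scan_seconds(n_blocks: int, rtt_seconds: float) -> float:
    """Lower bound on scan time when every block costs one network round-trip."""
    return n_blocks * rtt_seconds


# Assumed: a wallet that has been offline for 30 days at roughly 144 blocks
# per day, talking to a peer with a 200 ms round-trip time.
blocks_behind = 30 * 144
print(sequential_scan_seconds(blocks_behind, 0.2), "seconds of pure latency")
```

That figure ignores download and processing time entirely; it is the wait
imposed by round-trips alone.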

But let's imagine we don't care about OP_CHECKSIG outputs and are willing
to ignore them. There are cases where they are the best and most efficient
technical solution, but let's put that to one side.

The primary difference after making the above changes is that no one node
gets a filter containing *all* our keys and addresses. I don't think a
per-block pre-test filter would gain us much efficiency, so from a privacy
perspective this is what it boils down to - sharding of the scan.

But we can already do this with the current Bloom filtering protocol.
BitcoinJ doesn't do so because having multiple parallel scans uses up
network IOPs which are a resource of unknown quantity, and because stepping
through the chain in parallel with multiple peers complicates the chain
sync implementation quite a bit.
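
A minimal sketch of what sharding the scan means on the wallet side. The
function name and the round-robin split are assumptions for illustration;
this is not BitcoinJ code. Each resulting group would be loaded into its
own Bloom filter and sent to a different peer.

```python
def shard_keys(keys, n_peers):
    """Split wallet keys round-robin into n_peers disjoint groups."""
    if n_peers < 1:
        raise ValueError("need at least one peer")
    shards = [[] for _ in range(n_peers)]
    # Sort first so the split is deterministic across runs.
    for i, key in enumerate(sorted(keys)):
        shards[i % n_peers].append(key)
    return shards


wallet_keys = {"k1", "k2", "k3", "k4", "k5"}
for peer_index, shard in enumerate(shard_keys(wallet_keys, 2)):
    print("peer", peer_index, "gets a filter built from", shard)
```

No single peer sees every key, but the wallet now has to run and reconcile
one scan per shard, which is the extra cost described above.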

*Second: this doesn't solve any real problem*

Who cares about collecting Bloom filters off the wire?

Commercial fraudsters? Doubtful. There are much easier ways to steal money.

Spies? Yes! Without a doubt NSA/GCHQ are building or have built databases
mapping IP addresses to Bitcoin addresses and are correlating them via
XKEYSCORE with other identifiable information.

However, just requesting data from different nodes doesn't help with that,
because they are doing deep packet inspection (DPI) and can still see all
the connections, so can still combine all the filters or received
transactions.

Ah, you say, but we're requesting everything via Tor.

Yes, about that. We've implemented that already. Some wallets even use it
by default, like Alon & Chris' Bitcoin Authenticator wallet. It's just one
line of code to activate.

Unfortunately there are severe practical problems with using Tor:

   1. If you don't have a warm consensus then booting it up is very slow.
   We're already slower than our competitors like blockchain.info and
   VISA/MasterCard; we can't make this any worse.

   This one is possibly not that big a deal and can be solved with more
   technical tricks.

   2. Bitcoin Core's DoS strategy means anyone can block all of Tor quite
   trivially. So we'd need some complicated fallback mechanism to disable Tor
   remotely, in case someone did this.

   3. Bitcoin wire traffic isn't encrypted or authenticated, so routing it
   over Tor makes it much easier for trolls to tamper with lots of wire
   traffic at once, whereas without Tor it's much harder.

Let's ignore the fact that the Tor project insists on poking the law
enforcement bear with rusty nails, and has been receiving tipoffs about
plans to seize directory authorities. How much Bitcoin wallets should rely
on Tor sticking around is a debate for some other time.

There's a much simpler way to fix all of this - add opportunistic
encryption to the wire protocol.


--047d7bb7092cf31f66050f9924e4--