summaryrefslogtreecommitdiff
path: root/28/f508aea06cc80ed69252343c4a2e4c8fa46232
blob: 39102de3ebadec1817e3bbaf61b4f00ac9cc9f78 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
Received: from sog-mx-1.v43.ch3.sourceforge.com ([172.29.43.191]
	helo=mx.sourceforge.net)
	by sfs-ml-4.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <pete@petertodd.org>) id 1WuPWB-0005Pz-Uz
	for bitcoin-development@lists.sourceforge.net;
	Tue, 10 Jun 2014 17:07:27 +0000
Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of petertodd.org
	designates 62.13.149.95 as permitted sender)
	client-ip=62.13.149.95; envelope-from=pete@petertodd.org;
	helo=outmail149095.authsmtp.com; 
Received: from outmail149095.authsmtp.com ([62.13.149.95])
	by sog-mx-1.v43.ch3.sourceforge.com with esmtp (Exim 4.76)
	id 1WuPWA-0008TQ-2L for bitcoin-development@lists.sourceforge.net;
	Tue, 10 Jun 2014 17:07:27 +0000
Received: from mail-c235.authsmtp.com (mail-c235.authsmtp.com [62.13.128.235])
	by punt15.authsmtp.com (8.14.2/8.14.2/) with ESMTP id s5AH7Drd032459;
	Tue, 10 Jun 2014 18:07:13 +0100 (BST)
Received: from savin (76-10-178-109.dsl.teksavvy.com [76.10.178.109])
	(authenticated bits=128)
	by mail.authsmtp.com (8.14.2/8.14.2/) with ESMTP id s5AH76nD054023
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
	Tue, 10 Jun 2014 18:07:08 +0100 (BST)
Date: Tue, 10 Jun 2014 13:08:46 -0400
From: Peter Todd <pete@petertodd.org>
To: Mike Hearn <mike@plan99.net>, Jeff Garzik <jgarzik@bitpay.com>
Message-ID: <20140610170846.GB21293@savin>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="CUfgB8w4ZwR/yMy5"
Content-Disposition: inline
In-Reply-To: <CAJHLa0OJffXU_LkmVhCJ80mphc5zEPuDZzuSKvpLkrNUWU7Y1w@mail.gmail.com>
	<CANEZrP33n+UBE+0Zb_mh=+qjJA+C+Nny9quC5B0HpuLC1XygMA@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Server-Quench: a846716b-f0c1-11e3-b396-002590a15da7
X-AuthReport-Spam: If SPAM / abuse - report it at:
	http://www.authsmtp.com/abuse
X-AuthRoute: OCd2Yg0TA1ZNQRgX IjsJECJaVQIpKltL GxAVKBZePFsRUQkR
	bgdMdwIUEkAaAgsB AmIbWlxeUFp7WWo7 bAxPbAVDY01GQQRq
	WVdMSlVNFUsrBG15 ARhAJhl3cwVEezBy bE5nXj4KVEB8ckd7
	EFMCFT4HeGZhPWMC AkNRcR5UcAFPdx8U a1UrBXRDAzANdhES
	HhM4ODE3eDlSNilR RRkIIFQOdA4hGjk7 QlgIFDQzdQAA
X-Authentic-SMTP: 61633532353630.1023:706
X-AuthFastPath: 0 (Was 255)
X-AuthSMTP-Origin: 76.10.178.109/587
X-AuthVirus-Status: No virus detected - but ensure you scan with your own
	anti-virus system.
X-Spam-Score: -1.5 (-)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
	sender-domain
	-0.0 SPF_PASS               SPF: sender matches SPF record
	0.0 FAKE_REPLY_C           FAKE_REPLY_C
X-Headers-End: 1WuPWA-0008TQ-2L
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] Bloom bait
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Tue, 10 Jun 2014 17:07:28 -0000


--CUfgB8w4ZwR/yMy5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jun 10, 2014 at 06:38:23PM +0800, Mike Hearn wrote:
> >
> > As I explained in the email you're replying to and didn't quote, bloom
> > filters has O(n) cost per query, so sending different bloom filters to
> > different peers for privacy reasons costs the network significant disk
> > IO resources. If I were to actually implement it it'd look like a DoS
> > attack on the network.
> >
>=20
> DoS attack? Nice try.

Suppose I wrote an single address lookup tool for Android that connected
to multiple peers and used bloom filters to find the history of a
specific address. Of course, I don't want to use too much bandwidth
being on mobile, so I'll use as specific a bloom filter as possible. I
might even connect to multiple peers to speed up the lookup.

Is this any different from my bloom filter IO attack code? Nope. Hence,
splitting up bloom filter requests for better privacy will certainly
look like a DoS attack and will certainly greatly increase the load on
the network.


> Now consider a prefix filtering implementation. You need to calculate a
> sorted list of all the data elements and tx hashes in the block, that maps
> to the location in the block where the tx data can be found. These
> per-block indexes take up extra disk space and, realistically, would like=
ly
> be implemented using LevelDB as that's a tool which is designed for
> creating and using these kinds of tables, so then you're both loading the
> block data itself (blocks are sized about right currently to always fit in
> the default kernel readahead window) AND also seeking through the indexes,
> and building them too. A smart implementation might try and pack the index
> next to each block so it's possible to load both at once with a single
> seek, but that would probably be more work, as it'd force building of the
> index to be synchronous with saving the block to disk thus slowing down
> block relay. In contrast a LevelDB based index would do the bulk of the
> index-building work on a separate core.

That's exactly the kinds of optimizations obelisk is implementing to
make its prefix lookup database fast. Also those optimizations are
situation dependent, for instance "packing the index next to each block"
is irrelevant if you put archival blockchain data on a slow HD, and
indexes on a fast SSD, something some obelisk servers do.

More to the point, your showing quite clearly there isn't just one
optimal way to do it. Applying a bloom filter, or a prefix filter, or
some as yet unknown filter, to blockchain data is a service and that
service has different tradeoffs compared to just serving up archival
block history. There is zero reason not to make that service something
you advertise with NODE_BLOOM - after all, you already have the code in
bitcoinj to do the exact same thing by checking the advertised protocol
version.


On Tue, Jun 10, 2014 at 09:02:00AM -0400, Jeff Garzik wrote:
> Most of this description of disk activity is true, but it omits one
> key point:  Total cached data (working set).  It is a binary, first
> order question:  are you hitting pagecache, or the disk?  When nodes
> act as archival data sources, the pagecache pressure is immense.  When
> nodes just primarily serve recent blocks, that data is being served
> out of pagecache. As I directly observed running public nodes, the
> disks were running constantly, impacting all clients, even clients
> downloading only recent blocks.
>=20
> Luckily, headers are served out of RAM, so that part of the sync is alway=
s fast.
>=20
> NODE_BLOOM -- and block download in general -- will tend to be slower
> than it could be, due to the working set almost always being larger
> than available pagecache.  Fix that problem, NODE_BLOOM will always
> operate out of pagecache, and disk activity will not be an issue.
>=20
> Once you start hitting the disk, you've already lost.

Yup. I discussed this with Matt Corallo at the financial crypto
conference a few months back and he made the same point. Unfortunately
we'll need an upgrade to let nodes advertise ranges of blocks to begin
to fix that issue, and even then it still shows quite clearly how it's
not optimal if we force everyone to share blockchain data in the same
way.

--=20
'peter'[:-1]@petertodd.org
000000000000000023c7fc084ed84b891cc2fa90e4a34708d6b2370d3ec1c85d

--CUfgB8w4ZwR/yMy5
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)

iQGrBAEBCACVBQJTlzuYXhSAAAAAABUAQGJsb2NraGFzaEBiaXRjb2luLm9yZzAw
MDAwMDAwMDAwMDAwMDAzZjBiYjI0M2IwMjZmYTkwNDk0NmYyNjJlZWJmMDY4NDYx
NGViYjI2MjJlNGU1NTQvFIAAAAAAFQARcGthLWFkZHJlc3NAZ251cGcub3JncGV0
ZUBwZXRlcnRvZC5vcmcACgkQJIFAPaXwkftr7AgAr6LTDzsBVv4pjvVlc4GpgL4b
I44h1VuXiujXRRHH0mPoLQoTm7/XHcYeL4LfZi2H98jEOracYtc1ONBiMIkZauFO
ZsAVroEyjr1+KrNUfvzZ8NYX8onIMIUkWQtTnV1227HMgxPkoYWpI3VDI/FxFcVu
in7EBCjMW08L36lnzl24u6mFgCyQXaV5PBKzjtzkQRo4eOR2H6ImtuCpxCnWKzwS
7dV8E9zonS6kwyu9Uq+35eEtlPXNte9q8lsHQYGOly30PK45x3mebx8dSqa2+cj9
s9aacOtyIqjYt9+sM2Upjif6oyBo+xVkPVpTG/Lm1UZTS3odjTmYUFiW3bYE3g==
=0B4b
-----END PGP SIGNATURE-----

--CUfgB8w4ZwR/yMy5--