From: Mike Hearn <mike@plan99.net>
To: Peter Todd <pete@petertodd.org>
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Date: Mon, 19 Aug 2013 11:29:55 +0200
Subject: Re: [Bitcoin-development] Bloom io attack effectiveness
Message-ID: <CANEZrP1hdnEE9HsKoSctbQeDxpXce71Y96C-6Jf+xZN6ZMhASg@mail.gmail.com>
In-Reply-To: <20130819001357.GA4281@savin>

On Mon, Aug 19, 2013 at 2:13 AM, Peter Todd <pete@petertodd.org> wrote:

> In any case given that SPV peers don't contribute back to the network
> they should obviously be heavily deprioritized and served only with
> whatever resources a node has spare.


Well, I'm glad we're making progress towards this kind of model :)

If I had to write a scoring function for node importance, I'd start by
making nodes I connected to more important than nodes that connected to me.
That should prevent the kind of attacks you're talking about. You can then
score within those subsets with greater subtlety, like using how long the
connection has been active (or extending that with signed timestamps).

This doesn't have any in-built bias against SPV nodes, which would probably
be very hard to implement technically anyway. But it encodes the intuitive
notion that nodes I selected myself are less likely to be DoS attackers
than nodes that connected to me.

But the trick is to implement the prioritisation code. The usual way to do
this is to have a thread pool that pops requests off a queue. You can
either have multiple queues for different priority bands, or code that
locks the queue and re-orders it when something new is added. I tend to
find the multiple queues approach simpler, especially since it's simpler to
export statistics about the queues via RPC, making it easy to understand
what's going on under the hood.

So IMHO a patch to address I/O exhaustion should look something like this:

   1. Add a thread pool of 2-3 threads (to give the kernel room to overlap
   IO) which take in CBlock load requests and then do the load/parse/filter in
   the background.

   2. Each thread starts by blocking on a counting semaphore which
   represents the total number of requests.

   3. The network thread message loop is adjusted so it can receive some
   kind of futures/callbacks/closure object (I guess Boost provides this,
   alternatively we could switch to using C++11). The closures should also
   carry the score of the node they were created for (note: the score, not
   a CNode*, since holding a CNode* complicates memory management).

   4. At the start of the network loop a thread-local (or global) variable
   is set that contains the node's current score, which is just an N-of-M
   score where M is the total number of connected nodes and N is the node's
   ranked importance. Any code that needs to prioritise nodes against each
   other can then just check that variable whilst doing work. The network
   loop looks at which file descriptors are select()able and their scores,
   and which closures are pending execution and their scores, then decides
   whether to handle new network data or run a closure. If the scores are
   tied, closures take priority to reduce memory pressure and lower latency.

   5. Handling of "getdata" then ends up calling a function that requests a
   load of a block from disk, and runs a closure when it's finished. The
   closure inherits the node's current score, of course, so when the block
   load is completed, execution of the rest of the getdata handling takes
   priority over handling new traffic from network nodes. When the closure
   executes, it writes the loaded/filtered data out over the network socket
   and deletes
The function that takes a CBlockIndex and yields a future<CBlock> or
closure or whatever would internally lock the job queue(s), add the new
task and then do a stable sort of the queue using the scoring function,
which in this case would simply use the node score as the job score.

It's a fair amount of work, but should ensure that "good" nodes outcompete
"bad" nodes for disk IO. Any other disk IO operations can be done in the
same way. Note that the bulk of LevelDB write work is already handled on a
background thread. The foreground thread only writes a log entry to disk
and updates some in-memory data structures.
