summaryrefslogtreecommitdiff
path: root/f9/d43c97e53364c6175292a60a31f88455c003d4
blob: cf5de395441c3e0012c1e4d3266a88974a860080 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
Return-Path: <kalle@rosenbaum.se>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 10AE9273
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 25 Jun 2015 21:02:28 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-qg0-f48.google.com (mail-qg0-f48.google.com
	[209.85.192.48])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id D731111F
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 25 Jun 2015 21:02:26 +0000 (UTC)
Received: by qgal13 with SMTP id l13so29197491qga.3
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 25 Jun 2015 14:02:26 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:mime-version:in-reply-to:references:date
	:message-id:subject:from:to:cc:content-type;
	bh=C8tjAW1UfwLK78subLBwDqnaaSYH3BvaaTc0e/Q97hw=;
	b=LOyESBNGe4PcknzmrDhWIHj7kvYI92s56dHB3wp+ORdfBevUg9TaHPwt1MLg5IuKp2
	AJZAKR+iZ1gn1d+SzKDuFHjm6CRaz/TMB8T+SgAfZwXQoh/pOozGDWJYVmS3iPwjebCH
	fUIcl+Lx/zdVoIQhAPcRPl2Fol+yaLxH+kmuQT/UiCIjQoz8EDfEfKj++bPQnVNlVICc
	TDHkL9oo/zsUvsKR805D1JaiE1QMyGkIyvUJIKxgtdJDoRdbYENEP6/18FTLSDI8PumK
	HnnRv9hFO9M27xovSmORfYHjNpM+bugh7qXofvXlLnfosPqJ+IoMUGWGugxHPI5VOJHM
	U/4w==
X-Gm-Message-State: ALoCoQmfSfbgoUaleMGO1hyMVAHRxg107HcOBDDEaCYOUls2bTHf6wQ5tQmOKZGa/aEWnG6TwBC8
MIME-Version: 1.0
X-Received: by 10.55.18.14 with SMTP id c14mr73840599qkh.51.1435266146020;
	Thu, 25 Jun 2015 14:02:26 -0700 (PDT)
Received: by 10.96.127.227 with HTTP; Thu, 25 Jun 2015 14:02:25 -0700 (PDT)
In-Reply-To: <871th3t1go.fsf@rustcorp.com.au>
References: <871th3t1go.fsf@rustcorp.com.au>
Date: Thu, 25 Jun 2015 23:02:25 +0200
Message-ID: <CAPswA9yUUUH7XM_wC=+QXnbEhp49hOyFAtfJztq+r2p0rNmeiQ@mail.gmail.com>
From: Kalle Rosenbaum <kalle@rosenbaum.se>
To: Rusty Russell <rusty@rustcorp.com.au>
Content-Type: multipart/alternative; boundary=001a11471190cac5b205195df07d
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,HTML_MESSAGE,
	RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Cc: bitcoin-dev@lists.linuxfoundation.org
Subject: Re: [bitcoin-dev] [RFC] IBLT block testing implementation
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Development Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 21:02:28 -0000

--001a11471190cac5b205195df07d
Content-Type: text/plain; charset=UTF-8

2015-06-23 7:53 GMT+02:00 Rusty Russell <rusty@rustcorp.com.au>:
> Hi all,
>
>         I've come up with a model for using IBLT to communicate blocks
> between peers.  The gory details can be found on github: it's a
> standalone C++ app for testing not integrated with bitcoin.
>
>         https://github.com/rustyrussell/bitcoin-iblt

Good to see that you're working on this. Really exciting!

I want to take the opportunity to link to my previous work on IBLTs, for
those that haven't seen it, where I investigate the behaviour of the IBLT
when changing different parameters, like cell count, hashFunctionCount etc:
https://github.com/kallerosenbaum/bitcoin-iblt/wiki

From glancing over your implementation, I see that you don't use a
keyHashSum, in fact you don't use a key at all, but only a
concatenatenation of (txid48, fragid, tx-chunk) as value. Here the
txid48+fragid functions as a kind of keyHashSum. I think this might be a
very good idea,

If you have a false positive with count == 1, then you would probably
detect it if fragid is outside reasonable limit from from base_fragid. Did
you implement your idea to remove all the count==1 fagments in ascending
order of (fragid-base_fragid)?

Anyhow, I think we should make some more comparable tests, just as you
proposed last year when I didn't reply, sorry... My code is a more straight
forward implementation of the IBLT paper (http://arxiv.org/pdf/1101.2245.pdf)
and encoding blocks is done pretty much as Gavin proposed in his gist. That
should function as a baseline so that we can validate that your
optimizations actually work. Please contact me directly if you are
interested in that, and we'll figure something out.

More comments/questions inline:

>
> Assumptions and details:
>
> 1. The base idea comes from Gavin's Block Propagation gist:
>         https://gist.github.com/gavinandresen/e20c3b5a1d4b97f79ac2
>
> 2. It relies on similarity in mempools, with some selection hints.  This
>    is designed to be as flexible as possible to make fewest assumptions
>    on tx selection policy.
>
> 3. The selection hints are: minimum fee-per-byte (fixed point) and
>    bitmaps of included-despite-that and rejected-despite-that.  The
>    former covers things like child-pays-for-parent and the priority
>    area.  The latter covers other cases like Eligius censoring "spam",
>    bitcoin version differences, etc.
>

There is a chance that the bit prefix of the added or removed tx is not
unique within the receiver's mempool. In that case the receiver can
probably just use the earliest matching transaction and hope for the best.
If not -> bad luck. Is that how you do it?

> 4. Cost to represent these exceptional added or excluded transactions is
>    (on average) log2(transactions in mempool) bits.

These exceptional tx *could* instead be encoded in the IBLT, just as if
they were unknown differences. Your bitmaps is probably a more compact
representation, but it's also more complex.

>
> 5. At Peiter Wuille's suggestion, it is designed to be reencoded between
>    nodes.  It seems fast enough for that, and neighboring nodes should
>    have most similar mempools.
>

What is the purpose of reencoding when relaying? Is that to improve the
failure probability as new tx may have arrived in the mempool of the
intermediary node?

> 6. It performs reasonably well on my 100 block sample in bitcoin-corpus
>    (chosen to include a mempool spike):
>
>    Average block range (bytes):                         7988-999829
>    Block size mean (bytes):                             401926
>
>    Minimal decodable BLT size range (bytes):            314-386361
>    Minimal decodable BLT size mean (bytes):             13265
>
> 7. Actual results will have to be worse than these minima, as peers will
>    have to oversize the IBLT to have reasonable chance of success.
>

Yes, I have made some analysis on this here:
https://github.com/kallerosenbaum/bitcoin-iblt/wiki/FailureProbability. We
use quite different data structures for encoding blocks in IBLT, but it
might apply to your implementation as well. An interesting result is that
the space saving percentage actually increases as blocks grow.

> 8. There is more work to do, and more investigation to be done, but I
>    don't expect more than a 25% reduction in this "ideal minimum"
>    result.
>
> Special thanks to Kalle Rosenbaum for helping investigate IBLTs with me
> last year.

Thank you too!

Regards,
Kalle

>
> Cheers,
> Rusty.
> PS. I work for Blockstream.  And I'm supposed to be working on
>     Lightning, not this.

--001a11471190cac5b205195df07d
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div>2015-06-23 7:53 GMT+02:00 Rusty Russell &lt;<a href=
=3D"mailto:rusty@rustcorp.com.au" target=3D"_blank">rusty@rustcorp.com.au</=
a>&gt;:<br>&gt; Hi all,<br>&gt;<br>&gt; =C2=A0 =C2=A0 =C2=A0 =C2=A0 I&#39;v=
e come up with a model for using IBLT to communicate blocks<br>&gt; between=
 peers.=C2=A0 The gory details can be found on github: it&#39;s a<br>&gt; s=
tandalone C++ app for testing not integrated with bitcoin.<br>&gt;<br>&gt; =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 <a href=3D"https://github.com/rustyrussell/bitc=
oin-iblt" target=3D"_blank">https://github.com/rustyrussell/bitcoin-iblt</a=
><br><br>Good to see that you&#39;re working on this. Really exciting!<br><=
br>I want to take the opportunity to link to my previous work on IBLTs, for=
 those that haven&#39;t seen it,
 where I investigate the behaviour of the IBLT when changing different=20
parameters, like cell count, hashFunctionCount etc: <a href=3D"https://gith=
ub.com/kallerosenbaum/bitcoin-iblt/wiki" target=3D"_blank">https://github.c=
om/kallerosenbaum/bitcoin-iblt/wiki</a><br><br></div><div>From glancing ove=
r your implementation, I see that you don&#39;t use a keyHashSum, in fact y=
ou don&#39;t use a key at all, but only a concatenatenation of (txid48, fra=
gid, tx-chunk) as value. Here the txid48+fragid functions as a kind of keyH=
ashSum. I think this might be a very good idea,<br><br>If you have a false =
positive with count =3D=3D 1, then you would probably detect it if fragid i=
s outside reasonable limit from from base_fragid. Did you implement your id=
ea to remove all the count=3D=3D1 fagments in ascending order of (fragid-ba=
se_fragid)?<br><br></div><div>Anyhow, I think we should make some more comp=
arable tests, just as you proposed last year when I didn&#39;t reply, sorry=
... My code is a more straight forward implementation of the IBLT paper (<a=
 href=3D"http://arxiv.org/pdf/1101.2245.pdf">http://arxiv.org/pdf/1101.2245=
.pdf</a>) and encoding blocks is done pretty much as Gavin proposed in his =
gist. That should function as a baseline so that we can validate that your =
optimizations actually work. Please contact me directly if you are interest=
ed in that, and we&#39;ll figure something out.<br></div><div><br></div><di=
v>More comments/questions inline:<br></div><div><br>&gt;<br>&gt; Assumption=
s and details:<br>&gt;<br>&gt; 1. The base idea comes from Gavin&#39;s Bloc=
k Propagation gist:<br>&gt; =C2=A0 =C2=A0 =C2=A0 =C2=A0 <a href=3D"https://=
gist.github.com/gavinandresen/e20c3b5a1d4b97f79ac2" target=3D"_blank">https=
://gist.github.com/gavinandresen/e20c3b5a1d4b97f79ac2</a><br>&gt;<br>&gt; 2=
. It relies on similarity in mempools, with some selection hints.=C2=A0 Thi=
s<br>&gt; =C2=A0 =C2=A0is designed to be as flexible as possible to make fe=
west assumptions<br>&gt; =C2=A0 =C2=A0on tx selection policy.<br>&gt;<br>&g=
t; 3. The selection hints are: minimum fee-per-byte (fixed point) and<br>&g=
t; =C2=A0 =C2=A0bitmaps of included-despite-that and rejected-despite-that.=
=C2=A0 The<br>&gt; =C2=A0 =C2=A0former covers things like child-pays-for-pa=
rent and the priority<br>&gt; =C2=A0 =C2=A0area.=C2=A0 The latter covers ot=
her cases like Eligius censoring &quot;spam&quot;,<br>&gt; =C2=A0 =C2=A0bit=
coin version differences, etc.<br>&gt;<br><br>There is a chance that the bi=
t prefix of the added or removed tx is not unique within the receiver&#39;s=
 mempool. In that case the receiver can probably just use the earliest matc=
hing transaction and hope for the best. If not -&gt; bad luck. Is that how =
you do it?<br><br>&gt; 4. Cost to represent these exceptional added or excl=
uded transactions is<br>&gt; =C2=A0 =C2=A0(on average) log2(transactions in=
 mempool) bits.<br><br>These exceptional tx *could* instead be encoded in t=
he IBLT, just as if they were unknown differences. Your bitmaps is probably=
 a more compact representation, but it&#39;s also more complex.<br><br>&gt;=
<br>&gt; 5. At Peiter Wuille&#39;s suggestion, it is designed to be reencod=
ed between<br>&gt; =C2=A0 =C2=A0nodes.=C2=A0 It seems fast enough for that,=
 and neighboring nodes should<br>&gt; =C2=A0 =C2=A0have most similar mempoo=
ls.<br>&gt;<br><br>What is the purpose of reencoding when relaying? Is that=
 to improve the failure probability as new tx may have arrived in the mempo=
ol of the intermediary node?<br><br>&gt; 6. It performs reasonably well on =
my 100 block sample in bitcoin-corpus<br>&gt; =C2=A0 =C2=A0(chosen to inclu=
de a mempool spike):<br>&gt;<br>&gt; =C2=A0 =C2=A0Average block range (byte=
s): =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 7988-999829<br>&gt; =C2=A0 =C2=A0Block size mean (bytes): =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 401926<br>&gt;<br>&gt; =C2=A0 =C2=A0Minimal decodable =
BLT size range (bytes): =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0314-386361=
<br>&gt; =C2=A0 =C2=A0Minimal decodable BLT size mean (bytes): =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 13265<br>&gt;<br>&gt; 7. Actual results wil=
l have to be worse than these minima, as peers will<br>&gt; =C2=A0 =C2=A0ha=
ve to oversize the IBLT to have reasonable chance of success.<br>&gt;<br><b=
r>Yes, I have made some analysis on this here: <a href=3D"https://github.co=
m/kallerosenbaum/bitcoin-iblt/wiki/FailureProbability" target=3D"_blank">ht=
tps://github.com/kallerosenbaum/bitcoin-iblt/wiki/FailureProbability</a>. W=
e use quite different data structures for encoding blocks in IBLT, but it m=
ight apply to your implementation as well. An interesting result is that th=
e space saving percentage actually increases as blocks grow.<br></div><div>=
<br>&gt; 8. There is more work to do, and more investigation to be done, bu=
t I<br>&gt; =C2=A0 =C2=A0don&#39;t expect more than a 25% reduction in this=
 &quot;ideal minimum&quot;<br>&gt; =C2=A0 =C2=A0result.<br>&gt;<br>&gt; Spe=
cial thanks to Kalle Rosenbaum for helping investigate IBLTs with me<br>&gt=
; last year.<br><br></div><div>Thank you too!<br><br></div><div>Regards,<br=
></div><div>Kalle<br></div><div><br></div><div>&gt;<br>&gt; Cheers,<br>&gt;=
 Rusty.<br>&gt; PS. I work for Blockstream.=C2=A0 And I&#39;m supposed to b=
e working on<br>&gt; =C2=A0 =C2=A0 Lightning, not this.</div></div>

--001a11471190cac5b205195df07d--