summaryrefslogtreecommitdiff
path: root/52/c3323b399b086ad402486ab80f7523f09f1f5f
blob: 09dd20127185ff8967fba3ea81e76c0ad348b534 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
Return-Path: <lucasontivero@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 99043E4F
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Wed, 30 May 2018 03:10:07 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-lf0-f51.google.com (mail-lf0-f51.google.com
	[209.85.215.51])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 88FA3224
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Wed, 30 May 2018 03:10:06 +0000 (UTC)
Received: by mail-lf0-f51.google.com with SMTP id n18-v6so1897167lfh.10
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Tue, 29 May 2018 20:10:06 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to; 
	bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=;
	b=adfXPA8m2ZqSm5E0/6rGPxim5iao1hmZYWxAxgvWmCyCjChC2qRRojSHSMEGBadlbn
	itwQ8MPmrczczmCECVhzjqOz6FB+t3yuv6py36HJ9PRpJZ5ZqatgoscJY5MW3ij6C4vz
	v4fUkGcz2sTts3yHHfJvRz4dQj6xEpfK0KNiJ0XisRhsnctR5Z4U21yh4DC3PAzGyMKu
	KgAqIzQ2H035Z3rl8g36+kO0qzgtNWLUTNpxNDlLNbeRJg8kPmQwdQgJ8UopX67n9RVV
	UBVC6gI4riGwy05oq0qWnzVXHBvJNRW+XIRzmDw0QuFZczNwSXnkrJyqKMwpsTKNfHgV
	Z8Jw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to;
	bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=;
	b=a46WNKwhB+iP8LzObZjcXbTwmaBWBfBeI1ISWqctDOYWtTCUJDZI4mKlkuwGzod2Em
	bVqyG3y8TLqN/AQKUOiM2CfOuBvHu57aKvK4k2xSCIJsPTCWlafUw0G6h+fEUVaPYTdH
	P/jszD8qXPGfKl9AJ8nVh2ediuKWCe1+pNVycDtoKPjZEQ+JnkbVpVsM38ukunVEfc3e
	TaCgDoIdTEabIYT5rLXstVAXgV8V0IZMUHt0kzTtDICmPslBmgdde3qGeoG2BGbOI2Hm
	7Lv/4uh8wVvshipi7riDQK6E0hlevWUPUZqgFuxpzH/2X+5aYl2oGywqQxWg+JaCh365
	X2cA==
X-Gm-Message-State: ALKqPwdpMQC8/xON3yC4zH8KdjvH2xBX3oNPiZ2lyBFjvhCimoCKY4pf
	TWz3Az+eufSN40vuvgv8z26yt33wyGnBacwDTsY=
X-Google-Smtp-Source: ADUXVKKU4P6/h2mgWIj3cMoO/x0vHXAgoW9C8Ee3HB/5TJEox9vNxb8gFC9B0/MTVz3eUSSun5rvxnVagI8blMpSoKc=
X-Received: by 2002:a19:97cb:: with SMTP id
	z194-v6mr498101lfd.17.1527649804876; 
	Tue, 29 May 2018 20:10:04 -0700 (PDT)
MIME-Version: 1.0
Received: by 2002:a19:3bd4:0:0:0:0:0 with HTTP; Tue, 29 May 2018 20:10:04
	-0700 (PDT)
In-Reply-To: <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com>
References: <CAPg+sBgywj6PgijmSNkYYkKKQuek2g9-cSy6GJBpV+=gom7LfQ@mail.gmail.com>
	<CAAS2fgS5cnNZSp7DJdDEdt1ainezfg7aoAbga2Py7gqfe267kw@mail.gmail.com>
	<CAAS2fgQSS7R+PtmpcXJqjeXWnLa8S8O_1nFgdCYiLqRYbQM3QQ@mail.gmail.com>
	<CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com>
From: Lucas Ontivero <lucasontivero@gmail.com>
Date: Wed, 30 May 2018 00:10:04 -0300
Message-ID: <CALHvQn0eSU4f_f-XgTefVxfGKTjHqMsQ+bQDOb9qxk_tfuPqqg@mail.gmail.com>
To: Jim Posen <jim.posen@gmail.com>, 
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: multipart/alternative; boundary="000000000000f5fefd056d63af9e"
X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE,
	RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Wed, 30 May 2018 03:11:05 +0000
Subject: Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Wed, 30 May 2018 03:10:07 -0000

--000000000000f5fefd056d63af9e
Content-Type: text/plain; charset="UTF-8"

Hi Jim,

Yes please, could you share CSV? We are developing a Wallet that uses
Golomb-Rice filters it would help a lot for determine the best P value
depending on the estimated number of elements the client needs to watch.

2018-05-29 19:38 GMT-03:00 Jim Posen via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org>:

> This is a really cool finding, thanks Pieter!
>
> I did some more analysis on selecting a good P value to reduce total data
> downloaded considering both filters themselves and blocks in the case of
> false positive matches, using data from mainnet. The quantity it minimizes
> is:
>
> filter_size(N, B) + block_size * false_positive_probability(C, N, B)
>
> N is the number of filter elements per block
> B is the Golomb-Rice coding parameter
> C is the number of filter elements watched by the client
>
> The main result is that:
>
> For C = 10, B = 13 is optimal
> For C = 100, B = 16 is optimal
> For C = 1,000, B = 20 is optimal
> For C = 10,000, B = 23 is optimal
>
> So any value of B in the range 16 to 20 seems reasonable, with M = 1.4971
> * 2^B for optimal compression, as Pieter derived. The selection of the
> parameter depends on the target number of elements that a client may watch.
>
> I attached some of the results, and would be happy to share the CSV and
> raw notebook if people are interested.
>
>
> On Fri, May 25, 2018 at 2:14 PM Gregory Maxwell via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>> On Fri, May 25, 2018 at 6:42 PM, Gregory Maxwell <greg@xiph.org> wrote:
>> > configuration is roughly right, then M=1569861 and rice parameter 19
>> > should be used.
>>
>> That should have been M=784931 B=19  ... paste error.
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>

--000000000000f5fefd056d63af9e
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div>Hi Jim,</div><div><br></div><div>Yes please, could yo=
u share CSV? We are developing a Wallet that uses Golomb-Rice filters it wo=
uld help a lot for determine the best P value depending on the estimated nu=
mber of elements the client needs to watch. <br></div></div><div class=3D"g=
mail_extra"><br><div class=3D"gmail_quote">2018-05-29 19:38 GMT-03:00 Jim P=
osen via bitcoin-dev <span dir=3D"ltr">&lt;<a href=3D"mailto:bitcoin-dev@li=
sts.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.linuxfoundatio=
n.org</a>&gt;</span>:<br><blockquote class=3D"gmail_quote" style=3D"margin:=
0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">Th=
is is a really cool finding, thanks Pieter!<div><br></div><div>I did some m=
ore analysis on selecting a good P value to reduce total data downloaded co=
nsidering both filters themselves and blocks in the case of false positive =
matches, using data from mainnet. The quantity it minimizes is:</div><div><=
br></div><div>filter_size(N, B)=C2=A0+ block_size * false_positive_probabil=
ity(C, N, B)</div><div><br></div><div>N is the number of filter elements pe=
r block</div><div>B is the Golomb-Rice coding parameter</div><div>C is the =
number of filter elements watched by the client</div><div><br></div><div>Th=
e main result is that:</div><div><br></div><div>For C =3D 10, B =3D 13 is o=
ptimal</div><div>For C =3D 100, B =3D 16 is optimal</div><div>For C =3D 1,0=
00, B =3D 20 is optimal</div><div>For C =3D 10,000, B =3D 23 is optimal</di=
v><div><br></div><div>So any value of B in the range 16 to 20 seems reasona=
ble, with M =3D=C2=A0<span style=3D"color:rgb(34,34,34);font-family:sans-se=
rif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-var=
iant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;tex=
t-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;backgr=
ound-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-c=
olor:initial;float:none;display:inline">1.4971 * 2^B for optimal compressio=
n, as Pieter derived. The selection of the parameter depends on the target =
number of elements that a client may watch.</span></div><div><span style=3D=
"color:rgb(34,34,34);font-family:sans-serif;font-size:13px;font-style:norma=
l;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;le=
tter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;wh=
ite-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-de=
coration-style:initial;text-decoration-color:initial;float:none;display:inl=
ine"><br></span></div><div><span style=3D"color:rgb(34,34,34);font-family:s=
ans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;fo=
nt-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:sta=
rt;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;=
background-color:rgb(255,255,255);text-decoration-style:initial;text-decora=
tion-color:initial;float:none;display:inline">I attached some of the result=
s, and would be happy to share the CSV and raw notebook if people are inter=
ested.</span></div><div><br></div></div><div class=3D"HOEnZb"><div class=3D=
"h5"><br><div class=3D"gmail_quote"><div dir=3D"ltr">On Fri, May 25, 2018 a=
t 2:14 PM Gregory Maxwell via bitcoin-dev &lt;<a href=3D"mailto:bitcoin-dev=
@lists.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.<wbr>linuxf=
oundation.org</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" sty=
le=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri=
, May 25, 2018 at 6:42 PM, Gregory Maxwell &lt;<a href=3D"mailto:greg@xiph.=
org" target=3D"_blank">greg@xiph.org</a>&gt; wrote:<br>
&gt; configuration is roughly right, then M=3D1569861 and rice parameter 19=
<br>
&gt; should be used.<br>
<br>
That should have been M=3D784931 B=3D19=C2=A0 ... paste error.<br>
______________________________<wbr>_________________<br>
bitcoin-dev mailing list<br>
<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">=
bitcoin-dev@lists.<wbr>linuxfoundation.org</a><br>
<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" =
rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org=
/mailman/listinfo/bitcoin-<wbr>dev</a><br>
</blockquote></div>
</div></div><br>______________________________<wbr>_________________<br>
bitcoin-dev mailing list<br>
<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org">bitcoin-dev@lists.=
<wbr>linuxfoundation.org</a><br>
<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" =
rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org=
/mailman/listinfo/bitcoin-<wbr>dev</a><br>
<br></blockquote></div><br></div>

--000000000000f5fefd056d63af9e--