summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLucas Ontivero <lucasontivero@gmail.com>2018-05-30 00:10:04 -0300
committerbitcoindev <bitcoindev@gnusha.org>2018-05-30 03:10:07 +0000
commit3b46091427498634633eeb7a8800916e40a5038e (patch)
tree99199594e16d8330ab7bd822d48403d343039732
parentd37c05cc1233677cf6878a17efcd72450a04deb2 (diff)
downloadpi-bitcoindev-3b46091427498634633eeb7a8800916e40a5038e.tar.gz
pi-bitcoindev-3b46091427498634633eeb7a8800916e40a5038e.zip
Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets
-rw-r--r--52/c3323b399b086ad402486ab80f7523f09f1f5f213
1 files changed, 213 insertions, 0 deletions
diff --git a/52/c3323b399b086ad402486ab80f7523f09f1f5f b/52/c3323b399b086ad402486ab80f7523f09f1f5f
new file mode 100644
index 000000000..09dd20127
--- /dev/null
+++ b/52/c3323b399b086ad402486ab80f7523f09f1f5f
@@ -0,0 +1,213 @@
+Return-Path: <lucasontivero@gmail.com>
+Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
+ [172.17.192.35])
+ by mail.linuxfoundation.org (Postfix) with ESMTPS id 99043E4F
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 30 May 2018 03:10:07 +0000 (UTC)
+X-Greylist: whitelisted by SQLgrey-1.7.6
+Received: from mail-lf0-f51.google.com (mail-lf0-f51.google.com
+ [209.85.215.51])
+ by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 88FA3224
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 30 May 2018 03:10:06 +0000 (UTC)
+Received: by mail-lf0-f51.google.com with SMTP id n18-v6so1897167lfh.10
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Tue, 29 May 2018 20:10:06 -0700 (PDT)
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
+ h=mime-version:in-reply-to:references:from:date:message-id:subject:to;
+ bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=;
+ b=adfXPA8m2ZqSm5E0/6rGPxim5iao1hmZYWxAxgvWmCyCjChC2qRRojSHSMEGBadlbn
+ itwQ8MPmrczczmCECVhzjqOz6FB+t3yuv6py36HJ9PRpJZ5ZqatgoscJY5MW3ij6C4vz
+ v4fUkGcz2sTts3yHHfJvRz4dQj6xEpfK0KNiJ0XisRhsnctR5Z4U21yh4DC3PAzGyMKu
+ KgAqIzQ2H035Z3rl8g36+kO0qzgtNWLUTNpxNDlLNbeRJg8kPmQwdQgJ8UopX67n9RVV
+ UBVC6gI4riGwy05oq0qWnzVXHBvJNRW+XIRzmDw0QuFZczNwSXnkrJyqKMwpsTKNfHgV
+ Z8Jw==
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=1e100.net; s=20161025;
+ h=x-gm-message-state:mime-version:in-reply-to:references:from:date
+ :message-id:subject:to;
+ bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=;
+ b=a46WNKwhB+iP8LzObZjcXbTwmaBWBfBeI1ISWqctDOYWtTCUJDZI4mKlkuwGzod2Em
+ bVqyG3y8TLqN/AQKUOiM2CfOuBvHu57aKvK4k2xSCIJsPTCWlafUw0G6h+fEUVaPYTdH
+ P/jszD8qXPGfKl9AJ8nVh2ediuKWCe1+pNVycDtoKPjZEQ+JnkbVpVsM38ukunVEfc3e
+ TaCgDoIdTEabIYT5rLXstVAXgV8V0IZMUHt0kzTtDICmPslBmgdde3qGeoG2BGbOI2Hm
+ 7Lv/4uh8wVvshipi7riDQK6E0hlevWUPUZqgFuxpzH/2X+5aYl2oGywqQxWg+JaCh365
+ X2cA==
+X-Gm-Message-State: ALKqPwdpMQC8/xON3yC4zH8KdjvH2xBX3oNPiZ2lyBFjvhCimoCKY4pf
+ TWz3Az+eufSN40vuvgv8z26yt33wyGnBacwDTsY=
+X-Google-Smtp-Source: ADUXVKKU4P6/h2mgWIj3cMoO/x0vHXAgoW9C8Ee3HB/5TJEox9vNxb8gFC9B0/MTVz3eUSSun5rvxnVagI8blMpSoKc=
+X-Received: by 2002:a19:97cb:: with SMTP id
+ z194-v6mr498101lfd.17.1527649804876;
+ Tue, 29 May 2018 20:10:04 -0700 (PDT)
+MIME-Version: 1.0
+Received: by 2002:a19:3bd4:0:0:0:0:0 with HTTP; Tue, 29 May 2018 20:10:04
+ -0700 (PDT)
+In-Reply-To: <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com>
+References: <CAPg+sBgywj6PgijmSNkYYkKKQuek2g9-cSy6GJBpV+=gom7LfQ@mail.gmail.com>
+ <CAAS2fgS5cnNZSp7DJdDEdt1ainezfg7aoAbga2Py7gqfe267kw@mail.gmail.com>
+ <CAAS2fgQSS7R+PtmpcXJqjeXWnLa8S8O_1nFgdCYiLqRYbQM3QQ@mail.gmail.com>
+ <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com>
+From: Lucas Ontivero <lucasontivero@gmail.com>
+Date: Wed, 30 May 2018 00:10:04 -0300
+Message-ID: <CALHvQn0eSU4f_f-XgTefVxfGKTjHqMsQ+bQDOb9qxk_tfuPqqg@mail.gmail.com>
+To: Jim Posen <jim.posen@gmail.com>,
+ Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
+Content-Type: multipart/alternative; boundary="000000000000f5fefd056d63af9e"
+X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,
+ DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE,
+ RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
+X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
+ smtp1.linux-foundation.org
+X-Mailman-Approved-At: Wed, 30 May 2018 03:11:05 +0000
+Subject: Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets
+X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
+X-Mailman-Version: 2.1.12
+Precedence: list
+List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
+List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
+ <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
+List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
+List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
+List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
+List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
+ <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
+X-List-Received-Date: Wed, 30 May 2018 03:10:07 -0000
+
+--000000000000f5fefd056d63af9e
+Content-Type: text/plain; charset="UTF-8"
+
+Hi Jim,
+
+Yes please, could you share CSV? We are developing a Wallet that uses
+Golomb-Rice filters it would help a lot for determine the best P value
+depending on the estimated number of elements the client needs to watch.
+
+2018-05-29 19:38 GMT-03:00 Jim Posen via bitcoin-dev <
+bitcoin-dev@lists.linuxfoundation.org>:
+
+> This is a really cool finding, thanks Pieter!
+>
+> I did some more analysis on selecting a good P value to reduce total data
+> downloaded considering both filters themselves and blocks in the case of
+> false positive matches, using data from mainnet. The quantity it minimizes
+> is:
+>
+> filter_size(N, B) + block_size * false_positive_probability(C, N, B)
+>
+> N is the number of filter elements per block
+> B is the Golomb-Rice coding parameter
+> C is the number of filter elements watched by the client
+>
+> The main result is that:
+>
+> For C = 10, B = 13 is optimal
+> For C = 100, B = 16 is optimal
+> For C = 1,000, B = 20 is optimal
+> For C = 10,000, B = 23 is optimal
+>
+> So any value of B in the range 16 to 20 seems reasonable, with M = 1.4971
+> * 2^B for optimal compression, as Pieter derived. The selection of the
+> parameter depends on the target number of elements that a client may watch.
+>
+> I attached some of the results, and would be happy to share the CSV and
+> raw notebook if people are interested.
+>
+>
+> On Fri, May 25, 2018 at 2:14 PM Gregory Maxwell via bitcoin-dev <
+> bitcoin-dev@lists.linuxfoundation.org> wrote:
+>
+>> On Fri, May 25, 2018 at 6:42 PM, Gregory Maxwell <greg@xiph.org> wrote:
+>> > configuration is roughly right, then M=1569861 and rice parameter 19
+>> > should be used.
+>>
+>> That should have been M=784931 B=19 ... paste error.
+>> _______________________________________________
+>> bitcoin-dev mailing list
+>> bitcoin-dev@lists.linuxfoundation.org
+>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
+>>
+>
+> _______________________________________________
+> bitcoin-dev mailing list
+> bitcoin-dev@lists.linuxfoundation.org
+> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
+>
+>
+
+--000000000000f5fefd056d63af9e
+Content-Type: text/html; charset="UTF-8"
+Content-Transfer-Encoding: quoted-printable
+
+<div dir=3D"ltr"><div>Hi Jim,</div><div><br></div><div>Yes please, could yo=
+u share CSV? We are developing a Wallet that uses Golomb-Rice filters it wo=
+uld help a lot for determine the best P value depending on the estimated nu=
+mber of elements the client needs to watch. <br></div></div><div class=3D"g=
+mail_extra"><br><div class=3D"gmail_quote">2018-05-29 19:38 GMT-03:00 Jim P=
+osen via bitcoin-dev <span dir=3D"ltr">&lt;<a href=3D"mailto:bitcoin-dev@li=
+sts.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.linuxfoundatio=
+n.org</a>&gt;</span>:<br><blockquote class=3D"gmail_quote" style=3D"margin:=
+0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">Th=
+is is a really cool finding, thanks Pieter!<div><br></div><div>I did some m=
+ore analysis on selecting a good P value to reduce total data downloaded co=
+nsidering both filters themselves and blocks in the case of false positive =
+matches, using data from mainnet. The quantity it minimizes is:</div><div><=
+br></div><div>filter_size(N, B)=C2=A0+ block_size * false_positive_probabil=
+ity(C, N, B)</div><div><br></div><div>N is the number of filter elements pe=
+r block</div><div>B is the Golomb-Rice coding parameter</div><div>C is the =
+number of filter elements watched by the client</div><div><br></div><div>Th=
+e main result is that:</div><div><br></div><div>For C =3D 10, B =3D 13 is o=
+ptimal</div><div>For C =3D 100, B =3D 16 is optimal</div><div>For C =3D 1,0=
+00, B =3D 20 is optimal</div><div>For C =3D 10,000, B =3D 23 is optimal</di=
+v><div><br></div><div>So any value of B in the range 16 to 20 seems reasona=
+ble, with M =3D=C2=A0<span style=3D"color:rgb(34,34,34);font-family:sans-se=
+rif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-var=
+iant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;tex=
+t-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;backgr=
+ound-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-c=
+olor:initial;float:none;display:inline">1.4971 * 2^B for optimal compressio=
+n, as Pieter derived. The selection of the parameter depends on the target =
+number of elements that a client may watch.</span></div><div><span style=3D=
+"color:rgb(34,34,34);font-family:sans-serif;font-size:13px;font-style:norma=
+l;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;le=
+tter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;wh=
+ite-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-de=
+coration-style:initial;text-decoration-color:initial;float:none;display:inl=
+ine"><br></span></div><div><span style=3D"color:rgb(34,34,34);font-family:s=
+ans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;fo=
+nt-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:sta=
+rt;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;=
+background-color:rgb(255,255,255);text-decoration-style:initial;text-decora=
+tion-color:initial;float:none;display:inline">I attached some of the result=
+s, and would be happy to share the CSV and raw notebook if people are inter=
+ested.</span></div><div><br></div></div><div class=3D"HOEnZb"><div class=3D=
+"h5"><br><div class=3D"gmail_quote"><div dir=3D"ltr">On Fri, May 25, 2018 a=
+t 2:14 PM Gregory Maxwell via bitcoin-dev &lt;<a href=3D"mailto:bitcoin-dev=
+@lists.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.<wbr>linuxf=
+oundation.org</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" sty=
+le=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri=
+, May 25, 2018 at 6:42 PM, Gregory Maxwell &lt;<a href=3D"mailto:greg@xiph.=
+org" target=3D"_blank">greg@xiph.org</a>&gt; wrote:<br>
+&gt; configuration is roughly right, then M=3D1569861 and rice parameter 19=
+<br>
+&gt; should be used.<br>
+<br>
+That should have been M=3D784931 B=3D19=C2=A0 ... paste error.<br>
+______________________________<wbr>_________________<br>
+bitcoin-dev mailing list<br>
+<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">=
+bitcoin-dev@lists.<wbr>linuxfoundation.org</a><br>
+<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" =
+rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org=
+/mailman/listinfo/bitcoin-<wbr>dev</a><br>
+</blockquote></div>
+</div></div><br>______________________________<wbr>_________________<br>
+bitcoin-dev mailing list<br>
+<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org">bitcoin-dev@lists.=
+<wbr>linuxfoundation.org</a><br>
+<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" =
+rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org=
+/mailman/listinfo/bitcoin-<wbr>dev</a><br>
+<br></blockquote></div><br></div>
+
+--000000000000f5fefd056d63af9e--
+