diff options
author | Lucas Ontivero <lucasontivero@gmail.com> | 2018-05-30 00:10:04 -0300 |
---|---|---|
committer | bitcoindev <bitcoindev@gnusha.org> | 2018-05-30 03:10:07 +0000 |
commit | 3b46091427498634633eeb7a8800916e40a5038e (patch) | |
tree | 99199594e16d8330ab7bd822d48403d343039732 | |
parent | d37c05cc1233677cf6878a17efcd72450a04deb2 (diff) | |
download | pi-bitcoindev-3b46091427498634633eeb7a8800916e40a5038e.tar.gz pi-bitcoindev-3b46091427498634633eeb7a8800916e40a5038e.zip |
Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets
-rw-r--r-- | 52/c3323b399b086ad402486ab80f7523f09f1f5f | 213 |
1 files changed, 213 insertions, 0 deletions
diff --git a/52/c3323b399b086ad402486ab80f7523f09f1f5f b/52/c3323b399b086ad402486ab80f7523f09f1f5f new file mode 100644 index 000000000..09dd20127 --- /dev/null +++ b/52/c3323b399b086ad402486ab80f7523f09f1f5f @@ -0,0 +1,213 @@ +Return-Path: <lucasontivero@gmail.com> +Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org + [172.17.192.35]) + by mail.linuxfoundation.org (Postfix) with ESMTPS id 99043E4F + for <bitcoin-dev@lists.linuxfoundation.org>; + Wed, 30 May 2018 03:10:07 +0000 (UTC) +X-Greylist: whitelisted by SQLgrey-1.7.6 +Received: from mail-lf0-f51.google.com (mail-lf0-f51.google.com + [209.85.215.51]) + by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 88FA3224 + for <bitcoin-dev@lists.linuxfoundation.org>; + Wed, 30 May 2018 03:10:06 +0000 (UTC) +Received: by mail-lf0-f51.google.com with SMTP id n18-v6so1897167lfh.10 + for <bitcoin-dev@lists.linuxfoundation.org>; + Tue, 29 May 2018 20:10:06 -0700 (PDT) +DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; + h=mime-version:in-reply-to:references:from:date:message-id:subject:to; + bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=; + b=adfXPA8m2ZqSm5E0/6rGPxim5iao1hmZYWxAxgvWmCyCjChC2qRRojSHSMEGBadlbn + itwQ8MPmrczczmCECVhzjqOz6FB+t3yuv6py36HJ9PRpJZ5ZqatgoscJY5MW3ij6C4vz + v4fUkGcz2sTts3yHHfJvRz4dQj6xEpfK0KNiJ0XisRhsnctR5Z4U21yh4DC3PAzGyMKu + KgAqIzQ2H035Z3rl8g36+kO0qzgtNWLUTNpxNDlLNbeRJg8kPmQwdQgJ8UopX67n9RVV + UBVC6gI4riGwy05oq0qWnzVXHBvJNRW+XIRzmDw0QuFZczNwSXnkrJyqKMwpsTKNfHgV + Z8Jw== +X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; + d=1e100.net; s=20161025; + h=x-gm-message-state:mime-version:in-reply-to:references:from:date + :message-id:subject:to; + bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=; + b=a46WNKwhB+iP8LzObZjcXbTwmaBWBfBeI1ISWqctDOYWtTCUJDZI4mKlkuwGzod2Em + bVqyG3y8TLqN/AQKUOiM2CfOuBvHu57aKvK4k2xSCIJsPTCWlafUw0G6h+fEUVaPYTdH + P/jszD8qXPGfKl9AJ8nVh2ediuKWCe1+pNVycDtoKPjZEQ+JnkbVpVsM38ukunVEfc3e + TaCgDoIdTEabIYT5rLXstVAXgV8V0IZMUHt0kzTtDICmPslBmgdde3qGeoG2BGbOI2Hm + 7Lv/4uh8wVvshipi7riDQK6E0hlevWUPUZqgFuxpzH/2X+5aYl2oGywqQxWg+JaCh365 + X2cA== +X-Gm-Message-State: ALKqPwdpMQC8/xON3yC4zH8KdjvH2xBX3oNPiZ2lyBFjvhCimoCKY4pf + TWz3Az+eufSN40vuvgv8z26yt33wyGnBacwDTsY= +X-Google-Smtp-Source: ADUXVKKU4P6/h2mgWIj3cMoO/x0vHXAgoW9C8Ee3HB/5TJEox9vNxb8gFC9B0/MTVz3eUSSun5rvxnVagI8blMpSoKc= +X-Received: by 2002:a19:97cb:: with SMTP id + z194-v6mr498101lfd.17.1527649804876; + Tue, 29 May 2018 20:10:04 -0700 (PDT) +MIME-Version: 1.0 +Received: by 2002:a19:3bd4:0:0:0:0:0 with HTTP; Tue, 29 May 2018 20:10:04 + -0700 (PDT) +In-Reply-To: <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com> +References: <CAPg+sBgywj6PgijmSNkYYkKKQuek2g9-cSy6GJBpV+=gom7LfQ@mail.gmail.com> + <CAAS2fgS5cnNZSp7DJdDEdt1ainezfg7aoAbga2Py7gqfe267kw@mail.gmail.com> + <CAAS2fgQSS7R+PtmpcXJqjeXWnLa8S8O_1nFgdCYiLqRYbQM3QQ@mail.gmail.com> + <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com> +From: Lucas Ontivero <lucasontivero@gmail.com> +Date: Wed, 30 May 2018 00:10:04 -0300 +Message-ID: <CALHvQn0eSU4f_f-XgTefVxfGKTjHqMsQ+bQDOb9qxk_tfuPqqg@mail.gmail.com> +To: Jim Posen <jim.posen@gmail.com>, + Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org> +Content-Type: multipart/alternative; boundary="000000000000f5fefd056d63af9e" +X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, + DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, + RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 +X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on + smtp1.linux-foundation.org +X-Mailman-Approved-At: Wed, 30 May 2018 03:11:05 +0000 +Subject: Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets +X-BeenThere: bitcoin-dev@lists.linuxfoundation.org +X-Mailman-Version: 2.1.12 +Precedence: list +List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org> +List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, + <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe> +List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/> +List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org> +List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help> +List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, + <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe> +X-List-Received-Date: Wed, 30 May 2018 03:10:07 -0000 + +--000000000000f5fefd056d63af9e +Content-Type: text/plain; charset="UTF-8" + +Hi Jim, + +Yes please, could you share CSV? We are developing a Wallet that uses +Golomb-Rice filters it would help a lot for determine the best P value +depending on the estimated number of elements the client needs to watch. + +2018-05-29 19:38 GMT-03:00 Jim Posen via bitcoin-dev < +bitcoin-dev@lists.linuxfoundation.org>: + +> This is a really cool finding, thanks Pieter! +> +> I did some more analysis on selecting a good P value to reduce total data +> downloaded considering both filters themselves and blocks in the case of +> false positive matches, using data from mainnet. The quantity it minimizes +> is: +> +> filter_size(N, B) + block_size * false_positive_probability(C, N, B) +> +> N is the number of filter elements per block +> B is the Golomb-Rice coding parameter +> C is the number of filter elements watched by the client +> +> The main result is that: +> +> For C = 10, B = 13 is optimal +> For C = 100, B = 16 is optimal +> For C = 1,000, B = 20 is optimal +> For C = 10,000, B = 23 is optimal +> +> So any value of B in the range 16 to 20 seems reasonable, with M = 1.4971 +> * 2^B for optimal compression, as Pieter derived. The selection of the +> parameter depends on the target number of elements that a client may watch. +> +> I attached some of the results, and would be happy to share the CSV and +> raw notebook if people are interested. +> +> +> On Fri, May 25, 2018 at 2:14 PM Gregory Maxwell via bitcoin-dev < +> bitcoin-dev@lists.linuxfoundation.org> wrote: +> +>> On Fri, May 25, 2018 at 6:42 PM, Gregory Maxwell <greg@xiph.org> wrote: +>> > configuration is roughly right, then M=1569861 and rice parameter 19 +>> > should be used. +>> +>> That should have been M=784931 B=19 ... paste error. +>> _______________________________________________ +>> bitcoin-dev mailing list +>> bitcoin-dev@lists.linuxfoundation.org +>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev +>> +> +> _______________________________________________ +> bitcoin-dev mailing list +> bitcoin-dev@lists.linuxfoundation.org +> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev +> +> + +--000000000000f5fefd056d63af9e +Content-Type: text/html; charset="UTF-8" +Content-Transfer-Encoding: quoted-printable + +<div dir=3D"ltr"><div>Hi Jim,</div><div><br></div><div>Yes please, could yo= +u share CSV? We are developing a Wallet that uses Golomb-Rice filters it wo= +uld help a lot for determine the best P value depending on the estimated nu= +mber of elements the client needs to watch. <br></div></div><div class=3D"g= +mail_extra"><br><div class=3D"gmail_quote">2018-05-29 19:38 GMT-03:00 Jim P= +osen via bitcoin-dev <span dir=3D"ltr"><<a href=3D"mailto:bitcoin-dev@li= +sts.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.linuxfoundatio= +n.org</a>></span>:<br><blockquote class=3D"gmail_quote" style=3D"margin:= +0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">Th= +is is a really cool finding, thanks Pieter!<div><br></div><div>I did some m= +ore analysis on selecting a good P value to reduce total data downloaded co= +nsidering both filters themselves and blocks in the case of false positive = +matches, using data from mainnet. The quantity it minimizes is:</div><div><= +br></div><div>filter_size(N, B)=C2=A0+ block_size * false_positive_probabil= +ity(C, N, B)</div><div><br></div><div>N is the number of filter elements pe= +r block</div><div>B is the Golomb-Rice coding parameter</div><div>C is the = +number of filter elements watched by the client</div><div><br></div><div>Th= +e main result is that:</div><div><br></div><div>For C =3D 10, B =3D 13 is o= +ptimal</div><div>For C =3D 100, B =3D 16 is optimal</div><div>For C =3D 1,0= +00, B =3D 20 is optimal</div><div>For C =3D 10,000, B =3D 23 is optimal</di= +v><div><br></div><div>So any value of B in the range 16 to 20 seems reasona= +ble, with M =3D=C2=A0<span style=3D"color:rgb(34,34,34);font-family:sans-se= +rif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-var= +iant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;tex= +t-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;backgr= +ound-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-c= +olor:initial;float:none;display:inline">1.4971 * 2^B for optimal compressio= +n, as Pieter derived. The selection of the parameter depends on the target = +number of elements that a client may watch.</span></div><div><span style=3D= +"color:rgb(34,34,34);font-family:sans-serif;font-size:13px;font-style:norma= +l;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;le= +tter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;wh= +ite-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-de= +coration-style:initial;text-decoration-color:initial;float:none;display:inl= +ine"><br></span></div><div><span style=3D"color:rgb(34,34,34);font-family:s= +ans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;fo= +nt-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:sta= +rt;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;= +background-color:rgb(255,255,255);text-decoration-style:initial;text-decora= +tion-color:initial;float:none;display:inline">I attached some of the result= +s, and would be happy to share the CSV and raw notebook if people are inter= +ested.</span></div><div><br></div></div><div class=3D"HOEnZb"><div class=3D= +"h5"><br><div class=3D"gmail_quote"><div dir=3D"ltr">On Fri, May 25, 2018 a= +t 2:14 PM Gregory Maxwell via bitcoin-dev <<a href=3D"mailto:bitcoin-dev= +@lists.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.<wbr>linuxf= +oundation.org</a>> wrote:<br></div><blockquote class=3D"gmail_quote" sty= +le=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri= +, May 25, 2018 at 6:42 PM, Gregory Maxwell <<a href=3D"mailto:greg@xiph.= +org" target=3D"_blank">greg@xiph.org</a>> wrote:<br> +> configuration is roughly right, then M=3D1569861 and rice parameter 19= +<br> +> should be used.<br> +<br> +That should have been M=3D784931 B=3D19=C2=A0 ... paste error.<br> +______________________________<wbr>_________________<br> +bitcoin-dev mailing list<br> +<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">= +bitcoin-dev@lists.<wbr>linuxfoundation.org</a><br> +<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" = +rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org= +/mailman/listinfo/bitcoin-<wbr>dev</a><br> +</blockquote></div> +</div></div><br>______________________________<wbr>_________________<br> +bitcoin-dev mailing list<br> +<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org">bitcoin-dev@lists.= +<wbr>linuxfoundation.org</a><br> +<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" = +rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org= +/mailman/listinfo/bitcoin-<wbr>dev</a><br> +<br></blockquote></div><br></div> + +--000000000000f5fefd056d63af9e-- + |