Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 99043E4F for ; Wed, 30 May 2018 03:10:07 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-lf0-f51.google.com (mail-lf0-f51.google.com [209.85.215.51]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 88FA3224 for ; Wed, 30 May 2018 03:10:06 +0000 (UTC) Received: by mail-lf0-f51.google.com with SMTP id n18-v6so1897167lfh.10 for ; Tue, 29 May 2018 20:10:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=; b=adfXPA8m2ZqSm5E0/6rGPxim5iao1hmZYWxAxgvWmCyCjChC2qRRojSHSMEGBadlbn itwQ8MPmrczczmCECVhzjqOz6FB+t3yuv6py36HJ9PRpJZ5ZqatgoscJY5MW3ij6C4vz v4fUkGcz2sTts3yHHfJvRz4dQj6xEpfK0KNiJ0XisRhsnctR5Z4U21yh4DC3PAzGyMKu KgAqIzQ2H035Z3rl8g36+kO0qzgtNWLUTNpxNDlLNbeRJg8kPmQwdQgJ8UopX67n9RVV UBVC6gI4riGwy05oq0qWnzVXHBvJNRW+XIRzmDw0QuFZczNwSXnkrJyqKMwpsTKNfHgV Z8Jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=; b=a46WNKwhB+iP8LzObZjcXbTwmaBWBfBeI1ISWqctDOYWtTCUJDZI4mKlkuwGzod2Em bVqyG3y8TLqN/AQKUOiM2CfOuBvHu57aKvK4k2xSCIJsPTCWlafUw0G6h+fEUVaPYTdH P/jszD8qXPGfKl9AJ8nVh2ediuKWCe1+pNVycDtoKPjZEQ+JnkbVpVsM38ukunVEfc3e TaCgDoIdTEabIYT5rLXstVAXgV8V0IZMUHt0kzTtDICmPslBmgdde3qGeoG2BGbOI2Hm 7Lv/4uh8wVvshipi7riDQK6E0hlevWUPUZqgFuxpzH/2X+5aYl2oGywqQxWg+JaCh365 X2cA== X-Gm-Message-State: ALKqPwdpMQC8/xON3yC4zH8KdjvH2xBX3oNPiZ2lyBFjvhCimoCKY4pf TWz3Az+eufSN40vuvgv8z26yt33wyGnBacwDTsY= X-Google-Smtp-Source: ADUXVKKU4P6/h2mgWIj3cMoO/x0vHXAgoW9C8Ee3HB/5TJEox9vNxb8gFC9B0/MTVz3eUSSun5rvxnVagI8blMpSoKc= X-Received: by 2002:a19:97cb:: with SMTP id z194-v6mr498101lfd.17.1527649804876; Tue, 29 May 2018 20:10:04 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a19:3bd4:0:0:0:0:0 with HTTP; Tue, 29 May 2018 20:10:04 -0700 (PDT) In-Reply-To: References: From: Lucas Ontivero Date: Wed, 30 May 2018 00:10:04 -0300 Message-ID: To: Jim Posen , Bitcoin Protocol Discussion Content-Type: multipart/alternative; boundary="000000000000f5fefd056d63af9e" X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org X-Mailman-Approved-At: Wed, 30 May 2018 03:11:05 +0000 Subject: Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 May 2018 03:10:07 -0000 --000000000000f5fefd056d63af9e Content-Type: text/plain; charset="UTF-8" Hi Jim, Yes please, could you share CSV? We are developing a Wallet that uses Golomb-Rice filters it would help a lot for determine the best P value depending on the estimated number of elements the client needs to watch. 2018-05-29 19:38 GMT-03:00 Jim Posen via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org>: > This is a really cool finding, thanks Pieter! > > I did some more analysis on selecting a good P value to reduce total data > downloaded considering both filters themselves and blocks in the case of > false positive matches, using data from mainnet. The quantity it minimizes > is: > > filter_size(N, B) + block_size * false_positive_probability(C, N, B) > > N is the number of filter elements per block > B is the Golomb-Rice coding parameter > C is the number of filter elements watched by the client > > The main result is that: > > For C = 10, B = 13 is optimal > For C = 100, B = 16 is optimal > For C = 1,000, B = 20 is optimal > For C = 10,000, B = 23 is optimal > > So any value of B in the range 16 to 20 seems reasonable, with M = 1.4971 > * 2^B for optimal compression, as Pieter derived. The selection of the > parameter depends on the target number of elements that a client may watch. > > I attached some of the results, and would be happy to share the CSV and > raw notebook if people are interested. > > > On Fri, May 25, 2018 at 2:14 PM Gregory Maxwell via bitcoin-dev < > bitcoin-dev@lists.linuxfoundation.org> wrote: > >> On Fri, May 25, 2018 at 6:42 PM, Gregory Maxwell wrote: >> > configuration is roughly right, then M=1569861 and rice parameter 19 >> > should be used. >> >> That should have been M=784931 B=19 ... paste error. >> _______________________________________________ >> bitcoin-dev mailing list >> bitcoin-dev@lists.linuxfoundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >> > > _______________________________________________ > bitcoin-dev mailing list > bitcoin-dev@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev > > --000000000000f5fefd056d63af9e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Jim,

Yes please, could yo= u share CSV? We are developing a Wallet that uses Golomb-Rice filters it wo= uld help a lot for determine the best P value depending on the estimated nu= mber of elements the client needs to watch.

2018-05-29 19:38 GMT-03:00 Jim P= osen via bitcoin-dev <bitcoin-dev@lists.linuxfoundatio= n.org>:
Th= is is a really cool finding, thanks Pieter!

I did some m= ore analysis on selecting a good P value to reduce total data downloaded co= nsidering both filters themselves and blocks in the case of false positive = matches, using data from mainnet. The quantity it minimizes is:
<= br>
filter_size(N, B)=C2=A0+ block_size * false_positive_probabil= ity(C, N, B)

N is the number of filter elements pe= r block
B is the Golomb-Rice coding parameter
C is the = number of filter elements watched by the client

Th= e main result is that:

For C =3D 10, B =3D 13 is o= ptimal
For C =3D 100, B =3D 16 is optimal
For C =3D 1,0= 00, B =3D 20 is optimal
For C =3D 10,000, B =3D 23 is optimal

So any value of B in the range 16 to 20 seems reasona= ble, with M =3D=C2=A01.4971 * 2^B for optimal compressio= n, as Pieter derived. The selection of the parameter depends on the target = number of elements that a client may watch.

I attached some of the result= s, and would be happy to share the CSV and raw notebook if people are inter= ested.


On Fri, May 25, 2018 a= t 2:14 PM Gregory Maxwell via bitcoin-dev <bitcoin-dev@lists.linuxf= oundation.org> wrote:
On Fri= , May 25, 2018 at 6:42 PM, Gregory Maxwell <greg@xiph.org> wrote:
> configuration is roughly right, then M=3D1569861 and rice parameter 19=
> should be used.

That should have been M=3D784931 B=3D19=C2=A0 ... paste error.
_______________________________________________
bitcoin-dev mailing list
= bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org= /mailman/listinfo/bitcoin-dev

_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.= linuxfoundation.org
https://lists.linuxfoundation.org= /mailman/listinfo/bitcoin-dev


--000000000000f5fefd056d63af9e--