MIME-Version: 1.0
References: <d43c6082-1b2c-c95b-5144-99ad0021ea6c@mattcorallo.com>
	<CALJw2w7+VUYtMtdTexW6iE3mc0DsS9DME_ynP8skg_+-bv_tPA@mail.gmail.com>
	<CADabwBDG2_2syU0AnjbEfqTL=5ERRQkL6NOyVN7gAyJTAaf7UA@mail.gmail.com>
	<CADZtCSjsZ=_C+cFUXbAim=56QG4p0UdE4HEo9ZKJtNgEH_DqhQ@mail.gmail.com>
	<CALJw2w5FY3EoqtA4HTJr8CCwZ-Dyf=XVbO8Hd=TDjxBEgwLULQ@mail.gmail.com>
In-Reply-To: <CALJw2w5FY3EoqtA4HTJr8CCwZ-Dyf=XVbO8Hd=TDjxBEgwLULQ@mail.gmail.com>
From: Jim Posen <jim.posen@gmail.com>
Date: Tue, 5 Jun 2018 10:22:04 -0700
Message-ID: <CADZtCShgQ-Jho3kH2Gy+CCiCeNX01UF0oo5AGKMaRw=SaOfwmw@mail.gmail.com>
To: Karl Johan Alm <karljohan-alm@garage.co.jp>
Content-Type: multipart/alternative; boundary="000000000000ac700f056de84aa9"
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
Precedence: list

--000000000000ac700f056de84aa9
Content-Type: text/plain; charset="UTF-8"

>
> I don't understand this comment. The bandwidth gains are not from
> address reuse, they are from the observed property that false
> positives are independent between two filters. I.e. clients that
> connect once a day will probably download 2-3 filters at most, if they
> had nothing relevant in the last ~144 blocks.
>

Your multi-layer digest proposal (https://bc-2.jp/bfd-profile.pdf) uses a
different type of filter, which looks more like a compressed Bloom filter
if I understand it correctly. Appendix A there shows the FP rate increasing
with the number of elements.

With Golomb-Coded Sets, the filter size increases linearly in the number
of elements for a fixed FP rate. Currently we are targeting a ~1/2^20 rate
(actually 1/784931 now), so filter sizes are ~20 bits * N for N elements.
With a 1-layer digest covering, let's say, 16 blocks, you could drop the
digest filters and the block filters each to ~10 bits per element (an FP
rate of ~1/2^10 each), I think, and get the same FP rate for a given block
by your argument of independence. But then the digest is only half the size
of the 16 combined filters at ~20 bits per element, and there's a high
probability of downloading the other half (the block filters) anyway. So
unless there is greater duplication of elements in the digest filters, it's
not clear to me that there are great bandwidth savings. But maybe there
are. Even so, I think we should just ship the block filters and consider
multi-layer digests later.
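
To make the arithmetic above concrete, here is a rough back-of-the-envelope
sketch in Python. It only uses the "~b bits * N" size approximation from this
thread and assumes no elements are duplicated across the 16 blocks. The
function names, the 5000 elements per block, and the p_match parameter are
made up for illustration and are not part of BIP 158 or the digest proposal.

```python
# Back-of-the-envelope sizes only. Every name and number below is an
# illustrative assumption, not something defined by BIP 158.

def gcs_bits(n_elements: int, bits_per_element: int) -> int:
    """Approximate filter size using the '~b bits * N' rule of thumb."""
    return n_elements * bits_per_element


def combined_fp_rate(*bits_per_element: int) -> float:
    """FP rate when an item must match several independent filters,
    each with an FP rate of roughly 2^-b."""
    rate = 1.0
    for b in bits_per_element:
        rate *= 2.0 ** -b
    return rate


BLOCKS = 16
ELEMENTS_PER_BLOCK = 5000  # assumed workload, chosen arbitrarily

# Status quo: 16 block filters at ~20 bits per element.
flat_bits = BLOCKS * gcs_bits(ELEMENTS_PER_BLOCK, 20)

# 1-layer digest: one digest over all 16 blocks plus 16 block filters,
# each at ~10 bits per element (assumes no duplicate elements).
digest_bits = gcs_bits(BLOCKS * ELEMENTS_PER_BLOCK, 10)
layered_block_bits = BLOCKS * gcs_bits(ELEMENTS_PER_BLOCK, 10)


def expected_download_bits(p_match: float) -> float:
    """Expected download if the digest matches with probability p_match
    (true or false positive) and a match means fetching all 16 block
    filters. This is a simplification of the real client behaviour."""
    return digest_bits + p_match * layered_block_bits


if __name__ == "__main__":
    print("flat filters (bits):      ", flat_bits)
    print("digest alone (bits):      ", digest_bits)
    print("combined FP rate (10+10): ", combined_fp_rate(10, 10))
    for p in (0.1, 0.5, 0.9):
        print("p_match =", p, "->", expected_download_bits(p), "bits")
```

Under these assumptions the digest is exactly half of the flat download, and
the savings shrink toward zero as p_match approaches 1, which is the case I
am worried about for clients that usually do have something in the window.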

--000000000000ac700f056de84aa9--