Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 5495C86A for ; Wed, 6 Jun 2018 15:14:31 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-it0-f42.google.com (mail-it0-f42.google.com [209.85.214.42]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 5015175A for ; Wed, 6 Jun 2018 15:14:30 +0000 (UTC) Received: by mail-it0-f42.google.com with SMTP id 76-v6so8613605itx.4 for ; Wed, 06 Jun 2018 08:14:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=bHOkZbN2eSEedICd+Vh0nAhmZxLc4+oyxGY9imO8an0=; b=YGciQQh8P4UTF6DjaZ3ga5HB2fDEVxANx+mAMy5amH1S5K1ddl6FBo2BiK3nZdrRZv S/Fhf/RlD45Bq3IpiwVn/ivMs1inLV/Wo6Aw9LGggcy8Q5o1m+OVxpgN01JVJ1FPgEDz yDPljiBGu4lddx1o4L80Rhs7r6ykb+s+4TFuHYIRW+3Sc9b2jSHRgvXe+Avm3ttXT4iD vaXZO6lrfMX0LXiEKTAe8k4SLdhYpfiZfGefDvf+JEahQWLmipU3Ktn65In7ubN5rSdi 42S6bevNjtw7a7y5RmKPR7MWrEO6NRpng+lebvv8ZrFZ4yvc96VmfPijIa2jn9dtTW5g VM3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bHOkZbN2eSEedICd+Vh0nAhmZxLc4+oyxGY9imO8an0=; b=MR3FVsdLNSpA6towY3AwC0aThTdi6+M9OrEhA55UgUsu586yGhIxIxJk10Pw0UIeFJ MSaRL/PGvPSyYVgehjFn6ujrX0p5ad6Z8Kq6mOO+ahdJE3qdSLluTt3M00qFg3Wfhjjl fC1BPkVbwL3t/AP1ofh5NcAqYVd/vYd5nTjjkJ3XjIhUAkl4AYK0BiivZ98HMyyy5rR9 TYSrhHoBT27t/0+l+kKbs5/mDGNFPKYHn9Fe+Te1lI+1SNzI3h+V0g6rpzn6Hd0EZu5X kV9V7M9q3wHM6Ko/EwXgmaCpTfAT4lieXDnXOZ+3O9qS253jDZtutnNRPq8qlqutv4kB bibA== X-Gm-Message-State: APt69E3EcwgnBNTs5BpPTjEw23FVzoh9aGFA/pTcvbN2PNRYeSnHcOjK 5fNv5ozp55E6eKyq+yNsJZrE7F1ShYU7K6lI3QVQXg== X-Google-Smtp-Source: ADUXVKIRiOqc7UxFvLRPiWxmN215JaPer0/mVx3b3VOC4ZqnBFFIWA4f4/73PiIoR2xa+o8JdPpw900lZ/Z6amTvDVc= X-Received: by 2002:a24:9f06:: with SMTP id c6-v6mr2731752ite.133.1528298069507; Wed, 06 Jun 2018 08:14:29 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Riccardo Casatta Date: Wed, 6 Jun 2018 17:14:17 +0200 Message-ID: To: laolu32@gmail.com Content-Type: multipart/alternative; boundary="0000000000008b4602056dfa9faf" X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org X-Mailman-Approved-At: Wed, 06 Jun 2018 15:26:20 +0000 Cc: Bitcoin Protocol Discussion Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2018 15:14:31 -0000 --0000000000008b4602056dfa9faf Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sorry if I continue on the subject even if =E2=80=8Bcustom filter types are considered in BIP 157/158 . I am doing it because =E2=80=8B: 1)=E2=80=8B with a fixed target FP=3D2^-20 (or 1/784931) =E2=80=8B and the multi layer filtering maybe it's reasonable to consider l= ess than ~20 bits for the golomb encoding of the per-block filter (one day committed in the blockchain) 2) based on the answer received, privacy leak if downloading a subset of filters doesn't look a concern 3) As far as I know, anyone is considering to use a map instead of a filter for the upper layers of the filter=E2=80=8B. Simplistic example: Suppose to have a 2 blocks blockchain, every block contains N items for the filter: 1) In the current discussed filter we have 2 filters of 20N bits 2) In a two layer solution, we have 1 map of (10+1)2N bits and 2 filters of 10N bits The additional bit in the map discriminate if the match is in the first or in the second block. Supposing to have 1 match in the two blocks, the filter size downloaded in the first case is always 40N bits, while the expected downloaded size in the second case is 22N+2^-10*10N+10N ~=3D 32N with the same FP because independence. This obviously isn't a full analysis of the methodology, the expected downloaded size in the second case could go from the best case 22N bits to the worst case of 42N bits... @Gregory > I don't know what portion of coins created are spent in the same 144 bloc= k window... About 50% source code From block 393216 to 458752 (still waiting for results on all the blockchain) Total outputs 264185587 size: 2 spent: 11791058 ratio:0.04463172322871649 size: 4 spent: 29846090 ratio:0.11297395266305728 size: 16 spent: 72543182 ratio:0.2745917475051355 size: 64 spent: 113168726 ratio:0.4283682818775424 size: 144 spent: 134294070 ratio:0.508332311103709 size: 256 spent: 148824781 ratio:0.5633342177747191 size: 1024 spent: 179345566 ratio:0.6788620379960395 size: 4096 spent: 205755628 ratio:0.7788298761355213 size: 16384 spent: 224448158 ratio:0.849585174379706 Another point to consider is that if we don't want the full transaction history of our wallet but only the UTXO, the upper layer map could contain only the item which are not already spent in the considered window. As we can see from the previous result if the window is 16384 ~85% of the elements are already spent suggesting a very high time locality. (apart 144, I choose power of 2 windows so there are an integer number of bits in the map) It's possible we need ~20 bits anyway for the per-block filters because there are always connected wallets which one synced, always download the last filter, anyway the upper layer map looks very promising for longer sync. Il giorno mer 6 giu 2018 alle ore 03:13 Olaoluwa Osuntokun < laolu32@gmail.com> ha scritto: > It isn't being discussed atm (but was discussed 1 year ago when the BIP > draft was originally published), as we're in the process of removing item= s > or filters that aren't absolutely necessary. We're now at the point where > there're no longer any items we can remove w/o making the filters less > generally useful which signals a stopping point so we can begin widesprea= d > deployment. > > In terms of a future extension, BIP 158 already defines custom filter > types, > and BIP 157 allows filters to be fetched in batch based on the block heig= ht > and numerical range. The latter feature can later be modified to return a > single composite filter rather than several individual filters. > > -- Laolu > > > On Mon, Jun 4, 2018 at 7:28 AM Riccardo Casatta via bitcoin-dev < > bitcoin-dev@lists.linuxfoundation.org> wrote: > >> I was wondering why this multi-layer multi-block filter proposal isn't >> getting any comment, >> is it because not asking all filters is leaking information? >> >> Thanks >> >> Il giorno ven 18 mag 2018 alle ore 08:29 Karl-Johan Alm via bitcoin-dev = < >> bitcoin-dev@lists.linuxfoundation.org> ha scritto: >> >>> On Fri, May 18, 2018 at 12:25 AM, Matt Corallo via bitcoin-dev >>> wrote: >>> > In general, I'm concerned about the size of the filters making existi= ng >>> > SPV clients less willing to adopt BIP 158 instead of the existing blo= om >>> > filter garbage and would like to see a further exploration of ways to >>> > split out filters to make them less bandwidth intensive. Some further >>> > ideas we should probably play with before finalizing moving forward i= s >>> > providing filters for certain script templates, eg being able to only >>> > get outputs that are segwit version X or other similar ideas. >>> >>> There is also the idea of multi-block filters. The idea is that light >>> clients would download a pair of filters for blocks X..X+255 and >>> X+256..X+511, check if they have any matches and then grab pairs for >>> any that matched, e.g. X..X+127 & X+128..X+255 if left matched, and >>> iterate down until it ran out of hits-in-a-row or it got down to >>> single-block level. >>> >>> This has an added benefit where you can accept a slightly higher false >>> positive rate for bigger ranges, because the probability of a specific >>> entry having a false positive in each filter is (empirically speaking) >>> independent. I.e. with a FP probability of 1% in the 256 range block >>> and a FP probability of 0.1% in the 128 range block would mean the >>> probability is actually 0.001%. >>> >>> Wrote about this here: https://bc-2.jp/bfd-profile.pdf (but the filter >>> type is different in my experiments) >>> _______________________________________________ >>> bitcoin-dev mailing list >>> bitcoin-dev@lists.linuxfoundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >>> >> >> >> -- >> Riccardo Casatta - @RCasatta >> _______________________________________________ >> bitcoin-dev mailing list >> bitcoin-dev@lists.linuxfoundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >> > --=20 Riccardo Casatta - @RCasatta --0000000000008b4602056dfa9faf Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Sorry if I continue on the subject even if
=E2=80=8Bcustom filt= er types are considered in BIP 157/158
.
I am doing it
=C2=A0beca= use
= =E2=80=8B:
1)=E2=80=8B
with a fixed target FP=3D2^-20 =C2= =A0(or 1/784931)
=E2=80=8B and the multi layer filtering maybe it's reasonabl= e to consider less than ~20 bits for the golomb encoding of the per-block f= ilter (one day committed in the blockchain)=C2=A0
2) based on th= e answer received, privacy leak if downloading a subset of filters doesn= 9;t look a concern
3)=C2=A0
As far as I know, anyone is considering to use a map instead of a filter f= or the upper layers of the filter=E2=80=8B.

Simplistic example:
Suppose to have a 2 blocks blockchain, every block= contains N items for the filter:
1) In the current discussed filter we have 2 = filters of 20N bits
2) In a two layer solution, we have 1 map o= f (10+1)2N bits and 2 filters of 10N bits
The additional bit in the map discriminate if th= e match is in the first or in the second block.
Supposing to have 1 match in the two= blocks, the filter size downloaded in the first case is always 40N bits, w= hile the expected downloaded size in the second case is 22N+2^-10*10N+10N ~= =3D 32N with the same FP because independence.
This obviously isn't a full analysis of= the methodology, the expected downloaded size in the second case could go = from the best case 22N bits to the worst case of 42N bits...

@Gregory
= > I don't know=C2=A0what portion of coins = created are spent in the same 144 block
window...

About 50%

From block = 393216 to 458752=C2=A0 (still waiting for results on all the blockchain)
Total outpu= ts 264185587
si= ze: 2 spent: 11791058 ratio:0.04463172322871649
size: 4 spent: 29846090 ratio:0.1129739526= 6305728
size: 1= 6 spent: 72543182 ratio:0.2745917475051355
size: 64 spent: 113168726 ratio:0.4283682818775= 424
size: 144 s= pent: 134294070 ratio:0.508332311103709
size: 256 spent: 148824781 ratio:0.563334217774719= 1
size: 1024 sp= ent: 179345566 ratio:0.6788620379960395
size: 4096 spent: 205755628 ratio:0.77882987613552= 13
size: 16384 = spent: 224448158 ratio:0.849585174379706

Another point to consider is that if we don't want the = full transaction history of our wallet but only the UTXO, the upper layer m= ap could contain only the item which are not already spent in the considere= d window. As we can see from the previous result= if the window is 16384 ~85% of the elements are already spent suggesting a= very high time locality. (apart 144, I choose power of 2 windows so there = are an integer number of bits in the map)

It's possible we need ~20 bits anyway for the per-bloc= k filters because there are always connected wallets which one synced, alwa= ys download the last filter, anyway the upper layer map looks very promisin= g for longer sync.

Il giorno mer 6 giu 2018 alle ore 03:13 Olaoluwa Osun= tokun <laolu32@gm= ail.com> ha scritto:
It isn't being discussed atm (but was discussed 1 year a= go when the BIP
draft was originally published), as we're in = the process of removing items
or filters that aren't absolute= ly necessary. We're now at the point where
there're no lo= nger any items we can remove w/o making the filters less
generall= y useful which signals a stopping point so we can begin widespread
deployment.=C2=A0=C2=A0

In terms of a future ext= ension, BIP 158 already defines custom filter types,
and BIP 157 = allows filters to be fetched in batch based on the block height
a= nd numerical range. The latter feature can later be modified to return a
single composite filter rather than several individual filters.

-- Laolu


On Mon, Jun 4, 2018 at 7:28 AM Riccardo Casatta via= bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
I was wondering why this multi-layer multi-block filter proposal is= n't getting any comment,
is it beca= use not asking all filters is leaking information?

Thanks

=
Il giorno ven 18 mag 2018 alle = ore 08:29 Karl-Johan Alm via bitcoin-dev <bitcoin-dev@lists.linuxfoundat= ion.org> ha scritto:
On Fri,= May 18, 2018 at 12:25 AM, Matt Corallo via bitcoin-dev
<bitcoin-dev@lists.linuxfoundation.org> wrote:
> In general, I'm concerned about the size of the filters making exi= sting
> SPV clients less willing to adopt BIP 158 instead of the existing bloo= m
> filter garbage and would like to see a further exploration of ways to<= br> > split out filters to make them less bandwidth intensive. Some further<= br> > ideas we should probably play with before finalizing moving forward is=
> providing filters for certain script templates, eg being able to only<= br> > get outputs that are segwit version X or other similar ideas.

There is also the idea of multi-block filters. The idea is that light
clients would download a pair of filters for blocks X..X+255 and
X+256..X+511, check if they have any matches and then grab pairs for
any that matched, e.g. X..X+127 & X+128..X+255 if left matched, and
iterate down until it ran out of hits-in-a-row or it got down to
single-block level.

This has an added benefit where you can accept a slightly higher false
positive rate for bigger ranges, because the probability of a specific
entry having a false positive in each filter is (empirically speaking)
independent. I.e. with a FP probability of 1% in the 256 range block
and a FP probability of 0.1% in the 128 range block would mean the
probability is actually 0.001%.

Wrote about this here: https://bc-2.jp/bfd-profile.pdf (but the f= ilter
type is different in my experiments)
_______________________________________________
bitcoin-dev mailing list
= bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mail= man/listinfo/bitcoin-dev


--
Riccardo Casatta - @RCasatta
_______________________________________________
bitcoin-dev mailing list
= bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mail= man/listinfo/bitcoin-dev


--
Riccar= do Casatta - @RC= asatta
--0000000000008b4602056dfa9faf--