From: Johan Torås Halseth
Date: Tue, 22 May 2018 11:23:29 +0200
To: Olaoluwa Osuntokun
Cc: Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
References: <22d375c7-a032-8691-98dc-0e6ee87a4b08@mattcorallo.com>

Maybe I didn't make it clear, but the distinction is that the current track
allocates one service bit for each "filter type", where it has to be agreed
upon up front what elements such a filter type contains.

My suggestion was to advertise a bitfield for each filter type the node
serves, where the bitfield indicates what elements are part of the filters.
This essentially removes the notion of decided filter types and instead
leaves the decision to full-nodes.

This would require a "getcftypes" message, of course.
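A rough sketch of how a light client might use such an advertisement, assuming one bit per element in the "100 (txid only)" / "011 (script+outpoint)" notation from earlier in this thread. The bit assignments, the function names and the shape of the "getcftypes" reply are all hypothetical; nothing here is taken from BIP 157/158.

```python
from typing import Iterable, Optional

# Hypothetical element bits, chosen to match the "100 (txid only)" and
# "011 (script+outpoint)" notation used in this thread. Not from any spec.
TXID = 0b100
SCRIPT = 0b010
OUTPOINT = 0b001


def covers(served: int, needed: int) -> bool:
    """True if a filter with element bitfield `served` contains every needed element."""
    return (served & needed) == needed


def pick_filter(advertised: Iterable[int], needed: int) -> Optional[int]:
    """Pick the advertised bitfield that covers `needed` with the fewest elements.

    Element count is only a stand-in for filter size; a real client would
    likely compare actual filter sizes. Returns None if nothing covers `needed`.
    """
    candidates = [bits for bits in advertised if covers(bits, needed)]
    if not candidates:
        return None
    return min(candidates, key=lambda bits: bin(bits).count("1"))


# What a node might send in reply to the proposed "getcftypes" message
# (assumed reply shape: a plain list of bitfields).
advertised = [0b111, 0b100, 0b011]

print(format(pick_filter(advertised, SCRIPT), "03b"))           # script-only wallet
print(format(pick_filter(advertised, TXID | OUTPOINT), "03b"))  # needs txid and outpoint
```

With n element types there are 2^n - 1 non-empty bitfields a node could advertise, which is the blowup mentioned in the original proposal; the client-side selection stays a simple subset check either way.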
- Johan

On Tue, May 22, 2018 at 3:16 AM, Olaoluwa Osuntokun <laolu32@gmail.com> wrote:

> > What if instead of trying to decide up front which subset of elements will
> > be most useful to include in the filters, and the size tradeoff, we let the
> > full-node decide which subsets of elements it serves filters for?
>
> This is already the case. The current "track" is to add new service bits
> (while we're in the uncommitted phase) to introduce new filter types. Light
> clients can then filter out nodes before even connecting to them.
>
> -- Laolu
>
> On Mon, May 21, 2018 at 1:35 AM Johan Torås Halseth <johanth@gmail.com>
> wrote:
>
>> Hi all,
>>
>> Most light wallets will want to download the minimum amount of data
>> required to operate, which means they would ideally download the smallest
>> possible filters containing the subset of elements they need.
>>
>> What if instead of trying to decide up front which subset of elements
>> will be most useful to include in the filters, and the size tradeoff, we
>> let the full-node decide which subsets of elements it serves filters for?
>>
>> For instance, a full node would advertise that it could serve filters for
>> the subsets 110 (txid+script+outpoint), 100 (txid only),
>> 011 (script+outpoint) etc. A light client could then choose to download
>> the minimal filter type covering its needs.
>>
>> The obvious benefit of this would be minimal bandwidth usage for the
>> light client, but there are also some less obvious ones. We wouldn’t have
>> to decide up front what each filter type should contain, only the possible
>> elements a filter can contain (more can be added later without breaking
>> existing clients). This, I think, would let the most served filter types
>> grow organically, with full-node implementations coming with sane defaults
>> for served filter types (maybe even all possible types as long as the
>> number of elements is small), letting their operator add/remove types at
>> will.
>>
>> The main disadvantage of this, as I see it, is that there’s an exponential
>> blowup in the number of possible filter types in the number of element
>> types. However, this would let us start out small with only the elements
>> we need, and in the worst case the node operators just choose to serve the
>> subsets corresponding to what now is called “regular” + “extended” filters
>> anyway, requiring no more resources.
>>
>> This would also give us some data on what are the most widely used filter
>> types, which could be useful in making the decision on what should be part
>> of filters to eventually commit to in blocks.
>>
>> - Johan
>>
>> On Sat, May 19, 2018 at 5:12, Olaoluwa Osuntokun via bitcoin-dev <
>> bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>> On Thu, May 17, 2018 at 2:44 PM Jim Posen via bitcoin-dev
>>
>>>> Monitoring inputs by scriptPubkey vs input-txid also has a massive
>>>> advantage for parallel filtering: You can usually know your pubkeys
>>>> well in advance, but if you have to change what you're watching block
>>>> N+1 for based on the txids that paid you in N you can't filter them
>>>> in parallel.
>>>>
>>>
>>> Yes, I'll grant that this is a benefit of your suggestion.
>>>
>>
>> Yeah parallel filtering would be pretty nice. We've implemented a serial
>> filtering for btcwallet [1] for the use-case of rescanning after a seed
>> phrase import. Parallel filtering would help here, but also we don't yet
>> take advantage of batch querying for the filters themselves. This would
>> speed up the scanning by quite a bit.
>>
>> I really like the filtering model though, it really simplifies the code,
>> and we can leverage identical logic for btcd (which has RPCs to fetch the
>> filters) as well.
>>
>> [1]: https://github.com/Roasbeef/btcwallet/blob/master/chain/neutrino.go#L180
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
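On the parallel filtering point quoted above: the argument is that a watch list of scripts is known before the scan starts, so each block's filter can be checked independently. A minimal sketch of that, assuming per-block filters can be modelled as plain sets of scripts (a real BIP 158 filter is a Golomb-coded set with false positives) and using made-up heights and script names:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, FrozenSet, List

# Stand-in for per-block compact filters; a plain set keeps the sketch short.
BlockFilters = Dict[int, FrozenSet[bytes]]


def matches(block_filter: FrozenSet[bytes], watched: FrozenSet[bytes]) -> bool:
    """True if the block's filter contains any watched script."""
    return not block_filter.isdisjoint(watched)


def scan_parallel(filters: BlockFilters, watched: FrozenSet[bytes],
                  workers: int = 4) -> List[int]:
    """Return the heights whose filter matches, checking blocks concurrently.

    This only works because `watched` is fixed before the scan starts. If the
    watch list for block N+1 depended on matches in block N (as with
    txid-based monitoring), the blocks would have to be scanned in order.
    """
    heights = sorted(filters)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(lambda h: matches(filters[h], watched), heights))
    return [h for h, hit in zip(heights, hits) if hit]


# Hypothetical data: three blocks, one watched script.
filters = {
    100: frozenset({b"script-a", b"script-b"}),
    101: frozenset({b"script-c"}),
    102: frozenset({b"script-a"}),
}
watched = frozenset({b"script-a"})
print(scan_parallel(filters, watched))
```

The thread pool is only there to show that the per-block checks do not depend on each other; it is not a performance claim, and it says nothing about how btcwallet or btcd actually schedule their rescans.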