From: Olaoluwa Osuntokun
Date: Fri, 18 May 2018 19:51:02 -0700
To: Matt Corallo, Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

Matt wrote:
> I believe (1) could be skipped entirely - there is almost no reason why
> you'd not be able to filter for, eg, the set of output scripts in a
> transaction you know about

Depending on the use-case, the txid is more precise than searching for the
output script, as it doesn't need to deal with duplicated output scripts. To
my knowledge, lnd is the only major project that currently utilizes BIP
157+158. At this point, we use the txid in the regular filter for
confirmations (channel confirmed, sweep tx confirmed, cltv confirmed, etc).
Switching to use output scripts instead wouldn't be _too_ invasive w.r.t.
the changes required in the codebase; only the need to deal with output
script duplication could be annoying.
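
To make the duplication point concrete, here is a minimal Python sketch. It is not lnd's code and it does not build a real BIP 158 filter: the txids and scripts are made-up placeholders and matching is done with plain list scans. It only shows that one output script can appear in several transactions, while a txid identifies exactly one.

```python
# Hypothetical block contents as (txid, output scripts) pairs.
# Every value here is a placeholder, not real chain data.
BLOCK_TXS = [
    ("txid-a", ["script-1", "script-2"]),
    ("txid-b", ["script-1"]),  # pays to script-1 again (script reuse)
    ("txid-c", ["script-3"]),
]


def match_by_script(txs, script):
    """Return the txids of every transaction with an output paying `script`."""
    return [txid for txid, scripts in txs if script in scripts]


def match_by_txid(txs, wanted):
    """Return the txids equal to `wanted` (at most one in a valid block)."""
    return [txid for txid, _ in txs if txid == wanted]


if __name__ == "__main__":
    # Watching a script can surface several candidates that the caller
    # then has to tell apart; watching a txid cannot.
    print(match_by_script(BLOCK_TXS, "script-1"))
    print(match_by_txid(BLOCK_TXS, "txid-a"))
```

A caller that switches from txids to output scripts would need an extra step to pick the intended transaction out of the candidates.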

> (2) and (3) may want to be split out - many wallets may wish to just find
> transactions paying to them, as transactions spending from their outputs
> should generally be things they've created.

FWIW, in the "rescan after importing by seed phrase" case, both are needed
in order to ensure the wallet ends up with the proper output set after the
scan. In lnd we actively use both: (2) to detect deposits to the internal
wallet, and (3) to be notified when our channel outputs are spent on-chain
(and also generally when any of our special scripts are spent).
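
A rough Python sketch of why a rescan needs both items. This is an illustration under simplifying assumptions, not lnd's rescan logic: `Outpoint`, `Tx`, `rescan`, and all txids and scripts are hypothetical names, and blocks are plain lists rather than filter matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outpoint:
    txid: str
    index: int


@dataclass
class Tx:
    txid: str
    inputs: list   # Outpoints spent by this transaction
    outputs: list  # output scripts (placeholder strings)


def rescan(blocks, wallet_scripts, track_spends=True):
    """Replay blocks in order and return the wallet's unspent outpoints."""
    utxos = set()
    for block in blocks:
        for tx in block:
            if track_spends:
                # Item (3): drop any of our outputs that this tx spends.
                for spent in tx.inputs:
                    utxos.discard(spent)
            # Item (2): record outputs paying to one of our scripts.
            for index, script in enumerate(tx.outputs):
                if script in wallet_scripts:
                    utxos.add(Outpoint(tx.txid, index))
    return utxos


# Hypothetical history: t1 pays the wallet, t2 spends that output and
# returns change to another wallet script.
BLOCKS = [
    [Tx("t1", [], ["mine-1"])],
    [Tx("t2", [Outpoint("t1", 0)], ["other", "mine-2"])],
]
WALLET = {"mine-1", "mine-2"}

if __name__ == "__main__":
    print(rescan(BLOCKS, WALLET))                      # only the change output
    print(rescan(BLOCKS, WALLET, track_spends=False))  # also the stale t1:0
```

With spends ignored, the already-spent output from `t1` stays in the set, which is the "improper output set" the rescan is meant to avoid.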

> In general, I'm concerned about the size of the filters making existing SPV
> clients less willing to adopt BIP 158 instead of the existing bloom filter
> garbage and would like to see a further exploration of ways to split out
> filters to make them less bandwidth intensive.

Agreed that the current filter size may prevent adoption amongst wallets.
However, the other factor that will likely prevent adoption amongst current
BIP-37 mobile wallets is the lack of support for notifying _unconfirmed_
transactions. When we drafted up the protocol last year and asked around,
this was one of the major points of contention amongst existing mobile
wallets that utilize BIP 37.

On the other hand, the two "popular" BIP 37 wallets I'm aware of
(Breadwallet, and Andreas Schildbach's Bitcoin Wallet) have lagged massively
behind the existing set of wallet related protocol upgrades. For example,
neither of them has released a version of their application that takes
advantage of segwit in any manner. Breadwallet has more or less "pivoted"
(they did an ICO and have a token) and instead is prioritizing things like
adding random ICO tokens over catching up with the latest protocol updates.
Based on this behavior, even if the filter sizes were even _more_ bandwidth
efficient than BIP 37, I don't think they'd adopt the protocol.

> Some further ideas we should probably play with before finalizing moving
> forward is providing filters for certain script templates, eg being able to
> only get outputs that are segwit version X or other similar ideas.

Why should this block active deployment of BIP 157+158 as is now? As
defined, the protocol already allows future updates to add additional filter
types. Before the filters are committed, each filter type requires a new
filter header. We could move to a single filter header that commits to the
hashes of _all_ filters, but that would mean that a node couldn't serve the
headers unless they had all currently defined features, defeating the
optionality offered.
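
The trade-off between per-type headers and a single header can be sketched in Python. Per BIP 157, a filter header is the double-SHA256 of the filter's hash concatenated with the previous filter header, starting from 32 zero bytes. `combined_header` below is purely hypothetical: it is one possible shape for the single-header alternative, not something either BIP defines, and the filter type names and byte strings are placeholders.

```python
import hashlib

GENESIS_PREV_HEADER = bytes(32)  # 32 zero bytes before the first block


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used throughout Bitcoin."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def next_header(filter_bytes: bytes, prev_header: bytes) -> bytes:
    """Header for one block's filter of a single filter type."""
    return dsha256(dsha256(filter_bytes) + prev_header)


def header_chain(filters):
    """Build the header chain for one filter type, one filter per block."""
    headers, prev = [], GENESIS_PREV_HEADER
    for filter_bytes in filters:
        prev = next_header(filter_bytes, prev)
        headers.append(prev)
    return headers


def combined_header(filters_by_type, all_types, prev_header):
    """HYPOTHETICAL single header committing to every filter type's hash.

    A node that lacks even one type cannot compute (or verify) it.
    """
    missing = [t for t in all_types if t not in filters_by_type]
    if missing:
        raise ValueError(f"cannot build combined header, missing: {missing}")
    commitment = b"".join(dsha256(filters_by_type[t]) for t in all_types)
    return dsha256(commitment + prev_header)


if __name__ == "__main__":
    # A node storing only one filter type can still serve that type's chain.
    print(header_chain([b"filter-0", b"filter-1"])[-1].hex())
    try:
        combined_header({"regular": b"f"}, ["regular", "txid-only"],
                        GENESIS_PREV_HEADER)
    except ValueError as err:
        print(err)
```

With separate chains, a node can serve headers for whichever filter types it stores; under the hypothetical combined scheme it needs all of them.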

Additionally, more filters entail more disk utilization for nodes serving
these filters. Nodes have the option to instead create the filters at "query
time", but this counters the benefit of simply slinging the filters
from disk (or a memory map or w/e). IMO, it's a desirable feature that
serving light clients no longer requires active CPU+I/O and instead just
passive I/O (nodes could even write the filters to disk in protocol msg
format).

To get a feel for the current filter sizes, a txid-only filter size, and a
regular filter w/o txid's, I ran some stats on the last 10k blocks:

regular size:        217107653  bytes
regular avg:         21710.7653 bytes
regular median:      22332      bytes
regular max:         61901      bytes

txid-only size:      34518463   bytes
txid-only avg:       3451.8463  bytes
txid-only median:    3258       bytes
txid-only max:       10193      bytes

reg-no-txid size:    182663961  bytes
reg-no-txid avg:     18266.3961 bytes
reg-no-txid median:  19198      bytes
reg-no-txid max:     60172      bytes

So the median regular filter size over the past 10k blocks is about 22KB. If
we extract the txid from the regular filter and add a txid-only filter, the
median size of that is 3.2KB. Finally, the median size of a modified regular
filter (no txid) is 19KB.
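
The averages and the effect of splitting out txids can be re-derived from the totals above. A minimal Python sketch, assuming only the byte totals and the 10,000-block window from the stats; the dictionary keys and function names are illustrative.

```python
BLOCKS = 10_000  # "the last 10k blocks"

# Total filter bytes over the window, copied from the stats above.
TOTAL_BYTES = {
    "regular": 217_107_653,
    "txid-only": 34_518_463,
    "reg-no-txid": 182_663_961,
}


def average(name: str) -> float:
    """Mean filter size in bytes per block for one filter variant."""
    return TOTAL_BYTES[name] / BLOCKS


def split_overhead() -> int:
    """Extra bytes needed to fetch both split filters vs. the regular one."""
    return (TOTAL_BYTES["txid-only"] + TOTAL_BYTES["reg-no-txid"]
            - TOTAL_BYTES["regular"])


def saving_without_txids() -> float:
    """Fraction of bytes saved by fetching only the no-txid filter."""
    return 1 - TOTAL_BYTES["reg-no-txid"] / TOTAL_BYTES["regular"]


if __name__ == "__main__":
    for name in TOTAL_BYTES:
        print(f"{name:12s} avg: {average(name):.4f} bytes")
    print(f"split overhead: {split_overhead()} bytes over {BLOCKS} blocks")
    print(f"saving without txids: {saving_without_txids():.1%}")
```

On these numbers, a client that fetches only the no-txid filter downloads roughly 16% fewer bytes than with the regular filter, while a client that fetches both split filters pays slightly more in total than for the combined one.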

-- Laolu


On Thu, May 17, 2018 at 8:33 AM Matt Corallo via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> BIP 158 currently includes the following in the "basic" filter: 1)
> txids, 2) output scripts, 3) input prevouts.
>
> I believe (1) could be skipped entirely - there is almost no reason why
> you'd not be able to filter for, eg, the set of output scripts in a
> transaction you know about and (2) and (3) may want to be split out -
> many wallets may wish to just find transactions paying to them, as
> transactions spending from their outputs should generally be things
> they've created.
>
> In general, I'm concerned about the size of the filters making existing
> SPV clients less willing to adopt BIP 158 instead of the existing bloom
> filter garbage and would like to see a further exploration of ways to
> split out filters to make them less bandwidth intensive. Some further
> ideas we should probably play with before finalizing moving forward is
> providing filters for certain script templates, eg being able to only
> get outputs that are segwit version X or other similar ideas.
>
> Matt