From: Tamas Blummer
To: Jim Posen
Cc: Bitcoin Protocol Discussion, Jim Posen
Date: Mon, 4 Feb 2019 21:59:44 +0100
Subject: Re: [bitcoin-dev] Interrogating a BIP157 server, BIP158 change proposal

I participated in that discussion in 2018, but did not yet have the insight I have since gained through writing both client and server implementations of BIP157/158.

Pieter Wuille considered the design choice I am now suggesting as alternative (a) in: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-June/016064.html
In his evaluation he recognized that a filter containing spent outpoints and output scripts would allow a decision on filter correctness knowing only the block.
He did not evaluate its usefulness in the context of checkpoints, which I think are an important shortcut here.

Yes, a filter that collects input and output scripts is shorter if script re-use is frequent, but I showed back in 2018 in the same thread that this saving is not that significant in recent history, as address reuse is no longer that frequent.

A filter on spent outpoints is just as useful for wallets as one on spent scripts, since wallets naturally scan the blockchain forward and thereby learn about their coins by the output script before they need to check spends of those outpoints.

It seems to me that implementing an interrogation by possibly downloading blocks at checkpoints is much simpler than following multiple possible filter paths.

A spent-outpoint filter allows us to decide on coin availability based on an immutable store, without a UTXO store that has to be updated and possibly rolled back. Availability could be decided by following the filter path from the current tip back to genesis and checking whether the outpoint was spent earlier. False positives can be sorted out with a block download. Murmel implements this when running in server mode, where the blocks are already available.
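
A minimal sketch of that backward walk, under loud assumptions: the `Block` type, its fields and `was_spent` are hypothetical names for illustration, and a plain set with an explicit `false_positives` field stands in for the probabilistic filter. It is not Murmel's code.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    spent: set  # outpoints really spent in this block, as (txid, index) tuples
    false_positives: set = field(default_factory=set)  # extra outpoints the filter wrongly matches


def filter_matches(block, outpoint):
    """Stand-in for a filter query: never a false negative, sometimes a false positive."""
    return outpoint in block.spent or outpoint in block.false_positives


def was_spent(chain, outpoint):
    """Walk from the tip back to genesis and report whether the outpoint was spent."""
    for block in reversed(chain):
        if filter_matches(block, outpoint):
            # A filter hit is only a hint; inspecting the block itself
            # (a download, for a client) sorts out false positives.
            if outpoint in block.spent:
                return True
    return False
```

Nothing here is mutated while walking, which is the point of the paragraph above: a reorganisation only changes which blocks are on the path, not any stored state.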

Therefore I ask for a BIP change based on better insight gained through implementation.

Tamas Blummer

On Feb 4, 2019, at 21:18, Jim Posen <jim.posen@gmail.com> wrote:

Please see the thread "BIP 158 Flexibility and Filter Size" from 2018 regarding the decision to remove outpoints from the filter [1].

Thanks for bringing this up though, because more discussion is needed on the client protocol given that clients cannot reliably determine the integrity of a block filter in a bandwidth-efficient manner (due to the inclusion of input scripts).

I see three possibilities:
1) Introduce a new P2P message to retrieve all prev-outputs for a given block (essentially the undo data in Core), and verify the scripts against the block by executing them. While this permits some forms of input script malleability (and thus cannot discriminate between all valid and invalid filters), it restricts what an attacker can do. This was proposed by Laolu AFAIK, and I believe this is how btcd is proceeding.
2) Clients track multiple possible filter header chains and essentially consider the union of their matches. So if any filter received for a particular block header matches, the client downloads the block. The client can ban a peer if they 1) ever return a filter omitting some data that is observed in the downloaded block, 2) repeatedly serve filters that trigger false positive block downloads where such a number of false positives is statistically unlikely, or 3) repeatedly serve filters that are significantly larger than the expected size (essentially padding the actual filters with garbage to waste bandwidth). I have not done the analysis yet, but we should be able to come up with some fairly simple banning heuristics using Chernoff bounds. The main downside is that the client logic to track multiple possible filter chains and filters per block is more complex and bandwidth increases if connected to a malicious server. I first heard about this idea from David Harding.
3) Rush straight to committing the filters into the chain (via witness reserved value or coinbase OP_RETURN) and give up on the pre-softfork BIP 157 P2P mode.
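
Option 2 mentions Chernoff-based banning heuristics without giving the analysis. As a rough sketch only, assuming false-positive block downloads are independent events with a known per-block probability, the multiplicative Chernoff bound P(X >= (1+d)*mu) <= exp(-d^2 * mu / (2+d)) could be applied as below. The function names, the 1e-9 threshold and the example rate (100 watched items against BIP158's M = 784931) are illustrative assumptions, not something proposed in this thread.

```python
import math


def chernoff_upper_tail(n_blocks, p_false_positive, observed):
    """Upper bound on P(X >= observed) for X ~ Binomial(n_blocks, p_false_positive)."""
    mu = n_blocks * p_false_positive  # expected number of false-positive downloads
    if observed <= mu:
        return 1.0  # the bound says nothing at or below the mean
    if mu == 0:
        return 0.0  # any false positive is impossible under the model
    delta = observed / mu - 1.0
    return math.exp(-delta * delta * mu / (2.0 + delta))


def should_ban(n_blocks, p_false_positive, observed, threshold=1e-9):
    """Ban when an honest peer would produce this many false positives with tiny probability."""
    return chernoff_upper_tail(n_blocks, p_false_positive, observed) < threshold
```

For example, `should_ban(10000, 100 / 784931, 50)` flags a peer whose filters caused 50 useless downloads over 10,000 blocks where about one would be expected, while a count of 2 stays well within the bound.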

I'm in favor of option #2 despite the downsides since it requires the smallest number of changes and is supported by the BIP 157 P2P protocol as currently written. (Though the recommended client protocol in the BIP needs to be updated to account for this). Another benefit of it is that it removes some synchronicity assumptions where a peer with the correct filters keeps timing out and is assumed to be dishonest, while the dishonest peer is assumed to be OK because it is responsive.

If anyone has other ideas, I'd love to hear them.

-jimpo

[1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-June/016057.html




On Mon, Feb 4, 2019 at 10:53 AM Tamas Blummer via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
TLDR: a change to BIP158 would allow deciding which filter chain is correct at lower bandwidth use

Assume there is a BIP157 client that learned a filter header chain earlier and is now offered an alternate reality by a newly connected BIP157 server.

The client notices the alternate reality by routinely asking for filter chain checkpoints after connecting to a new BIP157 server. A divergence at a checkpoint means that the server disagrees with the client's history at or before the first diverging checkpoint. The client would then request the filter headers between the last matching and first divergent checkpoint, quickly figure out which block’s filter is the first that does not match the previous assumption, and request that filter from the server.
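
The two-stage search described above might be sketched as follows. This assumes checkpoints are spaced 1,000 blocks apart, with the i-th checkpoint at height (i + 1) * 1000, which is my reading of BIP157. The function names and the `headers_at(start, stop)` callbacks are hypothetical stand-ins for the local store and the peer request.

```python
CHECKPOINT_INTERVAL = 1000  # assumed BIP157 checkpoint spacing


def first_mismatch(ours, theirs):
    """Index of the first position where two sequences differ, or None."""
    for i, (a, b) in enumerate(zip(ours, theirs)):
        if a != b:
            return i
    return None


def locate_divergent_height(our_checkpoints, their_checkpoints,
                            our_headers_at, their_headers_at):
    """Find the first block height whose filter header differs from our history."""
    cp = first_mismatch(our_checkpoints, their_checkpoints)
    if cp is None:
        return None  # the server agrees with us at every checkpoint
    # The divergence lies after the last matching checkpoint and
    # at or before the first divergent one.
    start = cp * CHECKPOINT_INTERVAL + 1
    stop = (cp + 1) * CHECKPOINT_INTERVAL
    offset = first_mismatch(our_headers_at(start, stop),
                            their_headers_at(start, stop))
    return None if offset is None else start + offset
```

Because filter headers are chained, every header after the first divergent one differs as well, so a linear scan of one 1,000-header interval is enough.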

The client downloads the corresponding block, checks that its header fits the PoW-secured best header chain, recalculates the merkle root of its transaction list to know that it is complete, and queries the filter to see if every output script of every transaction is contained in it. If not, the server is lying; the case is closed and the server is disconnected.
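
The completeness check on the transaction list can be illustrated with a short merkle root computation. This is a sketch of Bitcoin's usual rule (double SHA-256, last hash duplicated on odd levels), with hypothetical function names and txids assumed to be 32-byte hashes in internal byte order.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids):
    """Merkle root over a non-empty list of 32-byte transaction hashes."""
    if not txids:
        raise ValueError("a block has at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def block_is_complete(txids, header_merkle_root):
    """True if the received transactions hash to the root committed in the header."""
    return merkle_root(txids) == header_merkle_root
```

Only after this check passes is it meaningful to query the server's filter for the block's output scripts.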

Having all output scripts in the filter does not, however, guarantee that the filter is correct, since it might omit input scripts. Input scripts are not part of the downloaded block, but are in some blocks before it. Checking those is out of reach for a lightweight client with the tools given by the current BIP.

A remedy here would be another filter chain on created and spent outpoints, as is currently implemented by Murmel. The outpoint filter chain must offer a match for every spent output of the block with the divergent filter; otherwise the interrogated server is lying, since a PoW-secured block cannot spend coins out of nowhere. Doing this check would already force the client to download the outpoint filter history up to the point of divergence. Then the client would have to download and PoW-check every block that shows a match in outpoints until it finds that one of the spent outputs has a script that was not in the server’s filter, in which case the server is lying. If everything checks out, then the previous assumption on filter history was incorrect and should be replaced by the history offered by the interrogated server.

As you see, the interrogation works with this added filter but is highly inefficient. A really light client should not be forced to download lots of blocks just to uncover a lying filter server. This would actually be an easy DoS on light BIP157 clients.

A better solution is a change to BIP158 such that the only filter contains created scripts and spent outpoints. It appears to me that this would serve both wallets and interrogation of filter servers well:

Wallets would recognize payments to their addresses by the filter, as output scripts are included. Spends from the wallet would be recognized too, since a wallet already knows the outpoints of its previously received coins and can query the filters for them.
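
A minimal sketch of such a forward scan, assuming a plain set of items as a stand-in for the real filter. The `Tx` and `Block` tuples and `scan_forward` are hypothetical names for illustration.

```python
from collections import namedtuple

# inputs: list of (txid, index) outpoints; outputs: list of script bytes
Tx = namedtuple("Tx", "txid inputs outputs")
# filter_items: set of output scripts and spent outpoints (stand-in for the filter)
Block = namedtuple("Block", "filter_items txs")


def scan_forward(blocks, wallet_scripts):
    """Return the wallet's unspent outpoints and the number of blocks downloaded."""
    coins = set()
    downloads = 0
    for block in blocks:
        # Query the filter for our scripts (new payments) and our coins (spends).
        watch = set(wallet_scripts) | coins
        if not (block.filter_items & watch):
            continue
        downloads += 1  # filter matched: fetch and inspect the block
        for tx in block.txs:
            for outpoint in tx.inputs:
                coins.discard(outpoint)
            for index, script in enumerate(tx.outputs):
                if script in wallet_scripts:
                    coins.add((tx.txid, index))
    return coins, downloads
```

The wallet never needs the script of a spent output here: by the time a coin can be spent, scanning forward has already recorded its outpoint.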

Interrogation of a filter server also becomes simpler, since the filter of a block can be checked entirely against the contents of that same block. The decision on filter correctness does not require more bandwidth than the download of one block at the mismatching checkpoint. The client could be forced to download at most 1/1000th of the blockchain in addition to the filter header history.

Therefore I suggest changing BIP158 to have a base filter, defined as:

A basic filter MUST contain exactly the following items for each transaction in a block:
        • Spent outpoints
        • The scriptPubKey of each output, aside from all OP_RETURN output scripts.
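
The item set defined above, and the check of a served filter against the block alone, could be sketched like this. Several details are assumptions for illustration and are not part of the proposal text: outpoints are serialized as the 32-byte txid followed by a little-endian 32-bit index, the coinbase input is skipped, empty scripts are skipped as in the current BIP158, and `filter_contains` is a hypothetical membership query on the served filter.

```python
import struct

OP_RETURN = 0x6A


def serialize_outpoint(txid: bytes, index: int) -> bytes:
    """Assumed item encoding: 32-byte txid followed by a little-endian uint32 index."""
    return txid + struct.pack("<I", index)


def basic_filter_items(block_txs):
    """block_txs: list of (inputs, output_scripts), coinbase first.

    inputs is a list of (txid, index); output_scripts is a list of bytes.
    """
    items = set()
    for position, (inputs, output_scripts) in enumerate(block_txs):
        if position > 0:  # assumption: the coinbase spends no real outpoint
            items.update(serialize_outpoint(t, i) for t, i in inputs)
        # assumption: empty scripts are skipped, as in the current BIP158
        items.update(s for s in output_scripts if s and s[0] != OP_RETURN)
    return items


def filter_is_complete(block_txs, filter_contains):
    """True if the served filter matches every item the block requires."""
    return all(filter_contains(item) for item in basic_filter_items(block_txs))
```

Since every item comes from the block itself, a client could in principle go further and rebuild the whole filter from the block and compare it with the served one. The membership check above only detects omissions.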

Tamas Blummer
_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
