Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Tamas Blummer <tamas.blummer@gmail.com>
In-Reply-To: <CAO3Pvs_Ai9d_uHC2a3ndGXhBoV-PDp2y_NShkbn=hRuzu=wNFw@mail.gmail.com>
Date: Wed, 6 Feb 2019 09:09:55 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <5A850549-B6C9-4590-BA9B-0D69BBE531F9@gmail.com>
References: <6D57649F-0236-4FBA-8376-4815F5F39E8A@gmail.com>
	<CADZtCSgKu1LvjePNPT=0C0UYQvb47Ca0YN+B_AfgVNTpcOno4w@mail.gmail.com>
	<CDAFC2F7-A0AD-460B-B5B1-A717F7EC700E@gmail.com>
	<CAO3Pvs_gvYy99Bch=7RwVszM_0PFTKUyqDVok=xfm4OOcqwaaQ@mail.gmail.com>
	<6D36035C-A675-4845-9292-3BC16CD19B41@gmail.com>
	<CAO3Pvs_Ai9d_uHC2a3ndGXhBoV-PDp2y_NShkbn=hRuzu=wNFw@mail.gmail.com>
To: Olaoluwa Osuntokun <laolu32@gmail.com>
Cc: Jim Posen <jimpo@coinbase.com>,
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Interrogating a BIP157 server,
 BIP158 change proposal
Precedence: list

Hi Laolu,

Space savings come with a rather serious disadvantage for now: a light
client is not in a position to check the filter. The advanced uses you
mention are subject to the same limitation, for now.
Building more on a shaky foundation does not make it look better.

Now that we have seen the advantages of both filters, what keeps us from
offering both in Core?

Computing the additional spent-outpoint + output-script filter is cheaper
than computing the current one, as it can be done with the block as the
only context; it needs neither the UTXO set nor undo blocks, journals, or
anything else. I do not see how my statement regarding this was incorrect.
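
To illustrate the point, here is a minimal sketch in Python of collecting
the items of such a filter from the block alone. The types OutPoint, TxOut
and Tx and the function filter_elements are hypothetical names made up for
this example (they are not taken from Murmel or Core), the Golomb-coded set
encoding of BIP158 is left out, and skipping the coinbase input and empty
scripts are assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import List, Set

OP_RETURN = 0x6A  # script opcode that marks an unspendable data output


@dataclass(frozen=True)
class OutPoint:
    txid: bytes  # 32-byte id of the transaction that created the coin
    vout: int    # index of the output within that transaction

    def serialize(self) -> bytes:
        # txid followed by the output index as 4 bytes, little-endian
        return self.txid + self.vout.to_bytes(4, "little")


@dataclass
class TxOut:
    script_pubkey: bytes


@dataclass
class Tx:
    inputs: List[OutPoint]  # outpoints spent by this transaction
    outputs: List[TxOut]
    is_coinbase: bool = False


def filter_elements(block_txs: List[Tx]) -> Set[bytes]:
    """Collect filter items using only the transactions of one block."""
    items: Set[bytes] = set()
    for tx in block_txs:
        # Assumption: the coinbase input spends nothing, so it is skipped.
        if not tx.is_coinbase:
            for outpoint in tx.inputs:
                items.add(outpoint.serialize())
        for out in tx.outputs:
            script = out.script_pubkey
            # Assumption: empty scripts are skipped along with OP_RETURN.
            if not script or script[0] == OP_RETURN:
                continue
            items.add(script)
    return items
```

No lookup of previous outputs appears anywhere in the loop, which is the
whole argument: a client holding the block could compute the same set and
compare it with what a server's filter claims to contain.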

There is a political issue though, which is why I favor a better provable
uncommitted filter:

I am skeptical that commitment of any filter will come into Core soon.
The reason of my skepticism is political, not technical.

A committed filter makes light clients much more reliable and attractive,
for some tastes too much so.

Clients that only follow PoW are not significant on the current network.
Core nodes enforce many more rules, some as important as the miners'
reward. A committed filter would strengthen light clients significantly,
such that perhaps too many users would be compelled to use them instead of
a Core node. Would the remaining Core nodes be sufficient to enforce the
checks not covered? I see how this is a dilemma.

Is this a dilemma because we think in black and white? Light(er) clients
might implement checks that go beyond blind PoW trust even if they fall
short of all Core checks. Has the time come to allow for this?

Tamas Blummer


> On Feb 6, 2019, at 01:17, Olaoluwa Osuntokun <laolu32@gmail.com> =
wrote:
>=20
> Hi Tamas,=20
>=20
> > The only advantage I see in the current design choice is filter size, but
> > even that is less impressive in recent history and going forward, as address
> > re-use is much less frequent nowadays than it was in Bitcoin’s early days.
>=20
> Gains aren't only had with address re-use; it's also the case that if an
> input is spent in the same block as it was created, then only a single item
> is inserted into the filter. Filters spanning across several blocks would
> also see savings due to the usage of input scripts.
>=20
> Another advantage of using input scripts is that it allows rescans =
where all
> keys are known ahead of time to proceed in parallel, which can serve =
to
> greatly speed up rescans in bitcoind. Additionally, it allows light =
clients
> to participate in protocols like atomic swaps using the input scripts =
as
> triggers for state transitions. If outpoints were used, then the party =
that
> initiated the swap would need to send the cooperating party all =
possible
> txid's that may be generated due to fee bumps (RBF or sighash single
> tricks). Using the script, the light client simply waits for it to be
> revealed in a block (P2WSH) and then it can carry on the protocol.
>=20
> > Clear advantages of moving to spent outpoint + output script filter:
>=20
> > 1. Filter correctness can be proven by downloading the block in =
question only.
>=20
> Yep, as is they can verify half the filter. With auxiliary data, they =
can
> verify the entire thing. Once committed, they don't need to verify at =
all.
> We're repeating a discussion that played out 7 months ago with no new
> information or context.
>=20
> > 2. Calculation of the filter on server side does not need UTXO.
>=20
> This is incorrect. Filter calculation can use the spentness journal =
(or undo
> blocks) that many full node implementations utilize.
>=20
> > This certainly improves with a commitment, but that is not even on =
the
> > roadmap yet, or is it?
>=20
> I don't really know of any sort of roadmaps in Bitcoin development. =
However,
> I think there's relatively strong support to adding a commitment, once =
the
> current protocol gets more usage in the wild, which it already is =
today on
> mainnet.
>=20
> > Should a filter be committed that contains spent outpoints, then =
such
> > filter would be even more useful
>=20
> Indeed, this can be added as a new filter type, optionally adding =
created
> outpoints as you referenced in your prior email.
>=20
> > Since Bitcoin Core is not yet serving any filters, I do not think =
this
> > discussion is too late.
>=20
> See my reply to Matt on the current state of deployment. It's also the case
> that bitcoind isn't the only full node implementation used in the wild.
> Further changes would also serve to delay inclusion into bitcoind. The
> individuals proposing these PRs to bitcoind participated in this
> discussion 7 months ago (along with many of the contributors to this
> project). Based on this conversation 7 months ago, it's my understanding
> that all parties are aware of the options and tradeoffs to be had.
>=20
> -- Laolu
>=20
>=20
> On Tue, Feb 5, 2019 at 12:10 PM Tamas Blummer =
<tamas.blummer@gmail.com> wrote:
> Hi Laolu,
>=20
> The only advantage I see in the current design choice is filter size, but even that is less
> impressive in recent history and going forward, as address re-use is much less frequent nowadays
> than it was in Bitcoin’s early days.
>=20
> I calculated total filter sizes since block 500,000:
>=20
> input script + output script (current BIP): 1.09 GB=20
> spent outpoint + output script: 1.26 GB
>=20
> Both filters are equally useful for a wallet to discover relevant transactions, but the current design
> choice seriously limits, and in practice prevents, a light client from proving that the filter is correct.
>=20
> Clear advantages of moving to spent outpoint + output script filter:
>=20
> 1. Filter correctness can be proven by downloading the block in =
question only.
> 2. Calculation of the filter on server side does not need UTXO.
> 3. Spent outpoints in the filter enable light clients to do further =
probabilistic checks and even more if committed.
>=20
> The current design choice offers lower security than now attainable. =
This certainly improves with=20
> a commitment, but that is not even on the roadmap yet, or is it?
>=20
> Should a filter be committed that contains spent outpoints, then such filter would be even more useful:
> A client could decide on the availability of the coins spent by a transaction without maintaining the UTXO set, by
> checking the filters for whether the coin was spent after its origin was proven in an SPV manner, eliminating any false positives
> with a block download. This would be slower than having the UTXO set but would require only an immutable store, no unwinds, and
> the download of only a few blocks.
>=20
> Since Bitcoin Core is not yet serving any filters, I do not think this =
discussion is too late.
>=20
> Tamas Blummer
>=20
>=20
> > On Feb 5, 2019, at 02:42, Olaoluwa Osuntokun <laolu32@gmail.com> =
wrote:
> >=20
> > Hi Tamas,=20
> >=20
> > This is how the filter worked before the switch over to optimize for =
a
> > filter containing the minimal items needed for a regular wallet to =
function.
> > When this was proposed, I had already implemented the entire =
proposal from
> > wallet to full-node. At that point, we all more or less decided that =
the
> > space savings (along with intra-block compression) were worthwhile, =
we
> > weren't cutting off any anticipated application level use cases (at =
that
> > point we had already comprehensively integrated both filters into =
lnd), and
> > that once committed the security loss would disappear.
> >=20
> > I think it's too late into the current deployment of the BIPs to =
change
> > things around yet again. Instead, the BIP already has measures in =
place for
> > adding _new_ filter types in the future. This along with a few other =
filter
> > types may be worthwhile additions as new filter types.
> >=20
> > -- Laolu
> >=20
> > On Mon, Feb 4, 2019 at 12:59 PM Tamas Blummer =
<tamas.blummer@gmail.com> wrote:
> > I participated in that discussion in 2018, but did not yet have the insight I have since gathered through writing both client and server implementations of BIP157/158.
> >=20
> > Pieter Wuille considered the design choice I am now suggesting here as alternative (a) in: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-June/016064.html
> > In his evaluation he recognized that a filter having spent output =
and output scripts would allow decision on filter correctness by knowing =
the block only.
> > He did not evaluate the usefulness in the context of checkpoints, =
which I think are an important shortcut here.
> >=20
> > Yes, a filter that is collecting input and output scripts is shorter =
if script re-use is frequent, but I showed back in 2018 in the same =
thread that this saving is not that significant in recent history as =
address reuse is no longer that frequent.
> >=20
> > A filter on spent outpoint is just as useful for wallets as is one =
on spent script, since they naturally scan the blockchain forward and =
thereby learn about their coins by the output script before they need to =
check spends of those outpoints.
> >=20
> > It seems to me that implementing an interrogation by possibly downloading blocks at checkpoints is much simpler than following multiple possible filter paths.
> >=20
> > A spent outpoint filter allows us to decide on coin availability based on an immutable store, without a UTXO store that is updated and possibly rolled back. The availability could be decided by following the filter path from the current tip to genesis and
> > checking whether the outpoint was spent earlier. False positives can be sorted out with a block download. Murmel implements this if running in server mode, where blocks are already there.
> >=20
> > Therefore I ask for a BIP change based on better insight gained =
through implementation.
> >=20
> > Tamas Blummer
> >=20
> >> On Feb 4, 2019, at 21:18, Jim Posen <jim.posen@gmail.com> wrote:
> >>=20
> >> Please see the thread "BIP 158 Flexibility and Filter Size" from =
2018 regarding the decision to remove outpoints from the filter [1].
> >>=20
> >> Thanks for bringing this up though, because more discussion is =
needed on the client protocol given that clients cannot reliably =
determine the integrity of a block filter in a bandwidth-efficient =
manner (due to the inclusion of input scripts).
> >>=20
> >> I see three possibilities:
> >> 1) Introduce a new P2P message to retrieve all prev-outputs for a =
given block (essentially the undo data in Core), and verify the scripts =
against the block by executing them. While this permits some forms of =
input script malleability (and thus cannot discriminate between all =
valid and invalid filters), it restricts what an attacker can do. This =
was proposed by Laolu AFAIK, and I believe this is how btcd is =
proceeding.
> >> 2) Clients track multiple possible filter header chains and =
essentially consider the union of their matches. So if any filter =
received for a particular block header matches, the client downloads the =
block. The client can ban a peer if they 1) ever return a filter =
omitting some data that is observed in the downloaded block, 2) =
repeatedly serve filters that trigger false positive block downloads =
where such a number of false positives is statistically unlikely, or 3) =
repeatedly serves filters that are significantly larger than the =
expected size (essentially padding the actual filters with garbage to =
waste bandwidth). I have not done the analysis yet, but we should be =
able to come up with some fairly simple banning heuristics using =
Chernoff bounds. The main downside is that the client logic to track =
multiple possible filter chains and filters per block is more complex =
and bandwidth increases if connected to a malicious server. I first =
heard about this idea from David Harding.
> >> 3) Rush straight to committing the filters into the chain (via =
witness reserved value or coinbase OP_RETURN) and give up on the =
pre-softfork BIP 157 P2P mode.
> >>=20
> >> I'm in favor of option #2 despite the downsides since it requires =
the smallest number of changes and is supported by the BIP 157 P2P =
protocol as currently written. (Though the recommended client protocol =
in the BIP needs to be updated to account for this). Another benefit of =
it is that it removes some synchronicity assumptions where a peer with =
the correct filters keeps timing out and is assumed to be dishonest, =
while the dishonest peer is assumed to be OK because it is responsive.
> >>=20
> >> If anyone has other ideas, I'd love to hear them.
> >>=20
> >> -jimpo
> >>=20
> >> [1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-June/016057.html
> >>=20
> >>=20
> >>=20
> >> On Mon, Feb 4, 2019 at 10:53 AM Tamas Blummer via bitcoin-dev =
<bitcoin-dev@lists.linuxfoundation.org> wrote:
> >> TLDR: a change to BIP158 would allow deciding which filter chain is correct at lower bandwidth use
> >>=20
> >> Assume there is a BIP157 client that learned a filter header chain =
earlier and is now offered an alternate reality by a newly connected =
BIP157 server.
> >>=20
> >> The client notices the alternate reality by routinely asking for filter chain checkpoints after connecting to a new BIP157 server. A divergence at a checkpoint means that the server disagrees with the client's history at or before the first diverging checkpoint. The client would then request the filter headers between the last matching and first divergent checkpoint, quickly figure out which block’s filter is the first that does not match the previous assumption, and request that filter from the server.
> >>=20
> >> The client downloads the corresponding block, checks that its header fits the PoW-secured best header chain, re-calculates the merkle root of its transaction list to know that it is complete, and queries the filter to see if every output script of every transaction is contained in it. If not, the server is lying: the case is closed and the server is disconnected.
> >>=20
> >> Having all output scripts in the filter does not, however, guarantee that the filter is correct, since it might omit input scripts. Input scripts are not part of the downloaded block, but are in some blocks before it. Checking those is out of reach for a lightweight client with the tools given by the current BIP.
> >>=20
> >> A remedy here would be another filter chain on created and spent outpoints, as is currently implemented by Murmel. The outpoint filter chain must offer a match for every spent output of the block with the divergent filter, otherwise the interrogated server is lying, since a PoW-secured block cannot spend coins out of nowhere. Doing this check would already force the client to download the outpoint filter history up to the point of divergence. Then the client would have to download and PoW-check every block that shows a match in outpoints until it figures out that one of the spent outputs has a script that was not in the server’s filter, in which case the server is lying. If everything checks out, then the previous assumption on filter history was incorrect and should be replaced by the history offered by the interrogated server.
> >>=20
> >> As you see, the interrogation works with this added filter but is highly inefficient. A really light client should not be forced to download lots of blocks just to uncover a lying filter server. This would actually be an easy DoS on light BIP157 clients.
> >>=20
> >> A better solution is a change to BIP158 such that the only filter contains created scripts and spent outpoints. It appears to me that this would serve both wallets and the interrogation of filter servers well:
> >>=20
> >> Wallets would recognize payments to their addresses by the filter, as output scripts are included. Spends from the wallet would also be recognized, as a wallet already knows the outpoints of its previously received coins, so it can query the filters for them.
> >>=20
> >> Interrogation of a filter server also simplifies, since the filter of the block can be checked entirely against the contents of the same block. The decision on filter correctness does not require more bandwidth than the download of a block at the mismatching checkpoint. The client could only be forced to download at most 1/1000th of the blockchain in addition to the filter header history.
> >>=20
> >> Therefore I suggest changing BIP158 to have a base filter, defined as:
> >>=20
> >> A basic filter MUST contain exactly the following items for each transaction in a block:
> >>         • Spent outpoints
> >>         • The scriptPubKey of each output, aside from all OP_RETURN output scripts.
> >>=20
> >> Tamas Blummer
> >> _______________________________________________
> >> bitcoin-dev mailing list
> >> bitcoin-dev@lists.linuxfoundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
> >=20
>=20
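
The checkpoint comparison described in the first message of the quoted
thread (find the first divergent checkpoint, then the first divergent
filter header inside that interval) could be sketched as follows. This is
only an illustration under stated assumptions: the function names, the
two fetch callbacks standing in for requests to the server, and the list
layout (index = block height) are hypothetical, and the checkpoint spacing
is passed as a parameter rather than taken from the BIP.

```python
from typing import Callable, Optional, Sequence


def first_mismatch(known: Sequence[bytes], offered: Sequence[bytes]) -> Optional[int]:
    """Index of the first position where the two sequences differ, if any."""
    for i, (a, b) in enumerate(zip(known, offered)):
        if a != b:
            return i
    return None


def locate_divergent_height(
    known_headers: Sequence[bytes],                       # index = block height
    fetch_checkpoints: Callable[[], Sequence[bytes]],      # hypothetical server call
    fetch_headers: Callable[[int, int], Sequence[bytes]],  # inclusive height range
    interval: int = 1000,
) -> Optional[int]:
    """Height of the first filter header the server disagrees on, or None."""
    # Checkpoints the client already knows: heights interval, 2*interval, ...
    known_cps = [known_headers[h] for h in range(interval, len(known_headers), interval)]
    cp = first_mismatch(known_cps, fetch_checkpoints())
    if cp is None:
        # Headers above the last checkpoint are not compared in this sketch.
        return None
    start = cp * interval        # last matching checkpoint (or genesis)
    stop = (cp + 1) * interval   # first divergent checkpoint
    served = fetch_headers(start, stop)
    idx = first_mismatch(known_headers[start:stop + 1], served)
    return None if idx is None else start + idx
```

The client would then request the filter and the block at the returned
height and check one against the other. Only one interval of filter
headers is fetched, however long the chain is.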