Date: Sun, 30 Oct 2022 12:24:43 +1000
From: Anthony Towns <aj@erisian.com.au>
To: Antoine Riard <antoine.riard@gmail.com>,
 Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Message-ID: <Y13gawzYp9k/8oiB@erisian.com.au>
References: <Y1nIKjQC3DkiSGyw@erisian.com.au>
 <CALZpt+EyeL5JjE_bctrcqsRBQkZhu0X1=ChGbyekeqbms1GWtQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CALZpt+EyeL5JjE_bctrcqsRBQkZhu0X1=ChGbyekeqbms1GWtQ@mail.gmail.com>
Subject: Re: [bitcoin-dev] On mempool policy consistency
Precedence: list

On Thu, Oct 27, 2022 at 09:29:47PM +0100, Antoine Riard via bitcoin-dev wrote:
> Let's take the contra.

(I don't think I know that phrase? Is it like "play devil's advocate"?)

> I would say the current post describes the state of Bitcoin Core and
> beyond policy
> rules with a high-degree of exhaustivity and completeness, though itt what
> is, mostly a description. While I think it ends with a set of
> recommendations

It was only intended as a description, not a recommendation for anything.

At this point, the only thing I think I could honestly recommend
that doesn't seem like it comes with massive downsides, is for core to
recommend and implement a particular mempool policy, and only have options
that either make it feasible to scale that policy to different hardware
limitations, and provide options that users can activate en-masse if it
turns out people are doing crazy things in the mempool (eg, a new policy
turns out to be ill-conceived, and it's better to revert to a previous
policy; or a potential spam vector gets exploited at scale).

> What should be actually the design goals and
> principles of Core's transaction-relay propagation rules
> of which mempool accepts ones is a subset?

I think the goals of mempool/relay policy are _really_ simple; namely:

 * relay transactions from users to all potential miners, so that
   non-mining nodes don't have to directly contact miners to announce
   their tx, both for efficiency (your tx can appear in the next block
   anyone mines, rather than just the next block you mine) and privacy
   (so that miners don't know who a transaction belongs to, so that
   users don't have to know who miners are, and so there's no/minimal
   correlation between who proposed a tx and who mined the block it
   appears in)

 * having most of the data that makes up the next block pre-validated
   and pre-distributed throughout the network, so that block validation
   and relay is much more efficient

> By such design goals, I'm
> thinking either, a qualitative framework, like attacks game for a concrete
> application ("Can we prevent pinning against multi-party Coinjoin ?").

I don't think that even makes sense as a question at that level: you can
only ask questions like that if you already have known mempool policies
across the majority of nodes and miners. If you don't, you have to allow
for the possibility that 99% of hashrate is receiving private blacklists
from OFAC and that one of your coinjoin counterparties is on that list,
eg, and at that point, I don't think pinning is even conceivably solvable.

> I believe we would come up with a
> second-order observation. That we might not be able to satisfy every
> use-case with the standard set of policy rules. E.g, a contracting protocol
> could look for package size beyond the upper bound anti-Dos limit.

One reason that limit is in place is that it the larger the tx is
compared to the block limit, the more likely you are to hit corner cases
where greedily filling a block with the highest fee ratex txs first
is significantly suboptimal. That might mean, eg, that there's 410kvB
of higher fee rate txs than your 600kvB large package, and that your
stuff gets delayed, because the miner isn't clever enough to realise
dropping the 10kvB is worthwhile. Or it might mean that your tx gets
delayed because the complicated analysis takes a minute to run and a
block was mined using the simpler algorithm first. Or it might mean that
some mining startup with clever proprietary software that can calculate
this stuff quickly make substantially more profit than everyone else,
so they start dominating block generation, despite the fact that they're
censoring transactions due to OFAC rules.

> Or even the
> global ressources offered by the network of full-nodes are not high enough
> to handle some application event.

Blocks are limited on average to 4MB-per-10-minutes (6.7kB/second),
and applications certainly shouldn't be being designed to only work if
they can monopolise the entirety of the next few blocks. I don't think
it makes any sense to imagine application events in Bitcoin that exceed
the capacity of a random full node. And equally, even if you're talking
about some other blockchain with higher capacity; I don't really think
it makes sense to call it a "full" node if it can't actually cope with
the demands placed on it by any application that works on that network.

> E.g a Lightning Service Provider doing a
> liquidity maintenance round of all its counterparties, and as such
> force-closing and broadcasting more transactions than can be handled at the
> transaction-relay layer due to default MAX_PEER_TX_ANNOUNCEMENTS value.

MAX_PEER_TX_ANNOUNCEMENTS is 5000 txs, and it's per-peer. If you're an
LSP that's doing that much work, it seems likely that you'd at least
be running a long-lived listening node, so likely have 100+ peers, and
could conceivably simultaneously announce 500k txs distributed across
them, which at 130vB each (1-taproot-in, 2-p2wpkh out, which I think is
pretty minimal) adds up to 65 blocks worth of transactions. And then,
you could run more than one node, as well.

Your real limitation is likely that most nodes on the network
will only relay your txs onwards at an average rate of ~7/second
(INVENTORY_BROADCAST_PER_SECOND), so even 5000 txs will likely take over
700s to propagate anyway.

> My personal take on those subjects, we might have to realize we're facing
> an heterogeneity of Bitcoin applications and use-cases [1].

Sure; but that's why you make your policy generic, rather than having
to introduce a new, different policy targeted at each new use case.

> And this sounds
> like a long-term upward trend, akin to the history of the Internet: mail
> clients, web browser, streaming applications, etc, all with different
> service-level requirements in terms of latency, jitters and bandwidth.

Back in the mid/late 90s, people argued that real-time communication,
(like audio chat, let alone streaming video) wasn't suitable for IP,
but would require a different network like ATM where dedicated circuits
were established between the sender and recipient to avoid latency,
jitter and bandwidth competition. Turns out that separate networks
weren't optimal for that.

> To put it simply, some advanced Bitcoin
> applications might have to outgrow the "mempool policy rules" game,

I think if you stick to the fundamentals -- that relay/mempool is about
getting transactions to miners and widely preseeding the contents of
whatever the next block will be -- then it's pretty unlikely that any
Bitcoin application will outgrow the mempool policy game.

> I think this has been historically the case with
> some miners deciding to join FIBER, to improve their view of mined blocks.

FIBRE (it doesn't use the US spelling) is a speedup on top of compact
block relay -- it still gets you exactly the same data if you don't use,
it's just everything is slightly faster if you do. Even better, if you
get a block via FIBRE, then you relay it on to your peers over regular
p2p, helping them get it faster too.

Doing something similar with mempool txs -- having some high bandwidth
overlay network where the edges then feed txs back into the main p2p
network at a slower rate that filters out spam or whatever -- would
probably likewise be a fine addition to bitcoin, provided it had the
same policy rules as regular bitcoin nodes employ for tx relay. If it
had different ones, it would become a signficant centralisation risk: app
developers who make use of the different rules would need to comply with
the overlay networks ToS to avoid getting banned, and miners would need
to subscribe to the feed to avoid missing out on txs and thus fee income.

> What I'm expressing is a long-term perspective, and we might be too early
> in the evolutionary process that is Bitcoin Core development to abandon yet
> the "one-size-fits-all" policy rules conception that I understand from
> your post.

I don't think "one-size-fits-all" is a principle at all; I think
decentralisation/censorship-resistance, privacy, and efficiency are the
fundamental principles. As far as I can see, a one-size-fits-all approach
(or, more precisely, an approach where >90% of the network converges to
the same set of rules) is far better at achieving those principles than
a heterogenous policy approach.

> After exposure and exploration of more Bitcoin use-cases and applications,
> and even among the different requirement among types of use-cases nodes
> (e.g LN mobile vs LSP), I believe more heterogeneity in the policy rules
> usage makes more sense

I think when you say "more heterogeneity" what you're actually assuming
is that miners will implement a superset of all those policies, so that
if *any* node on the network accepts a tx X, *every* miner will also
accept a tx X, with the only exception being if there's some conflicting
tx Y that allows the miner to collect more fees.

But that's not really a heterogenous policy: in that case all miners
are employing exactly the same policy.

In that scenario, I don't think you'll end up with nodes running
heteregenous policies either: part of the point of having mempool
policies is to predict the next block, so if all miners really do have
a common policy, it makes sense for nodes to have the same policy. The
only potential difference is miners might be willing to dedicate more
resources, so might set some knobs related to memory/bandwidth/cpu
consumption differently.

I think what you're actually assuming is that this scenario will mean
that miners will quickly expand their shared policy to accept *any*
set of txs that are accepted by a small minority of relay nodes: after
all, if there are some txs out there that pay fees, why wouldn't miners
want to include them? That's what incentive compatible means, right? And
that's great from a protocol reasearch point-of-view: it allows you to
handwave away people complaining that your idea is bad -- by assumption,
all you need to do is deploy it, and it immediately starts working,
without anyone else needing to adopt it.

I don't think that's actually a realistic assumption though: first,
updating miners' policy rules requires updated software to be tested
and deployed, so isn't trivial enough that it should be handwaved away,
second, as in the "big packages" example above, constructing an efficient
block becomes harder the more mempool rules you throw away, so even if
there are txs violating those rules that are offering extra fees, they
may not actually cover the extra costs to generate a block when you're
no longer able to rely on those rules to reduce the complexity of the
problem space.

Note also that "relay nodes will want to use the same policy as mining
nodes" goes both ways -- if that doesn't happen, and compact block
relay requires an extra round trip to reconstruct the block, miners'
blocks won't relay as quickly, and they'll have an increased orphan rate.

Cheers,
aj