MIME-Version: 1.0
From: Mark Friedenbach <mark@friedenbach.org>
Date: Wed, 4 Nov 2015 14:47:35 -0800
Message-ID: <CAOG=w-tR89zm_VDCR-MR_+F9bRNvm4TCZSQcTmYGKRW1JQhkUg@mail.gmail.com>
To: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: multipart/alternative; boundary=001a1140f8d41797e70523becd4b
Subject: [bitcoin-dev] A validation-cost metric for aggregate limits and fee
	determination
Precedence: list

--001a1140f8d41797e70523becd4b
Content-Type: text/plain; charset=UTF-8

At the first Scaling Bitcoin workshop in Montreal I presented on the topic
of "bad blocks" that take an excessive amount of time to validate. You can
read a transcript of this talk here:

http://diyhpl.us/wiki/transcripts/scalingbitcoin/alternatives-to-block-size-as-aggregate-resource-limits/

The core message was that the assumption made by the design parameters of
the system, namely that validation costs scale linearly with transaction or
block size, is wrong. In particular, in certain kinds of transactions there
are validation costs which scale quadratically with size. For example, the
construction of SIGHASH_ALL results in each input signing a different
message digest, meaning that the entire transaction (minus the scriptSigs)
is rehashed for each input. As another example, the number of signature
operations performed during block validation is unlimited if the
validations are contained within the scriptPubKey (this scales linearly but
with a very large constant factor). The severity of these issues increases
as the
aggregate limits in place on maximum transaction and block size increase.
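
To make the quadratic case concrete, here is a rough Python sketch of the
SIGHASH_ALL rehashing cost. It is a simplified model, not consensus code:
the byte counts and the function names are illustrative assumptions, and it
ignores details such as the script substituted into the input being signed.

```python
# Simplified model of legacy SIGHASH_ALL hashing cost.
# All byte counts are illustrative assumptions, not exact serialization rules.
TX_OVERHEAD = 10      # version, input/output counts, locktime
INPUT_STRIPPED = 41   # outpoint + empty script length + sequence
OUTPUT_SIZE = 34      # value + script length + a 25-byte script


def stripped_size(n_inputs, n_outputs):
    """Approximate size of the transaction with all scriptSigs removed."""
    return TX_OVERHEAD + n_inputs * INPUT_STRIPPED + n_outputs * OUTPUT_SIZE


def bytes_hashed_sighash_all(n_inputs, n_outputs):
    """Each input signs its own digest, so the stripped transaction is
    hashed once per input: roughly n_inputs * stripped_size bytes."""
    return n_inputs * stripped_size(n_inputs, n_outputs)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, stripped_size(n, 1), bytes_hashed_sighash_all(n, 1))
```

In this model the stripped size grows linearly with the number of inputs,
while the bytes hashed grow with its square: going from 100 to 1000 inputs
multiplies the hashing work by close to 100.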

There have been various solutions suggested, and I would like to start a
public discussion to see if consensus can be reached over a viable approach.

Gavin, for example, has written code that tracks the number of bytes hashed
and enforces a separate limit for a block over this aggregate value. Other
costs could be constrained in a similar whack-a-mole way. I have two
concerns with this approach:

1. There would still exist a gap between the average-case validation cost
of a full block and the worst-case validation cost of a block that was
specifically constructed to hit every limit.

2. Transaction selection and by extension fee determination would become
much more complicated multi-dimensional optimization problems. Since fee
management in particular is code replicated in a lot of infrastructure, I
would be very concerned about making optimal behavior much more difficult.

My own suggestion, which I submit for consideration, is to use a linear
function of the various costs involved (signatures verified, bytes hashed,
inputs consumed, script opcodes executed, etc.). The various algorithms
used for transaction selection and fee determination can then be reused,
using the output of this new linear function as the "size" of the
transaction.
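
As a minimal sketch of this idea in Python: the weights below are
placeholders chosen only for illustration (no actual coefficients are
proposed here), and TxCosts, validation_cost and select_transactions are
hypothetical names. The point is that a greedy fee-per-byte selection
carries over unchanged once "bytes" is replaced by the single linear cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TxCosts:
    """Per-transaction cost counters (hypothetical structure)."""
    sigs_verified: int
    bytes_hashed: int
    inputs_consumed: int
    opcodes_executed: int


# Placeholder weights: assumptions for illustration, not proposed values.
WEIGHTS = {
    "sigs_verified": 50,
    "bytes_hashed": 1,
    "inputs_consumed": 20,
    "opcodes_executed": 1,
}


def validation_cost(tx):
    """Linear combination of the individual costs; plays the role of 'size'."""
    return sum(weight * getattr(tx, name) for name, weight in WEIGHTS.items())


def select_transactions(candidates, max_block_cost):
    """Greedy selection by fee per unit of cost, the same shape as a
    fee-per-byte selection. candidates is a list of (fee, TxCosts) pairs."""
    by_fee_rate = sorted(
        candidates,
        key=lambda c: c[0] / validation_cost(c[1]),
        reverse=True,
    )
    chosen, used = [], 0
    for fee, tx in by_fee_rate:
        cost = validation_cost(tx)
        if used + cost <= max_block_cost:
            chosen.append((fee, tx))
            used += cost
    return chosen, used
```

Because there is one scalar per transaction and one aggregate limit per
block, selection stays a one-dimensional problem rather than a separate
constraint per resource.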

Separately, many others including Greg Maxwell have advocated for a
"net-UTXO" metric instead of, or in combination with, a validation-cost
metric. In the pure form the block size limit would be replaced with a
maximum UTXO set increase, thereby applying a cost in extra fee required to
create unspent outputs. This has the distinct advantage of making dust
outputs considerably more expensive than regular spend outputs.
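
A small Python sketch of the pure form, under the assumption that the
metric simply counts outputs created minus inputs consumed (it could
equally be measured in bytes; the function names and the limit are
hypothetical):

```python
def net_utxo_delta(n_inputs, n_outputs):
    """Change in UTXO set size caused by one transaction (count-based
    assumption): outputs created minus inputs consumed."""
    return n_outputs - n_inputs


def block_within_utxo_limit(txs, max_utxo_increase):
    """txs is a list of (n_inputs, n_outputs) pairs. The block is acceptable
    if its total net UTXO growth does not exceed the limit."""
    return sum(net_utxo_delta(i, o) for i, o in txs) <= max_utxo_increase
```

Under this counting, a transaction that fans one input out into 50 outputs
uses 49 units of the limit, while one that consolidates 20 inputs into a
single output contributes -19 and frees up room for other transactions.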

For myself, I remain open to the possibility of adding a UTXO set size
corrective factor to a chiefly validation-cost metric. It would be nice to
reward users for cleaning up scattered small outputs, reward miners for
including dust-be-gone outputs, and make spam attacks more costly. But
doing so requires setting aside some unused validation resources in order
to reward miners who clean up the UTXO set, which means it widens the gap
between average- and worst-case block validation times. Also, worry over
the size of the UTXO database is only a concern for how Bitcoin Core is
currently structured -- with e.g. UTXO or STXO commitments it could be the
case that in the future full nodes do not store the UTXO set and instead
carry proofs of their inputs as prunable witness data. If we choose a
net-UTXO metric, however, we will be stuck with it for some time.

I will be submitting a talk proposal for Scaling Bitcoin on this topic, but
I would like to get some feedback from the developer community first.
Anyone have any thoughts to add?

--001a1140f8d41797e70523becd4b--