Date: Fri, 27 Oct 2023 17:00:36 +1000
From: Anthony Towns <aj@erisian.com.au>
To: Rusty Russell <rusty@rustcorp.com.au>,
 Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Message-ID: <ZTtgFPG4tTeZMnYn@erisian.com.au>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87v8b2vu4q.fsf@rustcorp.com.au>
Subject: Re: [bitcoin-dev] Examining ScriptPubkeys in Bitcoin Script
X-List-Received-Date: Fri, 27 Oct 2023 07:00:52 -0000

On Fri, Oct 20, 2023 at 02:10:37PM +1030, Rusty Russell via bitcoin-dev wrote:
>         I've done an exploration of what would be required (given
> OP_TX/OP_TXHASH or equivalent way of pushing a scriptPubkey on the
> stack) to usefully validate Taproot outputs in Bitcoin Script.  Such
> functionality is required for usable vaults, at least.
> 
>         https://rusty.ozlabs.org/2023/10/20/examining-scriptpubkey-in-script.html
> 
> (If anyone wants to collaborate to produce a prototype, and debug my
> surely-wrong script examples, please ping me!)
> 
> TL;DR: if we have OP_TXHASH/OP_TX, and add OP_MULTISHA256 (or OP_CAT),
> OP_KEYADDTWEAK and OP_LESS (or OP_CONDSWAP), and soft-fork weaken the
> OP_SUCCESSx rule (or pop-script-from-stack), we can prove a two-leaf
> tapscript tree in about 110 bytes of Script.  This allows useful
> spending constraints based on a template approach.

I think there are two reasons to think about this approach:

 (a) we want to do vault operations specifically, and this approach is
     a good balance between being:
       - easy to specify and implement correctly, and
       - easy to use correctly.

 (b) we want to make bitcoin more programmable, so that we can do
     contracting experiments directly in wallet software, without needing
     to justify new soft forks for each experiment, and this approach
     provides a good balance amongst:
       - opening up a wide range of interesting experiments,
       - making it easy to understand the scope/consequences of opening up
         those experiments,
       - being easy to specify and implement correctly, and
       - being easy to use correctly.

Hopefully that's a fair summary? Obviously what balance is "good"
is always a matter of opinion -- if you consider it hard to do soft
forks, then it's perhaps better to err heavily towards being easy to
specify/implement, rather than easy to use, for example.

For (a) I'm pretty skeptical about this approach for vault operations
-- it's not terribly easy to specify/implement (needing 5 opcodes, one
of which has a dozen or so flags controlling how it behaves, and also
needing to change the way OP_SUCCESS works), and it seems super
complicated to use.

By comparison, while the BIP 345 OP_VAULT proposal also proposes 3 new
opcodes (OP_CTV, OP_VAULT, OP_VAULT_RECOVER) [0], those opcodes can be
implemented fairly directly (without requiring different semantics for
OP_SUCCESS, eg) and can be used much more easily [1].

[0] Or perhaps 4, if OP_REVAULT were to be separated out from OP_VAULT, cf
    https://github.com/bitcoin/bips/pull/1421#discussion_r1357788739

[1] https://github.com/jamesob/opvault-demo/blob/57f3bb6b8717acc7ce1eae9d9d8a2661f6fa54e5/main.py#L125-L133

I'm not sure, but I think the "deferred check" setup might also
provide additional functionality beyond what you get from cross-input
introspection; that is, with it, you can allow multiple inputs to safely
contribute funds to common outputs, without someone being able to combine
multiple inputs into a tx where the output amount is less than the sum
of all the contributions. Without that feature, you can mimic it, but
only so long as all the input scripts follow known templates that you
can exactly match.
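
A minimal sketch of the difference, assuming a hypothetical model where
each input declares an (output index, amount) contribution; the function
names are illustrative and this is not the BIP 345 algorithm itself.
Checking each input on its own lets two inputs point at the same output
and both pass, while aggregating the contributions first catches the
shortfall:

```python
from collections import defaultdict

def check_outputs_naive(contributions, output_amounts):
    # Per-input check: each input only looks at its own contribution.
    return all(output_amounts[idx] >= amount for idx, amount in contributions)

def check_outputs_deferred(contributions, output_amounts):
    # Deferred check: sum every input's contribution per output first,
    # then verify each output once against the total.
    required = defaultdict(int)
    for idx, amount in contributions:
        required[idx] += amount
    return all(
        0 <= idx < len(output_amounts) and output_amounts[idx] >= total
        for idx, total in required.items()
    )

# Two inputs contributing 5 and 7 to output 0, which only pays 7.
contribs = [(0, 5), (0, 7)]
naive_ok = check_outputs_naive(contribs, [7])
deferred_ok = check_outputs_deferred(contribs, [7])
```

Here the naive check accepts the transaction and the aggregated check
rejects it.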

So to me, for the vault use case, the
TXHASH/MULTISHA256/KEYADDTWEAK/LESS/CAT/OP_SUCCESS approach just doesn't
really seem very appealing at all in practical terms: lots of complexity,
hard to use, and doesn't really seem like it works very well even after
you put in tonnes of effort to get it to work at all?


I think in the context of (b), ie enabling experimentation more generally,
it's much more interesting. eg, CAT alone would allow for various
interesting constraints on signatures ("you must sign this tx with the
given R value -- so attempting to double spend, eg via a feebump, will
reveal the corresponding private key"), and adding CSFS would allow you
to include authenticated data in a script, eg market data sourced from
a trusted oracle.
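
The key-revealing property of a fixed R value can be sketched with plain
modular arithmetic. This assumes the Schnorr relation s = k + e*d (mod n)
and uses a simplified challenge hash over placeholder bytes for R instead
of real curve points and BIP340's tagged hash, so it illustrates the
algebra only:

```python
import hashlib

# Order of the secp256k1 group.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def challenge(r_bytes, msg):
    # Simplified challenge e = H(R || msg) mod n (an assumption; BIP340
    # also commits to the public key and uses a tagged hash).
    return int.from_bytes(hashlib.sha256(r_bytes + msg).digest(), "big") % N

def sign(d, k, r_bytes, msg):
    # s = k + e*d mod n, with nonce k and private key d.
    return (k + challenge(r_bytes, msg) * d) % N

def recover_key(r_bytes, msg1, s1, msg2, s2):
    # Same k in both signatures: s1 - s2 = (e1 - e2) * d mod n.
    e1, e2 = challenge(r_bytes, msg1), challenge(r_bytes, msg2)
    return (s1 - s2) * pow((e1 - e2) % N, -1, N) % N

d, k = 0x1234567890ABCDEF, 0x0FEDCBA987654321
r = b"placeholder-for-R"
s1 = sign(d, k, r, b"original tx")
s2 = sign(d, k, r, b"fee-bumped tx")
recovered = recover_key(r, b"original tx", s1, b"fee-bumped tx", s2)
```

With two signatures sharing R over different messages, `recovered`
equals `d`.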

But even then, it still seems fairly crippled -- script is a very
limited programming language, and it just isn't really very helpful
if you want to do things that are novel. It doesn't allow you to (eg)
loop over the inputs and select just the ones you're interested in, you
need the opcode to do the looping for you, and that has to be hardcoded
as a matter of consensus (eg, Steven Roose's TXHASH [2] proposal allows
you to select the first-n inputs/outputs, but not the last-n).

[2] https://github.com/bitcoin/bips/pull/1500

I've said previously [3] that I think using a lisp variant would
be a promising solution here: you can replace script's "two stacks
of byte-strings" with "(recursive) lists of byte-strings", and go
from a fairly limited language, to a fairly complete one. I've been
experimenting with this on and off since then [4], and so far I haven't
seen anything much to dissuade me from that view. I think you can get
a pretty effective language with perhaps 43 opcodes [5] (compared to
script's ~60 active opcodes), and I don't think you need to do anything
too fancy to implement it.

[3] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-March/020036.html
[4] https://github.com/ajtowns/lisp-play/
[5] https://github.com/ajtowns/lisp-play/blob/5975870423f9dace902ef42208b965f9d8a0f005/btclisp.py#L738

Here's an example. I've included a "CSFS" equivalent opcode, namely
"(bip340_verify pk msg sig)" that validates a signature per BIP340,
and also a "(bip342_txmsg)" opcode that generates a "msg" corresponding
to the BIP 342 "Signature Validation" spec (just calling the bitcoind
core test framework code), which then allows me to verify existing
signatures on existing transactions via lisp code, rather than executing
the actual script.

But what if we wanted to experiment with a new SIGHASH mode? For that,
I've added an OP_TX like opcode, '(tx N)' that allows you to select
various information about the tx by choosing N -- '(tx 1)' gives you the
locktime, '(tx 10)' gives you your input's nSequence, '(tx (10 . 3))'
gives you the nSequence of the 4th input, eg. With that, it's possible
to select whichever bits of the transaction you like, in whatever order
you like, and pass the results through the '(sha256)' opcode, then pass
that into the signature check.
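
A rough Python model of that selector, covering only the two field codes
mentioned above (1 for locktime, 10 for nSequence). The dict layout of
the transaction, the 4-byte little-endian serialisation, and the function
names are assumptions made for illustration, not the testbed's actual
code:

```python
import hashlib
import struct

def tx_field(tx, cur_input, selector):
    # A bare code refers to the current input; a (code, index) pair
    # mirrors the '(tx (10 . 3))' form and picks an explicit input.
    if isinstance(selector, tuple):
        code, index = selector
    else:
        code, index = selector, cur_input
    if code == 1:
        return struct.pack("<I", tx["locktime"])
    if code == 10:
        return struct.pack("<I", tx["inputs"][index]["sequence"])
    raise ValueError("field code not modelled in this sketch")

def custom_sighash(tx, cur_input, selectors):
    # Concatenate the chosen fields in the chosen order, then hash.
    data = b"".join(tx_field(tx, cur_input, s) for s in selectors)
    return hashlib.sha256(data).digest()

tx = {"locktime": 500000,
      "inputs": [{"sequence": 0xFFFFFFFE - i} for i in range(5)]}
msg = custom_sighash(tx, 0, [1, 10, (10, 3)])
```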

Unlike the OP_TXHASH proposals and the like, it's possible (though perhaps
not *easy*) to exactly mimic existing hash constructs, eg "(bip342_txmsg)"
(for SIGHASH_ALL) can be constructed manually via:

ENV=(a (i 14 '(a 8 8 12 (+ 10 '1) (- 14 '1) (cat 3 (a 12 10))) '3))

  ^-- (basically a for loop, so that "(a 1 1 'X '0 K)" will
       invoke "X" with values [0, K), and cat the results
       together; used with K=(tx '2) to do inputs, and (tx '3) to
       do outputs)
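
For readers less used to the lisp form, the same loop can be sketched in
Python. The names are illustrative; the recursion mirrors the structure
described above (call X on the current index, append to the accumulator,
recurse with index+1 and count-1 until the count reaches zero):

```python
def cat_range(fn, index, remaining, acc=b""):
    # Calls fn on index, index+1, ... for `remaining` steps and
    # concatenates the byte-string results in order.
    if remaining:
        return cat_range(fn, index + 1, remaining - 1, acc + fn(index))
    return acc

# Hypothetical per-input data, standing in for something like (tx '15 i).
sequences = [b"\xfe\xff\xff\xff", b"\xff\xff\xff\xff", b"\x00\x00\x00\x00"]
all_sequences = cat_range(lambda i: sequences[i], 0, len(sequences))
```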

PROGRAM=(a '(sha256 4 4 '0x00 6 3) (sha256 '\"TapSighash\") (cat '0x00 (tx '0) (tx '1) (sha256 (a 1 1 '(cat (tx (c '11 1)) (tx (c '12 1))) '0 (tx '2) 'nil)) (sha256 (a 1 1 '(tx (c '15 1)) '0 (tx '2) 'nil)) (sha256 (a 1 1 '(a '(cat (strlen 1) 1) (tx (c '16 '0))) '0 (tx '2) 'nil)) (sha256 (a 1 1 '(tx (c '10 1)) '0 (tx '2) 'nil)) (sha256 (a 1 1 '(cat (tx (c '20 1)) (a '(cat (strlen 1) 1) (tx (c '21 1)))) '0 (tx '3) 'nil)) (i (tx '7) '0x03 '0x01) (substr (cat (tx '4) '0x00000000) 'nil '4) (i (tx '7) (sha256 (a '(cat (strlen 1) 1) (tx '7))) 'nil)) (cat (tx '6) '0x00 '0xffffffff))

  ^-- (sha256's the sha256 of TapSighash twice, then the epoch, then
       the sigmsg, then the extension; with the SIGHASH_ALL logic
       being hardcoded)
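
The outer hashing step follows the tagged-hash construction from BIP340
and BIP341, which can be sketched in Python as below. The helper names
are illustrative, and `sigmsg` and `extension` are taken as
already-serialised bytes rather than built from a transaction:

```python
import hashlib

def tagged_hash(tag, data):
    # sha256(sha256(tag) || sha256(tag) || data)
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + data).digest()

def tap_sighash(sigmsg, extension=b""):
    # Epoch byte 0x00, then the signature message, then the extension.
    return tagged_hash("TapSighash", b"\x00" + sigmsg + extension)

digest = tap_sighash(b"example sigmsg bytes", b"example extension bytes")
```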

That's obviously not easy to read, but it's also essentially programming
in assembler, and would be much improved by having a higher-level
macro-enabled lisp variation that allows you to define your own
symbols/variable names, and translate that down to the raw code. (Or
even just having a parser that allows you to add comments, I guess)

What I've implemented is essentially an eager interpreter with some tail
call optimisations to allow memory to be freed up a bit earlier. I think
it would be better to do it as a properly lazy interpreter though --
that way you can actually have the same memory efficiency as streaming
sha256 operators provide, even with the additional flexibility provided
by iteration/recursion/function calls.
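
The memory argument can be illustrated with Python generators, used here
as a loose analogy for lazy evaluation and not as a model of the
interpreter itself. The field generator is hypothetical; the point is
that the hash consumes each chunk as it is produced, so the full
concatenation never has to exist in memory:

```python
import hashlib

def lazy_fields(count):
    # Produces one serialised field at a time, on demand.
    for i in range(count):
        yield i.to_bytes(4, "little")

def streaming_sha256(chunks):
    # Feeds chunks into the hash incrementally; only one chunk is
    # held at a time.
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.digest()

lazy_digest = streaming_sha256(lazy_fields(1000))
eager_digest = hashlib.sha256(b"".join(lazy_fields(1000))).digest()
```

Both digests are identical; only the peak memory use differs.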

There are various other tricks that aren't done in my python testbed,
eg encoding/decoding lists as a byte stream rather than a parenthesised
string; working out whether string comparison should be normal or reversed
(so that you can compare proof-of-work) or both, providing other crypto
ops like ecdsa, doing bignum maths rather than just uint64, keeping track
of allocations when an exception occurs, providing an easy way to tell
how much computation will be required to evaluate an input script and
inflate the tx's weight correspondingly if necessary, etc.
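
On the comparison question: a small sketch of why "reversed" comparison
matters, assuming hashes and targets are equal-length byte strings
holding little-endian numbers. The function name is illustrative:

```python
def le_less_than(a, b):
    # Compare two equal-length little-endian byte strings as numbers by
    # comparing them lexicographically from the last byte backwards.
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return a[::-1] < b[::-1]

small = (5).to_bytes(32, "little")         # 05 00 00 ... 00
large = (2 ** 200).to_bytes(32, "little")  # 00 00 ... 01 ... 00

plain = small < large                 # normal bytewise comparison
reversed_cmp = le_less_than(small, large)
```

Normal comparison says `small` is not less than `large` because it only
sees the leading 0x05 byte, while the reversed comparison gives the
numerically correct answer.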

I've also only done fairly toy-level problems: factorial and fibonacci
calculations, reimplementing an existing sighash, etc. I think doing
TLUV or VAULT or graftroot should be feasible (at least given opcodes
to provide secp256k1 tweaks and deferred-checks), but haven't actually
done it.

Anyway, this seems to me to be a much more promising approach for
experimentation than trying to fit everything into script's square hole
[6], and perhaps also more promising than Simplicity for the reasons
discussed at the end of [3]. Once you have the nicer structure that a
lisp-like language provides, compared to script, I think OP_TX, OP_CAT,
OP_CSFS etc all end up working pretty great.

[6] https://twitter.com/TiredActor/status/1609641593836822530

Cheers,
aj