Return-Path: Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by lists.linuxfoundation.org (Postfix) with ESMTP id CC4ECC0881 for ; Wed, 11 Dec 2019 00:38:16 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id AE7718842A for ; Wed, 11 Dec 2019 00:38:16 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CPFjsFy2y-T2 for ; Wed, 11 Dec 2019 00:38:14 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by hemlock.osuosl.org (Postfix) with ESMTPS id 9904E879A6 for ; Wed, 11 Dec 2019 00:38:13 +0000 (UTC) Received: from mail-il1-f173.google.com (mail-il1-f173.google.com [209.85.166.173]) (authenticated bits=0) (User authenticated as jlrubin@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id xBB0cBN2009140 (version=TLSv1/SSLv3 cipher=AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 10 Dec 2019 19:38:12 -0500 Received: by mail-il1-f173.google.com with SMTP id z90so17865949ilc.8 for ; Tue, 10 Dec 2019 16:38:12 -0800 (PST) X-Gm-Message-State: APjAAAVPfcTGVB/QrdWvefAaJ9zYaae6vKNjRRuIiW8AYdkm2Ao7J8Zo gdqqgZDMn1phSmss8T5Aizy7z8PrtYtnPJPHNSk= X-Google-Smtp-Source: APXvYqxxg+BA+5vAkGeRkS23lTFIatfC6liq/94J7SEjhhrm80lfYKO5pXMNIvquWZFjzR/s5r9kgrSjYSb9hzxs0rI= X-Received: by 2002:a92:1547:: with SMTP id v68mr502594ilk.58.1576024691078; Tue, 10 Dec 2019 16:38:11 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Jeremy Date: Tue, 10 Dec 2019 16:37:59 -0800 X-Gmail-Original-Message-ID: Message-ID: To: Jeremy Content-Type: multipart/alternative; boundary="000000000000de7b9e059962d74b" X-Mailman-Approved-At: Wed, 11 Dec 2019 01:10:29 +0000 Cc: Bitcoin Protocol Discussion Subject: Re: [bitcoin-dev] BIP OP_CHECKTEMPLATEVERIFY X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Dec 2019 00:38:16 -0000 --000000000000de7b9e059962d74b Content-Type: text/plain; charset="UTF-8" Three changes I would like to make to the OP_CTV draft. I think this should put the draft in a very good place w.r.t. outstanding feedback. The changes can all be considered/merged independently, though, they are written below assuming all of them are reasonable. 1) *Make the hash commit to the INPUT_INDEX of the executing scriptPubKey.* *Motivation:* As previously specified, a CTV template constructed specifying a transaction with two or more inputs has a "half-spend" issue whereby if the template script is paid to more than once (a form of key-reuse), they can be joined in a single transaction leading to half of the intended outputs being created. *Example:* Suppose I have a UTXO with a CTV requiring two inputs. The first is set to be the CTV template, and the input has enough money to pay for all the outputs. The second input is added to allow the adding of a fee-only utxo. Now suppose someone creates an similar UTXO with this same CTV (even within the same transaction). TxA {vin: [/*elided...*/], vout: [TxOut{1 BTC, CTV}, TxOut {1 BTC, CTV}]} *Intended Behavior:* TxB0 {vin: [Outpoint{TxA.hash(), 0}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] TxB1 {vin: [Outpoint{TxA.hash(), 1}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] *Possible Unintended Behaviors:* * Half-Spend:* TxB {vin: [Outpoint{TxA.hash(), 1}, Outpoint{TxA.hash(), 0}], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] *Order-malleation:* TxB0 {vin: [/*arbitrary fee utxo*/, Outpoint{TxA.hash(), 0}], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] TxB1 {vin: [Outpoint{TxA.hash(), 1}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] With the new rule, the CTV commits to the index in the vin array that it will appear. This prevents both the half-spend issue and the order-malleation issue. Thus, the only execution possible is: *Intended Behavior:* TxB0 {vin: [Outpoint{TxA.hash(), 0}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] TxB1 {vin: [Outpoint{TxA.hash(), 1}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}] *Impact of Change:* This behavior change is minor -- in most cases we are expecting templates with a single input, so committing the input index has no effect. Only when we do specify multiple inputs, committing the INPUT_INDEX has the side effect of making reused-keys not susceptible to the "half-spend" issue. This change doesn't limit the technical capabilities of OP_CTV by much because cases where the half-spend construct is desired can be specified by selecting the correct inputs for the constituent transactions for the transaction-program. In the future, Taproot can make it easier to express contracts where the input can appear at any index by committing to a tree of positions. This change also has the benefit of reducing the miner-caused TXID malleability in certain applications (e.g., in a wallet vault you can reduce malleability from your deposit flow, preventing an exponential blow-up). However in such constructions the TXIDs are still malleable if someone decides to pay you Bitcoin that wasn't previously yours through a withdrawal path (a recoverable error, and on the bright side, someone paid you Bitcoin to do it). This change also has a minor impact on the cacheability of OP_CTV. In the reference implementation we currently precompute and store single hash for the StandardTemplateHash of the entire transaction. Making the hash vary per-input means that we would need to precompute one hash per-input, which is impractical. Given that we expect the 0-index to be the exceedingly common case, and it's not horribly expensive if we aren't cached (a constant sized SHA-256), the spec will be updated to precompute and cache only the hash for the 0th index. (The hash is also changed slightly to make it more efficient for un-cached values, as noted in change 3). *2) Remove Constexpr restriction* *Changes:* Currently it is checked that the template hash argument was not 'computed', but came from a preceding push. Remove all this logic and accept any argument. *Motivation:* I've had numerous conversations with Bitcoin developers (see above, see #bitcoin-wizards on Nov 28th 2019, in person at local meetups, and in private chats with ecosystem developers) about the constexpr restriction in OP_CTV. There have been a lot of folks asking to remove template constexpr restriction, for a few reasons: a) Parsing Simplification / no need for special-casing in optimizers like miniscript b) The types of script it disables aren't dangerous c) There are exciting things you can do were it not there and other features were enabled (OP_CAT) d) Without other features (like OP_CAT), there's not really too much you can do No one has expressed any strong justification to keep it. The main motivation for the constexpr restriction was to keep the CTV proposal very conservative in scope, increasing the likelihood that it is an acceptable change. It was also designed to be able to be easily lifted in a future soft-fork. There isn't a *specific* behavior the constexpr restriction is attempting to prevent, it's just a belt-and-suspenders measure to limit how creatively CTV could be used now or in the future. Future OpCodes + OP_CTV may introduce a broader set of functionality than possible if OP_CTV were to retain the constexpr rule. But I think given that these future op-codes will likely be introduced intentionally to introduce broader functionalities, we shouldn't limit the functionality of OP_CTV today. *Impact of Changes:* The only mildly interesting thing that could be done with this change (with no additional changes; that I could think of) would be to write a script like: SHA256 OP_CTV which would be a "self-describing" covenant (for no space savings). This could be useful in some protocols where "the public" should be able to execute some step with only chain-data. N.B. This cannot enable a case where the CTV is in the scriptSig like: scriptPubKey: CHECKSIG scriptSig: OP_SHA256 CTV because the serialization of the transaction contains a commitment to non-null scriptSigs, a self-reference/hash cycle. *3) Modify the template digest to be easier to cache and work with in script.* *Changes:* The current hash is: uint256 GetStandardTemplateHash(const CTransaction& tx) { auto h = TaggedHash("StandardTemplateHash") << tx.nVersion << tx.nLockTime << GetOutputsSHA256(tx) << GetSequenceSHA256(tx) << uint64_t(tx.vin.size()); for (const auto& in : tx.vin) h << in.scriptSig; return h.GetSHA256(); } I propose changing it to: uint256 GetStandardTemplateHash(const CTransaction& tx, uint32_t input_index) { uint256 scriptSig_hash{}; bool hash_scriptSigs = std::count(tx.vin.begin(), tx.vin.begin(), CScript()) != tx.vin().size(); if (hash_scriptSigs) { auto h = CHashWriter() for (const auto& in : tx.vin) h << in.scriptSig; scriptSig_hash = h.GetSHA256(); } auto h = CHashWriter() << tx.nVersion << tx.nLockTime; if (hash_scriptSigs) h << scriptSig_hash; h << uint64_t(tx.vin.size()) << GetSequenceSHA256(tx) << uint32_t(tx.vout.size()) << GetOutputsSHA256(tx) << input_index; return h.GetSHA256(); } This changes a few things: 1) Getting rid of the TaggedHash use 2) Re-ordering to put input_index last 3) Re-ordering to put Outputs SHA256 second-to-last 4) Only computing scriptSig_hash if any scriptSig is non-null 5) Making scriptSig data hashed in it's own hash-buffer 6) explicitly committing to the vout.size() 7) Casting vout.size() but not vin.size() to uint32_t (vout is capped by COutpoint indicies to 32 bits, vin is not) *Motivation:* The current digest algorithm is relatively arbitrarily ordered and set up. Modifying it makes it easier to cache (given the input index change) and makes it easier to construct templates in script (once OP_CAT, or OP_SUBSTR, or OP_STREAMSHA256 are added to core). *Impact of Changes:* *1) Getting rid of the TaggedHash use* Low-impact. TaggedHash didn't add any security to the template hashes, but did make it harder to "confuse" a StandardTemplateHash for a hash of another type. However, the tagged hash makes it slightly more difficult/costly to construct (with OP_CAT enabled) a template hash within script, so it is removed. *2) Re-ordering to put input_index last* The input index should be put last because this makes it easy to cache the intermediate hash state *just before* hashing the index, which makes recomputing for different indexes cheaper. It also allows (with OP_CAT or STREAMSHA256) to easily allow setting the accepted indexes from script. *3) Re-ordering to put Outputs SHA256 second-to-last* In the future, with OP_CAT/SHA256STREAM or similar, changing the outputs in the covenant is the most likely change. Placing it near the end simplifies this operation. *4) Only computing scriptSig_hash if any scriptSig is non-null* There is no need to hash the scriptSig data at all if they are all null. This is in most cases true, so we avoid extra hashing. But the bigger win is for scripted construction of templates, which can just completely ignore the scriptSig hashing if it is known to be using all bare CTV/non-p2sh segwit inputs (which should be the common case). *5) Making scriptSig data hashed in it's own hash-buffer, when hash is included.* This implies that there are two possible sizes for the hashed data, +/- 1 hash (for scripSig_hash). This eliminates concerns that directly hashing elements into the template hash buffer might expose some length extension issue when constructing a template in script. *6) explicitly committing to the vout.size()* This makes it easier, when OP_CAT or similar is added, to write restrictions that guarantee a limit on the number of inputs that may be created. *7) Casting vout.size() but not vin.size() to uint32_t (vout is capped by COutpoint indicies to 32 bits, vin is not)* This is just kind of annoying, but technically you can have more inputs in a transaction than outputs because more than 32-bits of outputs breaks the COutpoint class invariants. -- @JeremyRubin On Thu, Nov 28, 2019 at 11:59 AM Jeremy wrote: > Thanks for the feedback Russell, now and early. It deeply informed the > version I'm proposing here. > > I weighed carefully when selecting this design that I thought it would be > an acceptable tradeoff after our discussion, but I recognize this isn't > exactly what you had argued for. > > First off, with respect to the 'global state' issue, I figured it was > reasonable with this choice of constexpr rule given that a reasonable tail > recursive parser might look something like: > > parse (code : rest) stack alt_stack just_pushed = > match code with > OP_PUSH => parse rest (x:stack) alt_stack True > OP_DUP => parse rest (x:stack) alt_stack False > // ... > > So we're only adding one parameter which is a bool, and we only need to > ever set it to an exact value based on the current code path, no > complicated rules. I'm sensitive to the complexity added when formally > modeling script, but I think because it is only ever a literal, you could > re-write it as co-recursive: > > parse_non_constexpr (code : rest) stack alt_stack = > match code with > OP_PUSH => parse_constexpr rest (x:stack) alt_stack > OP_DUP => parse_non_constexpr rest (x:stack) alt_stack > // ... > > parse_constexpr (code : rest) stack alt_stack = > match code with > OP_CTV => ... > _ => parese_non_constexpr (code : rest) stack alt_stack > > > If I recall, this should help a bit with the proof automatability as it's > easier in the case by case breakdown to see the unconditional code paths. > > > In terms of upgrade-ability, one of the other reasons I liked this design > is that if we do enable OP_CTV for non-constexpr arguments, the issue > basically goes away and the OP becomes "pure" without any state tracking. > (I think the switching on argument size is much less a concern because we > already use similar upgrade mechanisms elsewhere, and it doesn't add > parsing context). > > > It's also possible, as I think *should be done* for tooling to treat an > unbalanced OP_CTV as a parsing error. This will always produce > consensus-valid scripts! However by keeping the consensus rules more > relaxed we keep our upgrade-ability paths open for OP_CTV, which as I > understand from speaking with other users is quite desirable. > > > Best (and happy thanksgiving to those celebrating), > > Jeremy > > -- > @JeremyRubin > > > > On Thu, Nov 28, 2019 at 6:33 AM Russell O'Connor > wrote: > >> Thanks for this work Jeremy. >> >> I know we've discussed this before, but I'll restate my concerns with >> adding a new "global" state variable to the Script interpreter for tracking >> whether the previous opcode was a push-data operation or not. While it >> isn't so hard to implement this in Bitcoin Core's Script interpreter, >> adding a new global state variable adds that much more complexity to anyone >> trying to formally model Script semantics. Perhaps one can argue that >> there is already (non-stack) state in Script, e.g. to deal with >> CODESEPARATOR, so why not add more? But I'd argue that we should avoid >> making bad problems worse. >> >> If we instead make the CHECKTEMPLATEVERIFY operation fail if it isn't >> preceded by (or alternatively followed by) an appropriate sized >> (canonical?) PUSHDATA constant, even in an unexecuted IF branch, then we >> can model the Script semantics by considering the >> PUSHDATA-CHECKTEMPLATEVERIFY pair as a single operation. This allows >> implementations to consider improper use of CHECKTEMPLATEVERIFY as a >> parsing error (just as today unbalanced IF-ENDIF pairs can be modeled as a >> parsing error, even though that isn't how it is implemented in Bitcoin >> Core). >> >> I admit we would lose your soft-fork upgrade path to reading values off >> the stack; however, in my opinion, this is a reasonable tradeoff. When we >> are ready to add programmable covenants to Script, we'll do so by adding >> CAT and operations to push transaction data right onto the stack, rather >> than posting a preimage to this template hash. >> >> Pleased to announce refinements to the BIP draft for >>> OP_CHECKTEMPLATEVERIFY (replaces previous OP_SECURETHEBAG BIP). Primarily: >>> >>> 1) Changed the name to something more fitting and acceptable to the >>> community >>> 2) Changed the opcode specification to use the argument off of the stack >>> with a primitive constexpr/literal tracker rather than script lookahead >>> 3) Permits future soft-fork updates to loosen or remove "constexpr" >>> restrictions >>> 4) More detailed comparison to alternatives in the BIP, and why >>> OP_CHECKTEMPLATEVERIFY should be favored even if a future technique may >>> make it semi-redundant. >>> >>> Please see: >>> BIP: https://github.com/JeremyRubin/bips/blob/ctv/bip-ctv.mediawiki >>> Reference Implementation: >>> https://github.com/JeremyRubin/bitcoin/tree/checktemplateverify >>> >>> I believe this addresses all outstanding feedback on the design of this >>> opcode, unless there are any new concerns with these changes. >>> >>> I'm also planning to host a review workshop in Q1 2020, most likely in >>> San Francisco. Please fill out the form here >>> https://forms.gle/pkevHNj2pXH9MGee9 if you're interested in >>> participating (even if you can't physically attend). >>> >>> And as a "but wait, there's more": >>> >>> 1) RPC functions are under preliminary development, to aid in testing >>> and evaluation of OP_CHECKTEMPLATEVERIFY. The new command >>> `sendmanycompacted` shows one way to use OP_CHECKTEMPLATEVERIFY. See: >>> https://github.com/JeremyRubin/bitcoin/tree/checktemplateverify-rpcs. >>> `sendmanycompacted` is still under early design. Standard practices for >>> using OP_CHECKTEMPLATEVERIFY & wallet behaviors may be codified into a >>> separate BIP. This work generalizes even if an alternative strategy is used >>> to achieve the scalability techniques of OP_CHECKTEMPLATEVERIFY. >>> 2) Also under development are improvements to the mempool which will, in >>> conjunction with improvements like package relay, help make it safe to lift >>> some of the mempool's restrictions on longchains specifically for >>> OP_CHECKTEMPLATEVERIFY output trees. See: https://github.com/bitcoin/bitcoin/pull/17268 >>> This work offers an improvement irrespective of OP_CHECKTEMPLATEVERIFY's >>> fate. >>> >>> >>> Neither of these are blockers for proceeding with the BIP, as they are >>> ergonomics and usability improvements needed once/if the BIP is activated. >>> >>> See prior mailing list discussions here: >>> >>> * >>> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-May/016934.html >>> * >>> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-June/016997.html >>> >>> Thanks to the many developers who have provided feedback on iterations >>> of this design. >>> >>> Best, >>> >>> Jeremy >>> >>> -- >>> @JeremyRubin >>> >> --000000000000de7b9e059962d74b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Three changes I would like to make to the OP_CTV draft.= I think this should put the draft in a very good place w.r.t. outstanding = feedback.

The changes can all be considered/merged independ= ently, though, they are written below assuming all of them are reasonable.<= br>


1) Make the hash commit to the INPUT_INDEX of the executing scriptPu= bKey.

Motivation: As previously specified, a= CTV template constructed specifying a transaction with two or more inputs = has a "half-spend" issue whereby if the template script is paid t= o more than once (a form of key-reuse), they can be joined in a single tran= saction leading to half of the intended outputs being created.
Example:
Suppose I have a UTXO with a CTV requiring two inputs. The first= is set to be the CTV template, and the input has enough money to pay for a= ll the outputs. The second input is added to allow the adding of a fee-only= utxo.
Now suppose someone creates= an similar UTXO with this same CTV (even within the same transaction).
=


TxA {vin: [/*eli= ded...*/], vout: [TxOut{1 BTC, <TxB template hash> CTV}, TxOut {1 BTC= , <TxB template hash> CTV}]}

Intended Behavior:
=C2=A0=C2=A0=C2=A0 TxB0 {vin: [Outpo= int{TxA.hash(), 0}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbit= rary scriptPubKey */}]
=C2=A0=C2= =A0=C2=A0 TxB1 {vin: [Outpoint{TxA.hash(), 1}, /*arbitrary fee utxo*/], vou= t : [TxOut {1 BTC, /* arbitrary scriptPubKey */}]
Possible Unintended Behaviors:
=C2=A0=C2=A0=C2=A0 Half-Spend:
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 TxB = {vin: [Outpoint{TxA.hash(), 1}, Outpoint{TxA.hash(), 0}], vout : [TxOut {1 = BTC, /* arbitrary scriptPubKey */}]
=C2=A0=C2=A0=C2=A0 <= i>Order-malleation:
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Tx= B0 {vin: [/*arbitrary fee utxo*/, Outpoint{TxA.hash(), 0}], vout : [TxOut {= 1 BTC, /* arbitrary scriptPubKey */}]
=C2=A0=C2=A0 =C2=A0 =C2=A0=C2=A0 TxB1 {vin: [Outpoint{TxA.hash(), 1}, /= *arbitrary fee utxo*/], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}= ]

With the new rule, the CTV commits to the index in the = vin array that it will appear. This prevents both the half-spend issue and = the order-malleation issue.

Thus, the only execution possible i= s:

Intended Behavior= :
=C2=A0=C2=A0=C2=A0 TxB0 {vin= : [Outpoint{TxA.hash(), 0}, /*arbitrary fee utxo*/], vout : [TxOut {1 BTC, = /* arbitrary scriptPubKey */}]
=C2= =A0=C2=A0=C2=A0 TxB1 {vin: [Outpoint{TxA.hash(), 1}, /*arbitrary fee utxo*/= ], vout : [TxOut {1 BTC, /* arbitrary scriptPubKey */}]

<= b>Impact of Change:
This b= ehavior change is minor -- in most cases we are expecting templates with a = single input, so committing the input index has no effect.

Onl= y when we do specify multiple inputs, committing the INPUT_INDEX has the si= de effect of making reused-keys not susceptible to the "half-spend&quo= t; issue.

This change doesn't limit the technical capa= bilities of OP_CTV by much because cases where the half-spend construct is = desired can be specified by selecting the correct inputs for the constituen= t transactions for the transaction-program. In the future, Taproot can make= it easier to express contracts where the input can appear at any index by = committing to a tree of positions.

This change also has the= benefit of reducing the miner-caused TXID malleability in certain applicat= ions (e.g., in a wallet vault you can reduce malleability from your deposit= flow, preventing an exponential blow-up). However in such constructions th= e TXIDs are still malleable if someone decides to pay you Bitcoin that wasn= 't previously yours through a withdrawal path (a recoverable error, and= on the bright side, someone paid you Bitcoin to do it).

This change also has a= minor impact on the cacheability of OP_CTV. In the reference implementatio= n we currently precompute and store single hash for the StandardTemplateHas= h of the entire transaction. Making the hash vary per-input means that we w= ould need to precompute one hash per-input, which is impractical. Given tha= t we expect the 0-index to be the exceedingly common case, and it's not= horribly expensive if we aren't cached (a constant sized SHA-256), the= spec will be updated to precompute and cache only the hash for the 0th ind= ex. (The hash is also changed slightly to make it more efficient for un-cac= hed values, as noted in change 3).

2) Remove Constexpr r= estriction
Changes:=
Currently it is checked that = the template hash argument was not 'computed', but came from a prec= eding push. Remove all this logic and accept any argument.
Motivation:
I've had numerous conversations with Bitcoin develop= ers (see above, see #bitcoin-wizards on Nov 28th 2019, in person at local m= eetups, and in private chats with ecosystem developers) about the constexpr= restriction in OP_CTV. There have been a lot of folks asking to remove tem= plate constexpr restriction, for a few reasons:

a) Parsing Simp= lification / no need for special-casing in optimizers like miniscript
b) The types of script it disables = aren't dangerous
c) There are = exciting things you can do were it not there and other features were enable= d (OP_CAT)
d) Without other featur= es (like OP_CAT), there's not really too much you can do

No= one has expressed any strong justification to keep it.

The= main motivation for the constexpr restriction was to keep the CTV proposal= very conservative in scope, increasing the likelihood that it is an accept= able change. It was also designed to be able to be easily lifted in a futur= e soft-fork. There isn't a specific behavior the constexpr restr= iction is attempting to prevent, it's just a belt-and-suspenders measur= e to limit how creatively CTV could be used now or in the future.
=

Future OpCodes=C2=A0+ OP_CTV may introduce a broader set of function= ality than possible if OP_CTV were to retain the constexpr rule. But I thin= k given that these future op-codes will likely be introduced intentionally = to introduce broader functionalities, we shouldn't limit the functional= ity of OP_CTV today.

Impact of Changes:

= The only mildly interesting thing that could be done with this change (with= no additional changes; that I could think of) would be to write a script l= ike:

<serialization of transaction fields according to has= h spec> SHA256 OP_CTV

which would be a "self-describing= " covenant (for no space savings). This could be useful in some protoc= ols where "the public" should be able to execute some step with o= nly chain-data.

N.B. This cannot enable a case where the CTV is= in the scriptSig like:

=
scriptPubKey: <key> CHECKSIG
<= /div>
scriptSig: <serialization of tr= ansaction details> OP_SHA256 CTV <sig>

because the ser= ialization of the transaction contains a commitment to non-null scriptSigs,= a self-reference/hash cycle.
=
3) Modify the template dig= est to be easier to cache and work with in script.
Changes:
The current hash is:
<= div style=3D"font-family:arial,helvetica,sans-serif;font-size:small;color:r= gb(0,0,0)" class=3D"gmail_default">
    uint256 GetStandardTemplateHash=
(const CTransaction& tx) {
        auto h =3D  TaggedHash("StandardTemplateHash")
            << tx.nVersion << tx.nLockTime
            << GetOutputsSHA256(tx) << GetSequenceSHA256(tx)
            << uint64_t(tx.vin.size());
        for (const auto& in : tx.vin) h << in.scriptSig;
        return h.GetSHA256();
    }

I prop=
ose changing it to:
uint256 GetStandardTemplateHash(const CTransaction& tx, uint32_t = input_index) {
        uint256 scriptSig_hash{};
bool hash_scriptSigs =3D std::count(tx.vin.begin(), tx.vin.begin(= ), CScript()) !=3D tx.vin().size();
        if (hash_scriptSigs) =
{
=C2=A0 auto h =3D CHashWriter()
for (const a= uto& in : tx.vin) h << in.scriptSig;
            sc=
riptSig_hash =3D h.GetSHA256();
        }
=C2=A0aut= o h =3D CHashWriter()
=C2=A0<< tx.nVersion
= << tx.nLockTime;
            if (hash_scriptSigs) h=
 << scriptSig_hash;

h << uint64_t(tx.vin.siz= e())
            << GetSequenceSHA256(tx)
   =
         << uint32_t(tx.vout.size())
           =C2=A0&=
lt;< GetOutputsSHA256(tx)
            << input_index=
; 
       =C2=A0return h.GetSHA256();
    }

This c=
hanges a few things:
1) Getting rid of the TaggedHash use
2) Re-ordering to put i=
nput_index last
3) Re-ordering to put Outputs SHA256 second-to-last
=
4) Only computing scriptS=
ig_hash if any scriptSig is non-null
5) Making scriptSig data hashed in it's own=
 hash-buffer
6) explicitly committing to the vout.size()
7) Casting vout.size() but not vin.size() to uint32_t = (vout is capped by COutpoint indicies to 32 bits, vin is not)
Moti= vation:
The current digest alg= orithm is relatively arbitrarily ordered and set up. Modifying it makes it = easier to cache (given the input index change) and makes it easier to const= ruct templates in script (once OP_CAT, or OP_SUBSTR, or OP_STREAMSHA256 are= added to core).

=
Impact of Changes:
1) Getting rid of the TaggedHash=
 use
=
Low-impact. TaggedHash didn't add any security to the template hashes, =
but did make it harder to "confuse" a StandardTemplateHash for a =
hash of another type.
However, the tagged hash makes it slightly more difficult/cost=
ly to construct (with OP_CAT enabled) a template hash within script, so it =
is removed. 
2) Re-ordering to put input_index last
The input index should be put last=
 because this makes it easy to cache the intermediate hash state *just befo=
re* hashing the index, which makes recomputing for different indexes cheape=
r.

It also allows (with OP_CAT or STREAMSHA256) to easily allow sett= ing the accepted indexes from script.

3) Re-ordering to put Outpu= ts SHA256 second-to-last
In the future, with OP_CAT/SHA256STREAM or similar, cha=
nging the outputs in the covenant is the most likely change. Placing it nea=
r the end simplifies this operation.

4) Only computing scriptSig_hash if any = scriptSig is non-null
There is no need to hash the scriptSig data at all if they=
 are all null. This is in most cases true, so we avoid extra hashing.
But the bi=
gger win is for scripted construction of templates, which can just complete=
ly ignore the scriptSig hashing if it is known to be using all bare CTV/non=
-p2sh segwit inputs (which should be the common case).

5) Making scriptSig da= ta hashed in it's own hash-buffer, when hash is included.

This implies t=
hat there are two possible sizes for the hashed data, +/- 1 hash (for scrip=
Sig_hash). This eliminates concerns that directly hashing elements into the=
 template hash buffer might expose some length extension issue when constru=
cting a template in script.

6) explicitly committing to the vout.size()
This m=
akes it easier, when OP_CAT or similar is added, to write restrictions that=
 guarantee a limit on the number of inputs that may be created.
<= /pre>

7) Casting vo= ut.size() but not vin.size() to uint32_t (vout is capped by COutpoint indic= ies to 32 bits, vin is not)
This is just kind of annoying, = but technically you can have more inputs in a transaction than outputs beca= use more than 32-bits of outputs breaks the COutpoint class invariants.
=




On Thu, Nov 28, 2019 at 11:59 AM Jeremy <jlrubin@mit.edu> wrote:
=
Thanks for the feedback Russell, now and early. It deeply informe= d the version I'm proposing here.

I weighed carefully when selecting this design that I thought it woul= d be an acceptable tradeoff after our discussion, but I recognize this isn&= #39;t exactly what you had argued for.

First off, with respect to the 'global state' issue, I figur= ed it was reasonable with this choice of constexpr rule given that a reason= able tail recursive parser might look something like:

=
parse (code : rest) stack alt_stack just_pushed =3D
=
=C2=A0=C2=A0=C2=A0 match code with
=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 OP_PUSH =3D> parse rest (x:stack) alt_= stack True
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 OP_D= UP =3D> parse rest (x:stack) alt_stack False
=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // ...

So we're only adding one parameter which is a bool, and we only ne= ed to ever set it to an exact value based on the current code path, no comp= licated rules. I'm sensitive to the complexity added when formally mode= ling script, but I think because it is only ever a literal, you could re-wr= ite it as co-recursive:

parse_non_constexpr (code : rest) stack alt_stack =3D
= =C2=A0=C2=A0=C2=A0 match code with
=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 OP_PUSH =3D> parse_constexpr rest (x:stack) alt_st= ack
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 OP_DUP = =3D> parse_non_constexpr rest (x:stack) alt_stack
= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // ...

parse_constexpr (code : rest) stack alt_stack=C2= =A0 =3D
=C2=A0=C2=A0=C2=A0 match code with
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 OP_CTV =3D> ...
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 _ =3D> parese_non_const= expr (code : rest) stack alt_stack

<= br>
If I recall, this should help a bit with the proof aut= omatability as it's easier in the case by case breakdown to see the unc= onditional code paths.


In terms of upgrade-ability, one of the other reasons I liked this= design is that if we do enable OP_CTV for non-constexpr arguments, the iss= ue basically goes away and the OP becomes "pure" without any stat= e tracking. (I think the switching on argument size is much less a concern = because we already use similar upgrade mechanisms elsewhere, and it doesn&#= 39;t add parsing context).


It's also possible, as I think *should be done* for toolin= g to treat an unbalanced OP_CTV as a parsing error. This will always produc= e consensus-valid scripts! However by keeping the consensus rules more rela= xed we keep our upgrade-ability paths open for OP_CTV, which as I understan= d from speaking with other users is quite desirable.


Best (and happy thanksgiving = to those celebrating),

Jeremy



Thanks for this work Jeremy.
=
I know we've discussed this before, but I= 9;ll restate my concerns with adding a new "global" state variabl= e to the Script interpreter for tracking whether the previous opcode was a = push-data operation or not.=C2=A0 While it isn't so hard to implement t= his in Bitcoin Core's Script interpreter, adding a new global state var= iable adds that much more complexity to anyone trying to formally model Scr= ipt semantics.=C2=A0 Perhaps one can argue that there is already (non-stack= ) state in Script, e.g. to deal with CODESEPARATOR, so why not add more?=C2= =A0 But I'd argue that we should avoid making bad problems worse.
=

If we instead make the CHECKTEM= PLATEVERIFY operation fail if it isn't preceded by (or alternatively fo= llowed by) an appropriate sized (canonical?) PUSHDATA constant, even in an = unexecuted IF branch, then we can model the Script semantics by considering= the PUSHDATA-CHECKTEMPLATEVERIFY pair as a single operation.=C2=A0 This al= lows implementations to consider improper use of CHECKTEMPLATEVERIFY as a p= arsing error (just as today unbalanced IF-ENDIF pairs can be modeled as a p= arsing error, even though that isn't how it is implemented in Bitcoin C= ore).

I admit we would lose your s= oft-fork upgrade path to reading values off the stack; however, in my opini= on, this is a reasonable tradeoff.=C2=A0 When we are ready to add programma= ble covenants to Script, we'll do so by adding CAT and operations to pu= sh transaction data right onto the stack, rather than posting a preimage to= this template hash.

Pleased to announce refinements to th= e BIP draft for OP_CHECKTEMPLATEVERIFY (replaces previous OP_SECURETHEBAG B= IP). Primarily:

1) Changed the name = to something more fitting and acceptable to the community
2) Changed the opcode specification to use the argument off of the stac= k with a primitive constexpr/literal tracker rather than script lookahead
3) Permits future soft-fork updates to loosen or remove &qu= ot;constexpr" restrictions
4) More detailed compariso= n to alternatives in the BIP, and why OP_CHECKTEMPLATEVERIFY should be favo= red even if a future technique may make it semi-redundant.

Please see:
BIP: https://github.com/JeremyRubin/bips/blob/ctv/bip-ctv.mediawiki
I believe this addresses all outstanding feedback on th= e design of this opcode, unless there are any new concerns with these chang= es.

I'm also planning to hos= t a review workshop in Q1 2020, most likely in San Francisco. Please fill o= ut the form here https://forms.gle/pkevHNj2pXH9MGee9 if you're interested in= participating (even if you can't physically attend).

And as a "but wait, there's more":<= br>

1) RPC functions are under preli= minary development, to aid in testing and evaluation of OP_CHECKTEMPLATEVER= IFY. The new command `sendmanycompacted` shows one way to use OP_CHECKTEMPL= ATEVERIFY. See: https://github.com/JeremyRubin/bitc= oin/tree/checktemplateverify-rpcs. `sendmanycompacted` is still under e= arly design. Standard practices for using OP_CHECKTEMPLATEVERIFY & wall= et behaviors may be codified into a separate BIP. This work generalizes eve= n if an alternative strategy is used to achieve the scalability techniques = of OP_CHECKTEMPLATEVERIFY.
2) Also under development a= re improvements to the mempool which will, in conjunction with improvements= like package relay, help make it safe to lift some of the mempool's re= strictions on longchains specifically for OP_CHECKTEMPLATEVERIFY output tre= es. See: https://github.com/bitcoin/bitcoin/pull/17268 This work off= ers an improvement irrespective of OP_CHECKTEMPLATEVERIFY's fate.


Neither of thes= e are blockers for proceeding with the BIP, as they are ergonomics and usab= ility improvements needed once/if the BIP is activated.

=

Thanks to the many developers who ha= ve provided feedback on iterations of this design.
Best,

Jeremy
--000000000000de7b9e059962d74b--