Date: Tue, 1 Feb 2022 11:16:39 +1000
From: Anthony Towns
To: Russell O'Connor, Bitcoin Protocol Discussion
Message-ID: <20220201011639.GA4317@erisian.com.au>
References: <20220128013436.GA2939@erisian.com.au>
Subject: Re: [bitcoin-dev] TXHASH + CHECKSIGFROMSTACKVERIFY in lieu of CTV and ANYPREVOUT
List-Id: Bitcoin Protocol Discussion

On Fri, Jan 28, 2022 at 08:56:25AM -0500, Russell O'Connor via bitcoin-dev wrote:
> > https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019243.html
> For more complex interactions, I was imagining combining this TXHASH
> proposal with CAT and/or rolling SHA256 opcodes. If TXHASH ended up
> supporting relative or absolute input/output indexes then users could
> assemble the hashes of the particular inputs and outputs they care about
> into a single signed message.

That's certainly possible, but it sure seems overly complicated and
error prone...

> > While I see the appeal of this from a language design perspective;
> > I'm not sure it's really the goal we want. When I look at bitcoin's
> > existing script, I see a lot of basic opcodes to do simple arithmetic and
> > manipulate the stack in various ways, but the opcodes that are actually
> > useful are more "do everything at once" things like check(multi)sig or
> > sha256. It seems like what's most useful on the blockchain is a higher
> > level language, rather than more of blockchain assembly language made
> > up of small generic pieces. I guess "program their own use cases from
> > components" seems to be coming pretty close to "write your own crypto
> > algorithms" here...
> Which operations in Script are actually composable today?

> There is one aspect of Bitcoin Script that is composable, which is
> (monotone) boolean combinations of the few primitive transaction conditions
> that do exist. The miniscript language captures nearly the entirety of
> what is composable in Bitcoin Script today: which amounts to conjunctions,
> disjunctions (and thresholds) of signatures, locktimes, and revealing hash
> preimages.

Yeah; I think miniscript captures everything bitcoin script is actually
useful for today, and if we were designing bitcoin from scratch and had
known that was the feature set we were going to end up with, we'd have
come up with something simpler and a fair bit more high level than
bitcoin script for the interpreter.

> I don't think there is much in the way of lessons to be drawn from how we
> see Bitcoin Script used today with regards to programs built out of
> reusable components.

I guess I think one conclusion we should draw is some modesty in how
good we are at creating general reusable components. That is, bitcoin
script looks a lot like a relatively general expression language,
that should allow you to write interesting things; but in practice a
lot of it was buggy (OP_VER hardforks and resource exhaustion issues),
or not powerful enough to actually be interesting, or too complicated
to actually get enough use out of [0].

> TXHASH + CSFSV won't be enough by itself to allow for very interesting
> programs Bitcoin Script yet, we still need CAT and friends for that,

"CAT" and "CHECKSIGFROMSTACK" are both things that have been available in
elements for a while; has anyone managed to build anything interesting
with them in practice, or are they only useful for thought experiments
and blog posts? To me, that suggests that while they're useful for
theoretical discussion, they don't turn out to be a good design in
practice.

> but
> CSFSV is at least a step in that direction. CSFSV can take arbitrary
> messages and these messages can be fixed strings, or they can be hashes of
> strings (that need to be revealed), or they can be hashes returned from
> TXHASH, or they can be locktime values, or they can be values that are
> added or subtracted from locktime values, or they can be values used for
> thresholds, or they can be other pubkeys for delegation purposes, or they
> can be other signatures ... for who knows what purpose.

I mean, if you can't even think of a couple of uses, that doesn't seem
very interesting to pursue in the near term? CTV has something like half
a dozen fairly near-term use cases, but obviously those can all be done
just with TXHASH without a need for CSFS, and likewise all the ANYPREVOUT
things can obviously be done via CHECKSIG without either TXHASH or CSFS...

To me, the point of having CSFS (as opposed to CHECKSIG) seems to be
verifying that an oracle asserted something; but for really simple boolean
decisions, doing that via a DLC seems better in general since that moves
more of the work off-chain; and for the case where the signature is being
used to authenticate input into the script rather than just gating a path,
that feels a bit like a weaker version of graftroot?

I guess I'd still be interested in the answer to:

> > If we had CTV, POP_SIGDATA, and SIGHASH_NO_TX_DATA_AT_ALL but no OP_CAT,
> > are there any practical use cases that wouldn't be covered that having
> > TXHASH/CAT/CHECKSIGFROMSTACK instead would allow? Or where those would
> > be significantly more convenient/efficient?
> >
> > (Assume "y x POP_SIGDATA POP_SIGDATA p CHECKSIGVERIFY q CHECKSIG"
> > commits to a vector [x,y] via p but does not commit to either via q so
> > that there's some "CAT"-like behaviour available)

TXHASH seems to me to be clearly the more flexible opcode compared to
CTV; but maybe all that flexibility is wasted, and all the real use cases
actually just want CHECKSIG or CTV? I'd feel much better having some idea
of what the advantage of being flexible there is...

But all that aside, probably the real question is can we simplify CTV's
transaction message algorithm, if we assume APO is enabled simultaneously?
If it doesn't get simplified and needs its own hashing algorithm anyway,
that would probably be a good reason to keep them separate.

First, since ANYPREVOUT commits to the scriptPubKey, you'd need to use
ANYPREVOUTANYSCRIPT for CTV-like behaviour.

ANYPREVOUTANYSCRIPT is specced as committing to:

  nVersion
  nLockTime
  nSequence
  spend_type and annex present
  sha_annex (if present)
  sha_outputs (ALL) or sha_single_output (SINGLE)
  key_version
  codesep_pos

CTV commits to:

  nVersion
  nLockTime
  scriptSig hash "(maybe!)"
  input count
  sequences hash
  output count
  outputs hash
  input index

(CTV thus allows annex malleability, since it neither commits to the
annex nor forbids inclusion of an annex)

"output count" and "outputs hash" would both be covered by sha_outputs
with ANYPREVOUTANYSCRIPT|ALL.

I think "scriptSig hash" is only covered to avoid txid malleability; but
just adjusting your protocol to use APO signatures instead of relying on
the txid of future transactions also solves that problem.

I believe "sequences hash", "input count" and "input index" are all an
important part of ensuring that if you have two UTXOs distributing 0.42
BTC to the same set of addresses via CTV, that you can't combine them in
a single transaction and end up losing one of the UTXOs to fees. I don't
believe there's a way to resolve that with bip 118 alone, however that
does seem to be a similar problem to the one that SIGHASH_GROUP tries
to solve.

SIGHASH_GROUP [1] would be an alternative to ALL/SINGLE/NONE, with the
exact group of outputs being committed to determined via the annex.
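
To make the two-UTXO problem above concrete, here's a rough Python
sketch. It is a toy model, not the bip 119 serialization:
toy_template_hash, its field layout and the commit_to_inputs flag are
all made up for illustration, and amounts are in satoshis. It only shows
why a template hash that ignores the input count lets two identical
commitments be satisfied by a single set of outputs.

```python
import hashlib
import struct

# Hypothetical payout shared by both UTXOs, in satoshis (0.21/0.10/0.10 BTC).
PAYOUT = [("A", 21_000_000), ("B", 10_000_000), ("C", 10_000_000)]
UTXO_VALUE = 42_000_000  # 0.42 BTC per input


def toy_template_hash(n_inputs, input_index, outputs, commit_to_inputs):
    """Toy stand-in for a CTV-style hash; NOT the bip 119 serialization."""
    msg = b""
    if commit_to_inputs:
        # Assumed layout: input count followed by this input's index.
        msg += struct.pack("<II", n_inputs, input_index)
    msg += struct.pack("<I", len(outputs))
    for name, amount in outputs:
        msg += hashlib.sha256(name.encode()).digest()
        msg += struct.pack("<q", amount)
    return hashlib.sha256(msg).digest()


def combined_spend_allowed(commit_to_inputs):
    """Can both UTXOs be spent in ONE tx that pays PAYOUT only once?"""
    # Each UTXO was created expecting to be the sole input (count 1, index 0).
    committed = toy_template_hash(1, 0, PAYOUT, commit_to_inputs)
    return all(
        toy_template_hash(2, index, PAYOUT, commit_to_inputs) == committed
        for index in range(2)
    )


def combined_fee():
    """Fee paid if the combined two-input spend were accepted."""
    return 2 * UTXO_VALUE - sum(amount for _, amount in PAYOUT)
```

In this model combined_spend_allowed(False) holds, and the combined tx
would pay 0.43 BTC in fees rather than 0.01; once the hash covers the
input count and index, the same two-input tx no longer matches either
commitment.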

ANYPREVOUTANYSCRIPT|GROUP would commit to:

  nVersion
  nLockTime
  nSequence
  spend_type and annex present
  sha_annex (if present)
  sha_group_outputs (GROUP)
  key_version
  codesep_pos

So in that case if you have your two inputs:

  0.42  [pays 0.21 to A, 0.10 to B, 0.10 to C]
  0.42  [pays 0.21 to A, 0.10 to B, 0.10 to C]

then, either:

  a) if they're both committed with GROUP and sig_group_count = 3, then
     the outputs must be [0.21 A, 0.10 B, 0.10 C, 0.21 A, 0.10 B, 0.10 C],
     and you don't lose funds

  b) if they're both committed with GROUP and the first is
     sig_group_count=3 and the second is sig_group_count=0, then the
     outputs can be [0.21 A, 0.10 B, 0.10 C, *anything] -- but in that
     case the second input is already signalling that it's meant to be
     paired with another input to fund the same three outputs, so any
     funds loss is at least intentional

Note that this means txids are very unstable: if a tx is only protected
by SIGHASH_GROUP commitments then miners/relayers can add outputs, or
reorganise the groups without making the tx invalid. Beyond requiring
the signatures to be APO/APOAS-based to deal with that, we'd also need
to avoid txs getting rbf-pinned by some malicious third party who pulls
apart the groups and assembles a new tx that's hard to rbf but also
unlikely to confirm due to having a low feerate.

Note also that not reusing addresses solves this case -- it's only a
problem when you're paying the same amounts to the same addresses.

Being able to combine additional inputs and outputs at a later date
(which necessarily changes the txid) is an advantage though: it lets you
add additional funds and claim change, which allows you to adjust to
different fee rates.

I don't think the SIGHASH_GROUP approach would work very well without
access to the annex, ie if you're trying to do CTV encoded either in a
plain scriptPubKey or via segwit/p2sh.
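
Cases (a) and (b) above can be sketched roughly as follows. This is a
simplified reading of the grouping rule, not the actual SIGHASH_GROUP
spec: I'm assuming a non-zero sig_group_count claims the next that-many
outputs, and a count of 0 reuses the previous input's group. The names
assign_groups and tx_satisfies are hypothetical, and comparing output
lists directly stands in for checking sha_group_outputs.

```python
from typing import List, Optional, Tuple

Output = Tuple[str, int]  # (address label, amount in satoshis)

PAYOUT: List[Output] = [("A", 21_000_000), ("B", 10_000_000), ("C", 10_000_000)]


def assign_groups(
    sig_group_counts: List[int], outputs: List[Output]
) -> Optional[List[List[Output]]]:
    """Map each input to its output group, or None if the tx is malformed."""
    groups: List[List[Output]] = []
    pos = 0
    for count in sig_group_counts:
        if count == 0:
            if not groups:
                return None  # no earlier input whose group could be shared
            groups.append(groups[-1])  # assumed: share the previous group
        else:
            if pos + count > len(outputs):
                return None  # not enough outputs left for this group
            groups.append(outputs[pos:pos + count])
            pos += count
    return groups


def tx_satisfies(
    sig_group_counts: List[int],
    committed: List[List[Output]],
    outputs: List[Output],
) -> bool:
    """True if every input's group equals the outputs it committed to."""
    groups = assign_groups(sig_group_counts, outputs)
    if groups is None:
        return False
    return all(group == want for group, want in zip(groups, committed))
```

With both inputs at sig_group_count = 3, a tx carrying only one copy of
PAYOUT is rejected because the second group can't be filled; with counts
[3, 0] the same three outputs satisfy both inputs and any extra outputs
are unconstrained.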

I think adding GROUP would give 16 different sighashes, choosing one of
four options for outputs,

  ALL/NONE/SINGLE/GROUP
    -- which outputs are committed to

and one of four options for inputs,

  -/ANYONECANPAY/ANYPREVOUT/ANYPREVOUTANYSCRIPT
    -- all inputs committed to, specific input committed to,
       scriptpubkey/tapscript committed to, or just the
       nseq/annex/codesep_pos

vs the ~155,000 sighashes in the TXHASH proposal.

I don't think there's an efficient way of doing SIGHASH_GROUP via tx
introspection opcodes that doesn't also introduce a quadratic hashing
risk -- you need to prevent different inputs from re-hashing distinct but
overlapping sets of outputs, and if your opcodes only allow grabbing one
output at a time to add to the message being signed you have to do a lot
of coding if you want to let the signer choose how many outputs to commit
to; if you provide an opcode that grabs many outputs to hash, it seems
hard to do that generically in a way that avoids quadratic behaviour.

So I think that suggests two alternative approaches, beyond the
VERIFY-vs-PUSH semantic:

 - have a dedicated sighash type for CTV (either an explicit one for it,
   per bip119, or support thousands of options like the proposal in this
   thread, one of which happens to be about the same as the bip119 idea)

 - use ANYPREVOUTANYSCRIPT|GROUP for CTV, which means also implementing
   annex parsing and better RBF behaviour to avoid those txs being
   excessively vulnerable to pinning; with the advantage being that
   txs using "GROUP" sigs can be combined either for batching purposes
   or for adapting to the fee market after the signature has been made,
   and the disadvantage that you can't rely on stable txids when looking
   for CTV spends and have to continue using APO/APOAS when chaining
   signatures on top of unconfirmed CTV outputs

Cheers,
aj

[0] Here's bitmatrix trying to multiply two numbers together:
    https://medium.com/bit-matrix/technical-how-does-bitmatrix-v1-multiply-two-integers-in-the-absence-of-op-mul-a58b7a3794a3

    Likewise, doing a point preimage reveal via clever scripting
    pre-taproot never saw an implementation, despite seeming
    theoretically plausible.
    https://lists.linuxfoundation.org/pipermail/lightning-dev/2015-November/000344.html

[1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019243.html