From: Steven Roose
To: Steven Roose via bitcoin-dev
Date: Fri, 6 Oct 2023 18:38:22 +0100
Subject: Re: [bitcoin-dev] Draft BIP: OP_TXHASH and OP_CHECKTXHASHVERIFY
Message-ID: <211ab58c-707e-97bb-241b-6fe809fd2bdb@roose.io>
List-Id: Bitcoin Protocol Discussion

I updated the draft BIP with a proposed reference implementation and a link to an implementation of a caching strategy.

It shows that it's possible to implement TXHASH such that, after each large tx element (scripts, annexes) has been hashed exactly once, invocations of TXHASH have clear constant upper limits on the number of bytes hashed.

The draft BIP is linked in the quoted e-mail below; the cache implementation is here: https://github.com/stevenroose/rust-bitcoin/blob/txhash/bitcoin/src/blockdata/script/txhash.rs
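
To illustrate the bound, here is a minimal Python sketch. It is not the linked Rust implementation. It assumes that large elements enter the final hash only through their 32-byte SHA256 digest; the class name `ElementHashCache` and its byte counter are hypothetical and exist only to make the cost visible.

```python
import hashlib


class ElementHashCache:
    """Hypothetical cache: each large element is SHA256-hashed at most once."""

    def __init__(self, elements):
        self.elements = list(elements)  # e.g. scripts or annexes, as raw bytes
        self.digests = {}               # element index -> cached 32-byte digest
        self.bytes_hashed = 0           # total bytes fed to SHA256 so far

    def digest(self, index):
        # Hash the full element only on first use, then reuse the digest.
        if index not in self.digests:
            data = self.elements[index]
            self.bytes_hashed += len(data)
            self.digests[index] = hashlib.sha256(data).digest()
        return self.digests[index]

    def commit(self, indices):
        # The commitment hashes fixed-size digests, not the elements themselves.
        h = hashlib.sha256()
        for i in indices:
            d = self.digest(i)
            self.bytes_hashed += len(d)
            h.update(d)
        return h.digest()


cache = ElementHashCache([b"\x51" * 10_000, b"\x52" * 10_000])
first = cache.commit([0, 1])
cost_first = cache.bytes_hashed                 # 20_064: both elements plus two digests
second = cache.commit([0, 1])
cost_second = cache.bytes_hashed - cost_first   # 64: only the two cached digests
```

Under these assumptions the second invocation hashes 32 bytes per selected element, regardless of how large the elements are.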


On 9/30/23 12:44, Steven Roose via bitcoin-dev wrote:

Hey all


The idea of TXHASH has been around for a while, but AFAIK it was never formalized. After conversations with Russell, I worked on a specification and have been gathering some feedback in the last several weeks.

I think the draft is in a state where it's ready for wider feedback and at the same time I'm curious about the sentiment in the community about this idea.

The full BIP text can be found in the attachment as well as at the following link:
https://github.com/bitcoin/bips/pull/1500

I will summarize it here.

What does the BIP specify?

  • The concept of a TxFieldSelector, a serialized data structure for selecting data inside a transaction.
    • The following global fields are available:
      • version
      • locktime
      • number of inputs
      • number of outputs
      • current input index
      • current input control block
    • For each input, the following fields are available:
      • previous outpoint
      • sequence
      • scriptSig
      • scriptPubkey of spending UTXO
      • value of spending UTXO
      • taproot annex
    • For each output, the following fields are available:
      • scriptPubkey
      • value
    • There is support for selecting inputs and outputs as follows:
      • all in/outputs
      • a single in/output at the same index as the input being executed
      • any number of leading in/outputs up to 2^14 - 1 (16,383)
      • up to 64 individually selected in/outputs (at indices up to 2^16, or 65,536)
    • The empty byte string is supported and functions as a default value which commits to everything except the previous outpoints and the scriptPubkeys of the spending UTXOs.
  • An opcode OP_TXHASH, enabled only in tapscript, that takes a serialized TxFieldSelector from the stack and pushes on the stack a hash committing to all the data selected.
  • An opcode OP_CHECKTXHASHVERIFY, enabled in all script contexts, that expects a single item on the stack, interpreted as a 32-byte hash value concatenated with (at the end) a serialized TxFieldSelector. Execution fails if the hash value in the data push doesn't equal the calculated hash value based on the TxFieldSelector.
  • Resource usage considerations that try to address concerns around quadratic hashing. A potential caching strategy is outlined so that users can't trigger excessive hashing.
    • Individual selection is limited to 64 items.
    • Selecting "all" in/outputs can mostly use the same caches as sighash calculations.
    • For prefix hashing, intermediate SHA256 contexts can be stored every N items so that at most N-1 items have to be hashed when called repeatedly.
    • In non-tapscript contexts, at least 32 witness bytes are required and because (given the lack of OP_CAT) subsequent calls can only re-enforce the same TxFieldSelector, no additional limitations are put in place.
    • In tapscript, because OP_TXHASH doesn't require 32 witness bytes and because of a potential addition of operations like OP_CAT, the validation budget is decreased by 10 for every OP_TXHASH or OP_CHECKTXHASHVERIFY operation.
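
The prefix-hashing point above can be sketched as follows. This is a minimal illustration in Python, not the BIP's reference code: the class `PrefixHashCache`, the parameter `step` (the "N" from the summary), and the use of fixed-size per-item digests are all assumptions made for the example.

```python
import hashlib


class PrefixHashCache:
    """Hypothetical cache storing a SHA256 context every `step` items."""

    def __init__(self, items, step=4):
        self.items = list(items)                  # per-item digests (assumed fixed-size)
        self.step = step
        self.checkpoints = {0: hashlib.sha256()}  # items absorbed -> saved context
        self.items_hashed = 0                     # counts every item fed to SHA256

    def _extend_checkpoints(self, target):
        # Continue from the highest stored checkpoint up to `target`,
        # saving a copy of the context at every multiple of `step`.
        top = max(self.checkpoints)
        ctx = self.checkpoints[top].copy()
        while top < target:
            ctx.update(self.items[top])
            self.items_hashed += 1
            top += 1
            if top % self.step == 0:
                self.checkpoints[top] = ctx.copy()

    def prefix_hash(self, count):
        """Return SHA256 over the first `count` items."""
        if not 0 <= count <= len(self.items):
            raise ValueError("count out of range")
        target = count - count % self.step        # nearest checkpoint at or below count
        self._extend_checkpoints(target)
        ctx = self.checkpoints[target].copy()
        for item in self.items[target:count]:     # at most step - 1 items
            ctx.update(item)
            self.items_hashed += 1
        return ctx.digest()


items = [hashlib.sha256(bytes([i])).digest() for i in range(10)]
cache = PrefixHashCache(items, step=4)
h10 = cache.prefix_hash(10)
cost_first = cache.items_hashed                # 10: every item is hashed once
h7 = cache.prefix_hash(7)
cost_repeat = cache.items_hashed - cost_first  # 3: items 4..6, resumed from checkpoint 4
```

Once the checkpoints exist, any later prefix request resumes from a stored context and hashes at most `step - 1` items, which is the bound stated in the summary.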


What does this achieve?

  • Since the default TxFieldSelector is functionally equivalent to OP_CHECKTEMPLATEVERIFY, with no extra bytes required, this proposal is a strict upgrade of BIP-119.
  • The flexibility of selecting transaction fields and in/output (ranges) makes this construction far more useful
    • when designing protocols where users want to be able to add fees to their transactions without breaking a transaction chain;
    • when designing protocols where users construct transactions together, each providing some of their own in- and outputs and wanting to enforce conditions only on these in/outputs.
  • OP_TXHASH, together with OP_CHECKSIGFROMSTACK (and maybe OP_CAT*) could be used as a replacement for almost arbitrarily complex sighash constructions, like SIGHASH_ANYPREVOUT.
  • Apart from being able to enforce specific fields in the transaction to have a pre-specified value, equality can also be enforced, which can, for example, remove the need for opcodes like OP_IN_OUT_VALUE.*
  • The same TxFieldSelector construction would function equally well with a hypothetical OP_TX opcode that directly pushes the selected fields on the stack to enable direct introspection.
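
The equality point can be illustrated with a toy model in Python. Everything here is an assumption for the example: `toy_txhash` is a stand-in that hashes a single 8-byte amount, the selector bytes are made up, and the `commit_selector` flag models a hypothetical variant in which the selector itself is part of the hashed data. The real serialization is defined in the BIP.

```python
import hashlib
import struct


def toy_txhash(value_sats, selector=b"", commit_selector=False):
    """Toy stand-in for TXHASH over a single amount field (not the BIP's format)."""
    preimage = struct.pack("<q", value_sats)   # 8-byte little-endian amount
    if commit_selector:
        preimage = selector + preimage         # hypothetical: selector is hashed too
    return hashlib.sha256(preimage).digest()


INPUT_VALUE_SELECTOR = b"\x01"    # made-up selector bytes
OUTPUT_VALUE_SELECTOR = b"\x02"   # made-up selector bytes

# Selector not committed: equal values give equal hashes, so a script can
# compare two hashes to check that an input and an output carry the same amount.
in_hash = toy_txhash(50_000, INPUT_VALUE_SELECTOR)
out_hash = toy_txhash(50_000, OUTPUT_VALUE_SELECTOR)
equal_without_selector = in_hash == out_hash        # True

# Selector committed: the same amounts no longer hash to the same value.
in_hash_c = toy_txhash(50_000, INPUT_VALUE_SELECTOR, commit_selector=True)
out_hash_c = toy_txhash(50_000, OUTPUT_VALUE_SELECTOR, commit_selector=True)
equal_with_selector = in_hash_c == out_hash_c       # False
```

In this model, comparing hashes only works as an equality check when the hash depends on the selected data alone.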


What are still open questions?

  • Does the proposal sufficiently address concerns around resource usage and quadratic hashing?
  • *: Miraculously, once we came up with all possible fields that we might consider interesting, we filled exactly 16 spots. There is however one thing that I would have liked to be optionally available, and I am unsure which side to take in the proposal: including the TxFieldSelector as part of the hash. Doing so would make the hash represent not only the value being hashed but also the field selector that was used, so it would no longer be possible to prove equality of fields. If a txhash as specified here were ever used as a signature hash, the selector would definitely have to be included, but this could be done after the fact if OP_CAT were available. For signature hashes, the hash should ideally be tagged somehow, so we might need OP_CAT, OP_CATSHA256 or something similar anyway.
  • A solution could be to take an additional bit from each of the two "in/output selector" bytes, and assign to this bit "commit to total number of in/outputs" (instead of having 2 bits for this in the first byte).
    • This would free up 2 bits in the first byte, one of which could be used for including the TxFieldSelector in the hash, while the other could be left free (OP_SUCCESS) to potentially revisit later.
    • This would limit the number of selectable leading in/outputs to 8,191 and the number of individually selectable in/outputs to 32, both of which seem reasonable or maybe even more desirable from a resource usage perspective.
  • General feedback on how people feel about a proposal like this, which could either be implemented in a softfork as is, like BIP-119, or be combined in a single softfork with OP_CHECKSIGFROMSTACK and perhaps OP_CAT, OP_TWEAKADD and/or a hypothetical OP_TX.
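
The limits quoted in the bit-reassignment idea above follow from simple arithmetic. The short Python sketch below assumes, based only on the figures in this e-mail, that the leading-count field currently has 14 bits and the individual-selection count corresponds to 6 bits; the function names are hypothetical.

```python
def max_leading(bits):
    # Largest count expressible in `bits` bits.
    return 2**bits - 1


def max_individual(bits):
    # Number of individually selectable in/outputs for a `bits`-bit budget.
    return 2**bits


current = (max_leading(14), max_individual(6))   # (16383, 64)
reduced = (max_leading(13), max_individual(5))   # (8191, 32)
```

Taking one bit from each selector byte halves both ranges, which matches the 8,191 and 32 mentioned above.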


This work is just an attempt to turn some of the ideas that have been floating around into a concrete proposal. If there is community interest, I would be willing to spend time to adequately formalize this BIP and to work on an implementation for Bitcoin Core.


Looking forward to your thoughts

Steven