From: Antoine Riard
Date: Sun, 6 Mar 2022 19:59:33 -0500
To: Jeremy Rubin, Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] Annex Purpose Discussion: OP_ANNEX, Turing Completeness, and other considerations

Hi Jeremy,

> I've seen some discussion of what the Annex can be used for in Bitcoin. For
> example, some people have discussed using the annex as a data field for
> something like CHECKSIGFROMSTACK type stuff (additional authenticated data)
> or for something like delegation (the delegation is to the annex). I think
> before devs get too excited, we should have an open discussion about what
> this is actually for, and figure out if there are any constraints to using
> it however we may please.

I think one interesting purpose of the annex is to serve as a transaction field extension, where we assign new consensus validity rules to the annex payloads.

One could think about new types of locks, e.g. where a transaction cannot be included while the annex payload value is greater than the chain's `ChainWork`. This could be useful in case of contentious forks, where you want your transaction to confirm only when enough work is accumulated, and height isn't a reliable indicator anymore.
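
A rough sketch of what such a chain-work lock check might look like. The function name and the big-endian integer encoding of the payload are assumptions made for illustration only; no such consensus rule exists today.

```python
def chainwork_lock_satisfied(annex_payload: bytes, chain_work: int) -> bool:
    """Hypothetical rule: the transaction is includable only once the
    accumulated chain work reaches the threshold carried in the annex."""
    # Assumption: the payload encodes an unsigned big-endian integer.
    required_work = int.from_bytes(annex_payload, "big")
    return chain_work >= required_work


threshold = (2**80).to_bytes(32, "big")
print(chainwork_lock_satisfied(threshold, 2**79))  # not enough work yet
print(chainwork_lock_satisfied(threshold, 2**81))  # enough work accumulated
```

Unlike a height-based lock, the comparison depends only on accumulated work, which is the property that matters on a contentious fork.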

Or a relative timelock where the endpoint is the presence of a state number encumbering the spent transaction. This could be useful in the context of payment pools, where the user withdrawal transactions are all encumbered by a BIP68 relative timelock, as you don't know which one is going to confirm first, but where you don't care about enforcing the timelocks once the contestation delay has elapsed once and no higher-state update transaction has confirmed.
Of course, we could reuse the nSequence field for some of those new types of locks, though we would lose the flexibility of combining multiple locks encumbering the same input.

Another use for the annex is to hold the SIGHASH_GROUP group count value. One advantage over placing the value as a script stack item could be validity interdependency between annex payloads, where other annex payloads reuse the group count value as part of their own semantics.

Antoine

On Fri, Mar 4, 2022 at 18:22, Jeremy Rubin via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
I've seen some discussion of what the Annex can be used for in Bitcoin. For example, some people have discussed using the annex as a data field for something like CHECKSIGFROMSTACK type stuff (additional authenticated data) or for something like delegation (the delegation is to the annex). I think before devs get too excited, we should have an open discussion about what this is actually for, and figure out if there are any constraints to using it however we may please.

The BIP is tight-lipped about its purpose, saying mostly only:

*What is the purpose of the annex? The annex is a reserved space for future extensions, such as indicating the validation costs of computationally expensive new opcodes in a way that is recognizable without knowing the scriptPubKey of the output being spent. Until the meaning of this field is defined by another softfork, users SHOULD NOT include annex in transactions, or it may lead to PERMANENT FUND LOSS.*

*The annex (or the lack of thereof) is always covered by the signature and contributes to transaction weight, but is otherwise ignored during taproot validation.*

*Execute the script, according to the applicable script rules[11], using the witness stack elements excluding the script s, the control block c, and the annex a if present, as initial stack.*

Essentially, I read this as saying: The annex is the ability to pad a transaction with an additional string of 0's that contribute to the virtual weight of a transaction, but have no validation cost themselves. Therefore, somehow, if you needed to validate more signatures than 1 per 50 virtual weight units, you could add padding to buy extra gas. Or, we might somehow make the witness a small language (e.g., run-length-encoded zeros) such that we can very quickly compute an equivalent number of zeros to 'charge' without actually consuming the space but still consuming a linearizable resource... or something like that. We might also, e.g., want to use the annex to reserve something else, like the amount of memory. In general, we are using the annex to express a resource constraint efficiently. This might be useful for, e.g., Simplicity one day.
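
To make the "padding to buy extra gas" reading concrete, here is a small sketch based on the BIP342 sigops budget, which allots 50 units plus the serialized size of the input's witness in bytes and charges 50 units per executed signature check. The function name is made up for illustration, and the sketch ignores the few bytes of overhead the annex element itself adds (its length prefix and 0x50 tag), which also count toward the witness size.

```python
SIGOP_COST = 50    # budget units charged per executed signature check (BIP342)
BASE_BUDGET = 50   # free allowance added to the witness size (BIP342)


def annex_padding_needed(witness_size: int, num_sig_checks: int) -> int:
    """Extra witness bytes needed so the budget covers every signature check."""
    budget = BASE_BUDGET + witness_size
    cost = SIGOP_COST * num_sig_checks
    return max(0, cost - budget)


# A 200-byte witness that must run 10 signature checks is 250 units short.
print(annex_padding_needed(200, 10))
# A 200-byte witness with 2 signature checks needs no padding at all.
print(annex_padding_needed(200, 2))
```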

Generating an Annex: One should write a tracing executor for a script, run it, measure the resource costs, and then generate an annex that captures any externalized costs.
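
A toy sketch of that workflow, assuming the tracing executor has already produced the list of opcodes that actually ran. The cost table and the annex payload encoding are hypothetical; only the 0x50 tag byte comes from BIP341.

```python
# Hypothetical per-opcode costs; real values would be set by whichever
# softfork gives the annex its meaning.
OPCODE_COST = {"OP_CHECKSIG": 50, "OP_CHECKSIGADD": 50, "OP_SHA256": 1}


def trace_cost(executed_ops: list[str]) -> int:
    """Sum the cost of every opcode recorded in an execution trace."""
    return sum(OPCODE_COST.get(op, 0) for op in executed_ops)


def generate_annex(executed_ops: list[str]) -> bytes:
    """Encode the measured cost as an annex.

    0x50 is the annex tag byte from BIP341; the 4-byte little-endian
    cost field after it is an assumption made for this sketch.
    """
    return b"\x50" + trace_cost(executed_ops).to_bytes(4, "little")


trace = ["OP_CHECKSIG", "OP_SHA256", "OP_CHECKSIG"]
print(generate_annex(trace).hex())
```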
-------------------

Introducing OP_ANNEX: Suppose there were some sort of annex-pushing opcode, OP_ANNEX, which puts the annex on the stack as well as a 0 or 1 (to differentiate an annex of 0 from no annex; e.g., 0 1 means the annex was 0 and 0 0 means no annex). This would be equivalent to something based on <annex flag> OP_TXHASH <has annex flag> OP_TXHASH.

Now suppose that I have a computation that I am running in a script as follows:

OP_ANNEX
OP_IF
    `some operation that requires annex to be <1>`
OP_ELSE
    OP_SIZE
    `some operation that requires annex to be len(annex) + 1 or does a checksig`
OP_ENDIF

Now every time you run this, it requires one more resource unit than the last time you ran it, which makes your satisfier use the annex as some sort of "scratch space" for a looping construct, where you compute a new annex, loop with that value, and see if that annex is now accepted by the program.
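
The satisfier loop described here can be modeled with a toy stand-in for the script. Everything below is hypothetical: `script_accepts` is not real script execution, and `TRUE_COST` just gives the loop a point at which it settles.

```python
TRUE_COST = 7  # hypothetical number of units the script finally settles on


def script_accepts(annex_len: int) -> bool:
    """Toy script: it demands one more unit than the annex offers,
    until the offer reaches TRUE_COST."""
    demanded = min(annex_len + 1, TRUE_COST)
    return annex_len >= demanded


def find_annex_len(max_rounds: int = 1000) -> int:
    """Satisfier loop: try an annex, and if it is rejected, build a
    bigger one and run the script again."""
    annex_len = 0
    for _ in range(max_rounds):
        if script_accepts(annex_len):
            return annex_len
        annex_len += 1  # compute a new annex and loop with that value
    raise RuntimeError("no accepted annex found within max_rounds")


print(find_annex_len())
```

If the script's demand never stopped growing, the loop would only end at `max_rounds`, which is the worry being raised about witness construction.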

In short, it kinda seems like being able to read the annex off of the stack makes witness construction somehow Turing complete, because we can use it as a register/tape for some sort of computational model.

-------------------

This seems at odds with using the annex as something that just helps you heuristically guess computation costs; now it's somehow something that acts to make script satisfiers recursive.

Because the Annex is signed, and must be the same, this can also be inconvenient:

Suppose that you have a Miniscript that is something like: and(or(PK(A), PK(A')), X, or(PK(B), PK(B'))).

A or A' should sign with B or B'. X is some sort of fragment that might require a value that is unknown (and maybe recursively defined?), so if we send the PSBT to A first, which commits to the annex, and then X reads the annex and says it must be something else, A must sign again. So you might say, run X first, and then sign with A and C or B. However, what if the script somehow detects the bitstring WHICH_A WHICH_B and has a different Annex per selection (e.g., interpret the bitstring as an int and the annex must == that int)? Now, given and(or(K1, K1'),... or(Kn, Kn')) we end up needing to pre-sign 2**n annex values somehow... this seems problematic theoretically.
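
The blow-up is easy to see by enumerating the selections. This is a minimal sketch assuming, as in the example, that the annex must equal the key-selection bitstring read as an integer; the function name is invented for illustration.

```python
from itertools import product


def annex_values(n: int) -> list[int]:
    """One annex value per way of picking Ki or Ki' in each of n or() branches."""
    annexes = []
    for choice in product((0, 1), repeat=n):  # 0 = Ki signs, 1 = Ki' signs
        bits = "".join(str(b) for b in choice)
        annexes.append(int(bits, 2))  # annex must == the bitstring as an int
    return annexes


print(annex_values(2))        # the A/A', B/B' case
print(len(annex_values(10)))  # grows as 2**n
```

Each value in the list is a distinct message that every participating key would have to pre-sign.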

Of course this wouldn't be miniscript then. Because miniscript is just for the well-behaved subset of script, and this seems ill-behaved. So maybe we're OK?

But I think the issue still arises where suppose I have a simple thing like: and(COLD_LOGIC, HOT_LOGIC) where both contain a signature. If COLD_LOGIC and HOT_LOGIC can both have different costs, I need to decide what logic each satisfier for the branch is going to use in advance, or sign all possible sums of both our annex costs? This could come up if cold/hot, e.g., use different numbers of signatures / use checksigCISAadd, which maybe requires an annex argument.
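
As a small illustration of "all possible sums", assume each branch has a known set of possible costs and the annex must carry their total. The cost figures below are invented for the example.

```python
def possible_annex_costs(cold_costs: set[int], hot_costs: set[int]) -> set[int]:
    """Every total the annex might need to commit to, one per pairing of
    a cold-branch cost with a hot-branch cost."""
    return {cold + hot for cold in cold_costs for hot in hot_costs}


cold = {50, 100}      # hypothetical costs of the cold satisfactions
hot = {50, 150, 200}  # hypothetical costs of the hot satisfactions
print(sorted(possible_annex_costs(cold, hot)))
```

Unless the two sides agree on a branch in advance, each signer would need a signature for every total in that set.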



------------

It seems like one good option is if we just go on and banish the OP_ANNEX. Maybe that solves some of this? I sort of think so. It definitely seems like we're not supposed to access it via script, given the quote from above:

*Execute the script, according to the applicable script rules[11], using the witness stack elements excluding the script s, the control block c, and the annex a if present, as initial stack.*

If we were meant to have it, we would have not nixed it from the stack, no? Or would have made the opcode for it as a part of taproot...

But recall that the annex is committed to by the signature.

So it's only a matter of time till we see some sort of Cat and Schnorr Tricks III the Annex Edition that lets you use G cleverly to get the annex onto the stack again, and then it's like we had OP_ANNEX all along, or without CAT, at least something that lets us detect that the value has changed and causes this satisfier looping issue somehow.

Not to mention if we just got OP_TXHASH



-----------

Is the annex bad? After writing this I sort of think so?

One solution would be to... just soft-fork it out. Always must be 0. When we come up with a use case for something like an annex, we can find a way to add it back. Maybe this means somehow pushing multiple annexes and having an annex stack, where only sub-segments are signed for the last executed signature? That would solve looping... but would it break some aggregation thing? Maybe.
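
For reference, here is a sketch of what such a rule might check. The detection logic follows BIP341 (with at least two witness elements, a last element starting with 0x50 is the annex); reading "always must be 0" as "no annex may be present" is an assumption of this sketch, and the function names are invented.

```python
ANNEX_TAG = 0x50  # BIP341: first byte of the annex


def has_annex(witness_stack: list[bytes]) -> bool:
    """BIP341 rule: with two or more witness elements, the last one is
    the annex if its first byte is 0x50."""
    if len(witness_stack) < 2:
        return False
    last = witness_stack[-1]
    return len(last) > 0 and last[0] == ANNEX_TAG


def passes_no_annex_rule(witness_stack: list[bytes]) -> bool:
    """Hypothetical soft-fork rule: reject any input that carries an annex."""
    return not has_annex(witness_stack)


print(passes_no_annex_rule([b"\x01" * 64, b"\x50\x00\x00"]))  # annex present
print(passes_no_annex_rule([b"\x01" * 64]))                   # key-path spend, no annex
```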

Another solution would be to make it so the annex is never committed to and unobservable from the script, but that the annex is always something that you can generate by running get_annex(stack). Thus it is a hint for validation rules, but not directly readable, and if it is modified you figure out the txn was cheaper sometime after you execute the scripts and can decrease the value when you relay. But this sounds like something that needs to be a p2p-only annex, because for consensus we may not care (unless it's something like preallocating memory for validation?).

-----------------------

Overall my preference is -- perhaps sadly -- looking like we should soft-fork it out of our current Checksig (making the policy that it must be 0 a consensus rule) and redesign the annex technique later when we actually know what it is for with a new checksig or other mechanism. But it's not a hard opinion! It just seems like you can't practically use the annex for this worklimit type thing *and* observe it from the stack meaningfully.

Thanks for coming to my ted-talk,

Jeremy


_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev