Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 5915FB4A for ; Mon, 2 Oct 2017 17:16:01 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-ua0-f172.google.com (mail-ua0-f172.google.com [209.85.217.172]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id EF29E478 for ; Mon, 2 Oct 2017 17:15:59 +0000 (UTC) Received: by mail-ua0-f172.google.com with SMTP id i35so1029357uah.9 for ; Mon, 02 Oct 2017 10:15:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blockstream-io.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=enbTFmMcjMu0hFAYlZBG6nGxsja/MS7oUhAOrLflxl8=; b=UywB095ow2mDCkfVW7767YeSSAjC7HnLGFgKj7gBHb+VI+EtsFGY2ClHkfTcuI+P46 uBIYSBR+7SMEaGj/BjzI5px502it8agbxpcmOQMdH3AIdKhrpiHNc0yhhdy8bFDWumOi qsByWg+LKARWMf2agwCVA8J8bqpRQca3uDkDDshdgSoAjCgEp9+YJVr81EZuPpn8DQH9 Hg0+b/HYgT1hlmsVUA3c83/H38uXYptoSQfZCNXerG9iGMhRr1F5LdeaS3bRtfonOytZ 5r+p2nLsvrZH6FN3NC7T3DqG0WgwwCzGjzR5d7QnCCMRdQnB02WC23dUI2eEJY7xFRpE mxLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=enbTFmMcjMu0hFAYlZBG6nGxsja/MS7oUhAOrLflxl8=; b=CfPqLNyigp5ALPYMXNw/Q1fU3mhYkxtcyJqlnVYbrsJZXZJg6XlPXIZCD5dJKpBRhf qTf6oTzfqvdhxw2RzHgB3aWfS+Z+DcAlgCG+cqpn/WWtKRbYuVopM/f4MzIYIfEVv08S bEjm8WivZT4afH+EH1x+iG+xwABFFwd9PxtlVKn3NEbGTcwiuPzrBp3kgaFgO+Ba4Y27 WAd8kpUobf2GEAliaDX5gk9mOSvP1o/W3lAHhWwH+hHAUF6ETAO2rmYnyqHUfvbfevg5 zXRj3NE4q8DAkHpGDQzo1SMRgs2DK6pCdQECuD/k9fn2R9Ta4xeIavVfbTXyA5eyahgi x/TQ== X-Gm-Message-State: AHPjjUjCglBvp5DW67RXVjYG2HR9PG0ervZm6OYD8nh1q1vVHK/dzbWv hmLppVXiiYw9CQp7mLcZsGuWO1ZgHUImXa/Q43UJUA== X-Google-Smtp-Source: AOwi7QCDAvFv9GWyu3fpuBT5nhQDoZN75UZiLTnnYeQJqnouY3/Lg3UBa/glpotSKgxIuA56g4Nq/mP7pWk/jzk3NK0= X-Received: by 10.159.37.201 with SMTP id 67mr9193744uaf.59.1506964558926; Mon, 02 Oct 2017 10:15:58 -0700 (PDT) MIME-Version: 1.0 Received: by 10.159.41.161 with HTTP; Mon, 2 Oct 2017 10:15:38 -0700 (PDT) In-Reply-To: <921EB5CF-B472-4BD6-9493-1A681586FB51@friedenbach.org> References: <5B6756D0-6BEF-4A01-BDB8-52C646916E29@friedenbach.org> <201709302323.33004.luke@dashjr.org> <921EB5CF-B472-4BD6-9493-1A681586FB51@friedenbach.org> From: "Russell O'Connor" Date: Mon, 2 Oct 2017 13:15:38 -0400 Message-ID: To: Mark Friedenbach , Bitcoin Protocol Discussion Content-Type: multipart/alternative; boundary="001a113c4f3c3975b8055a9387ec" X-Spam-Status: No, score=0.0 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, HTML_MESSAGE,RCVD_IN_DNSWL_NONE autolearn=disabled version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: Re: [bitcoin-dev] Merkle branch verification & tail-call semantics for generalized MAST X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Oct 2017 17:16:01 -0000 --001a113c4f3c3975b8055a9387ec Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable (Subject was: [bitcoin-dev] Version 1 witness programs (first draft)), but I'm moving part of that conversation to this thread. On Sun, Oct 1, 2017 at 5:32 PM, Johnson Lau wrote: > 3. Do we want to allow static analysis of sigop? > BIP114 and the related proposals are specifically designed to allow stati= c > analysis of sigop. I think this was one of the main reason of OP_EVAL not > being accepted. This was also the main reason of Ethereum failing to do a > DAO hacker softfork, leading to the ETH/ETC split. I=E2=80=99m not sure i= f we > really want to give up this property. Once we do it, we have to support i= t > forever. I would very much like to retain the ability to do static analysis. More generally, the idea of interpreting arbitrary data as code, as done in OP_EVAL and in TAILCALL, makes me quite anxious. This at the root of many security problems throughout the software industry, and I don't relish giving more fuel to the underhanded Bitcoin Script contestants. On Sun, Oct 1, 2017 at 8:45 PM, Luke Dashjr wrote: > > 3. Do we want to allow static analysis of sigop? > > BIP114 and the related proposals are specifically designed to allow > static > > analysis of sigop. I think this was one of the main reason of OP_EVAL n= ot > > being accepted. This was also the main reason of Ethereum failing to do= a > > DAO hacker softfork, leading to the ETH/ETC split. I=E2=80=99m not sure= if we > > really want to give up this property. Once we do it, we have to support > it > > forever. > > It seems inevitable at this point. Maybe we could add a separate > "executable- > witness" array (in the same manner as the current witness was softforked > in), > and require tail-call and condition scripts to merely reference these by > hash, > but I'm not sure it's worth the effort? > > Thinking further, we could avoid adding a separate executable-witness > commitment by either: > A) Define that all the witness elements in v1 are type-tagged (put the > minor > witness version on them all, and redefine minor 0 as a stack item?); o= r > B) Use an empty element as a delimiter between stack and executable items= . > > To avoid witness malleability, the executable items can be required to be > sorted in some manner. > > The downside of these approaches is that we now need an addition 20 or 32 > bytes per script reference... which IMO may possibly be worse than losing > static analysis. I wonder if there's a way to avoid that overhead? > Actually, I have a half-baked idea I've been thinking about along these lines. The idea is to add a flag to each stack item in the Script interpreter to mark whether the item in the stack is "executable" or "non-executable", not so different from how computers mark pages to implement executable space protection. By default, all stack items are marked "non-executable". We then redefine OP_PUSHDATA4 as OP_PUSHCODE within ScriptSigs. The operational semantics of OP_PUSHCODE would remain the same as OP_PUSHDATA4 except it would set the pushed item's associated flag to "executable". All data pushed by OP_PUSHCODE would be subject to the sigops limits and any other similar static analysis limits. Segwit v0 doesn't use OP_PUSHDATA codes to create the input stack, so we would have to mark executable input stack items using a new witness v1 format. But, IIUC, TAILCALL isn't going to be compatible with Segwit v0 anyway. During a TAILCALL, it is required that the top item on the stack have the "executable" flag, otherwise TAILCALL is not used (and the script succeeds or fails based on the top item's data value as usual). All other operations can treat "executable" items as data, including the merkle branch verification. None of the Script operations can create "executable" items; in particular, OP_PUSHDATA4 within the ScriptPubKey also would not create "executable" items. (We can talk about the behaviour of OP_CAT when that time comes). One last trick is that when "executable" values are duplicated, by OP_DUP, OP_IFDUP, OP_PICK. then the newly created copy of the value on top of the stack is marked "non-executable". Because we make the "executable" flag non-copyable, we are now free to allow unbounded uses of TAILCALL (i.e. TAILCALL can be used multiplie times in a single input). Why is this safe? Because the number of "executable" items decreases by at least one every time TAILCALL is invoked. the number of OP_PUSHCODE occurrences in the witness puts an upper bound on the number of invocations of TAILCALL allowed. Using static analysis of the script pubkey and the data within the OP_PUSHCODE data, we compute an upper bound on the number of operations (of any type) that can occur during execution. Unbounded TAILCALL should let us (in the presence of OP_CHECKSIGFROMSTACK) have unbounded delegation. Overall, I believe that OP_PUSHCODE 1. is fully backwards compatible. 2. maintains our ability to perform static analysis with TAILCALL. 3. never lets us interpret computed values as executable code. 4. extends TAILCALL to safely allow multiple TAILCALLs per script. --001a113c4f3c3975b8055a9387ec Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
(Subject was: [bitcoin-dev] Ver= sion 1 witness programs (first draft)), but I'm moving part of that con= versation to this thread.

On Sun, Oct 1, 2017 at 5:32 PM, Johnson Lau <jl2012@xb= t.hk> wrote:
3. Do we want to allow static analysis of sigop?
BIP114 and the related proposals are specifically designed to allow=20 static analysis of sigop. I think this was one of the main reason of=20 OP_EVAL not being accepted. This was also the main reason of Ethereum=20 failing to do a DAO hacker softfork, leading to the ETH/ETC split. I=E2=80= =99m=20 not sure if we really want to give up this property. Once we do it, we=20 have to support it forever.

I would very mu= ch like to retain the ability to do static analysis.=C2=A0 More generally, = the idea of interpreting arbitrary data as code, as done in OP_EVAL and in = TAILCALL, makes me quite anxious.=C2=A0 This at the root of many security p= roblems throughout the software industry, and I don't relish giving mor= e fuel to the underhanded Bitcoin Script contestants.
=C2=A0<= br>
On Sun, Oct 1= , 2017 at 8:45 PM, Luke Dashjr <luke@dashjr.org> wrote:
> 3. Do we want to allow static analysis of sigop?
> BIP114 and the related proposals are specifically designed to allow st= atic
> analysis of sigop. I think this was one of the main reason of OP_EVAL = not
> being accepted. This was also the main reason of Ethereum failing to d= o a
> DAO hacker softfork, leading to the ETH/ETC split. I=E2=80=99m not sur= e if we
> really want to give up this property. Once we do it, we have to suppor= t it
> forever.

It seems inevitable at this point. Maybe we could add a separate &qu= ot;executable-
witness" array (in the same manner as the current witness was softfork= ed in),
and require tail-call and condition scripts to merely reference these by ha= sh,
but I'm not sure it's worth the effort?

Thinking further, we could avoid adding a separate executable-witness
commitment by either:
A) Define that all the witness elements in v1 are type-tagged (put the mino= r
=C2=A0 =C2=A0witness version on them all, and redefine minor 0 as a stack i= tem?); or
B) Use an empty element as a delimiter between stack and executable items.<= br>
To avoid witness malleability, the executable items can be required to be sorted in some manner.

The downside of these approaches is that we now need an addition 20 or 32 bytes per script reference... which IMO may possibly be worse than losing static analysis. I wonder if there's a way to avoid that overhead?

Ac= tually, I have a half-baked idea I've been thinking about along these l= ines.

The idea is to add a flag to each stack item= in the Script interpreter to mark whether the item in the stack is "e= xecutable" or "non-executable", not so different from how co= mputers mark pages to implement executable space protection.=C2=A0 By defau= lt, all stack items are marked "non-executable".=C2=A0 We then re= define OP_PUSHDATA4 as OP_PUSHCODE within ScriptSigs.=C2=A0 The operational= semantics of OP_PUSHCODE would remain the same as OP_PUSHDATA4 except it w= ould set the pushed item's associated flag to "executable".= =C2=A0 All data pushed by OP_PUSHCODE would be subject to the sigops limits= and any other similar static analysis limits.

Seg= wit v0 doesn't use OP_PUSHDATA codes to create the input stack, so we w= ould have to mark executable input stack items using a new witness v1 forma= t. But, IIUC, TAILCALL isn't going to be compatible with Segwit v0 anyw= ay.

During a TAILCALL, it is required that the= top item on the stack have the "executable" flag, otherwise TAIL= CALL is not used (and the script succeeds or fails based on the top item= 9;s data value as usual).

All other operations can= treat "executable" items as data, including the merkle branch ve= rification.=C2=A0 None of the Script operations can create "executable= " items; in particular, OP_PUSHDATA4 within the ScriptPubKey also woul= d not create "executable" items.=C2=A0 (We can talk about the beh= aviour of OP_CAT when that time comes).

One la= st trick is that when "executable" values are duplicated, by OP_D= UP, OP_IFDUP, OP_PICK. then the newly created copy of the value on top of t= he stack is marked "non-executable".

Bec= ause we make the "executable" flag non-copyable, we are now free = to allow unbounded uses of TAILCALL (i.e. TAILCALL can be used multiplie ti= mes in a single input).=C2=A0 Why is this safe?=C2=A0 Because the number of= "executable" items decreases by at least one every time TAILCALL= is invoked. the number of OP_PUSHCODE occurrences in the witness puts an u= pper bound on the number of invocations of TAILCALL allowed.=C2=A0 Using st= atic analysis of the script pubkey and the data within the OP_PUSHCODE data= , we compute an upper bound on the number of operations (of any type) that = can occur during execution.

Unbounded TAILCALL sho= uld let us (in the presence of OP_CHECKSIGFROMSTACK) have unbounded delegat= ion.

Overall, I believe that OP_PUSHCODE
=

1. is fully backwards compatible.
2. main= tains our ability to perform static analysis with TAILCALL.
3. ne= ver lets us interpret computed values as executable code.
4. exte= nds TAILCALL to safely allow multiple TAILCALLs per script.
=
--001a113c4f3c3975b8055a9387ec--