From: Bram Cohen
Date: Sun, 6 Mar 2022 22:26:47 -0800
To: Bitcoin Protocol Discussion
Subject: [bitcoin-dev] bitcoin scripting and lisp

> After looking into it, I actually think chia lisp [1] gets pretty much all
> the major design decisions pretty much right. There are obviously a few
> changes needed given the differences in design between chia and bitcoin:
>
>  - having secp256k1 signatures (and curve operations), instead of
>    BLS12-381 ones
>
>  - adding tx introspection instead of having bundle-oriented CREATE_COIN,
>    and CREATE/ASSERT results [10]

Bitcoin uses the UTXO model as opposed to Chia's Coin Set model. These are close enough that Chia is often described as using the UTXO model, but that isn't technically true. Relevant to the above comment is that in the UTXO model transactions get passed to a scriptpubkey, which either fails an assert or doesn't, while in the coin set model each puzzle (scriptpubkey) gets run and either fails an assert or returns a list of extra conditions it has, possibly including timelocks, creating new coins, paying fees, and other things.
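
As a rough sketch of that difference, here is a toy model in Python. Every name in it (the functions, the condition tuples, the dict layout of a transaction) is made up for illustration and is not Bitcoin's or Chia's actual API; only CREATE_COIN is a name taken from the discussion above.

```python
from typing import Callable, List, Tuple

# Hypothetical condition tuples; apart from CREATE_COIN the names are invented.
Condition = Tuple[str, int]


class AssertFailed(Exception):
    """Raised when a script or puzzle rejects the spend."""


def run_utxo_script(script: Callable[[dict], bool], tx: dict) -> bool:
    """UTXO model: the script is handed the transaction and only passes or fails."""
    if not script(tx):
        raise AssertFailed("scriptpubkey rejected the transaction")
    return True


def run_coin_set_puzzle(
    puzzle: Callable[[dict], List[Condition]], solution: dict
) -> List[Condition]:
    """Coin set model: the puzzle fails or returns the conditions of the spend."""
    return puzzle(solution)


def example_script(tx: dict) -> bool:
    # Passes only if the outputs of the transaction add up to at least 10.
    return sum(tx["outputs"]) >= 10


def example_puzzle(solution: dict) -> List[Condition]:
    if solution["amount"] <= 0:
        raise AssertFailed("amount must be positive")
    # The puzzle itself states what the spend does: new coin, fee, timelock.
    return [("CREATE_COIN", solution["amount"]), ("PAY_FEE", 1), ("TIMELOCK", 100)]
```

In the first case the transaction exists before the script runs and the script can only veto it; in the second the list returned by the puzzle is what defines the effects of the spend.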

If you're doing everything from scratch it's cleaner to go with the coin set model, but when retrofitting onto existing Bitcoin it may be best to leave the UTXO model intact and compensate by adding a bunch more opcodes which are specific to parsing Bitcoin transactions. The transaction format itself can be mostly left alone, but to enable some of the extra tricks (mostly implementing capabilities) it's probably a good idea to make new conventions for how a transaction can carry advisory information which specifies which of the inputs to a transaction is the parent of a specific output, and also info which is used for communication between the UTXOs in a transaction.
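
To make the advisory information a bit more concrete, here is one possible shape for it. This is purely a hypothetical layout (the class, field names and validity check are all assumptions, no such convention exists yet).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AdvisoryInfo:
    # Hypothetical: output index -> index of the input that is its parent.
    parent_of_output: Dict[int, int] = field(default_factory=dict)
    # Hypothetical: messages passed between the UTXOs of one transaction.
    messages: List[bytes] = field(default_factory=list)


def check_advisory(info: AdvisoryInfo, n_inputs: int, n_outputs: int) -> bool:
    """The advisory data is only usable if every index refers to a real slot."""
    return all(
        0 <= out < n_outputs and 0 <= parent < n_inputs
        for out, parent in info.parent_of_output.items()
    )
```

For example, `AdvisoryInfo({0: 1, 1: 1})` would say that both outputs of a two-input transaction descend from the second input.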

But one could also make lisp-generated UTXOs be based off transactions which look completely trivial and have all their important information be stored separately in a new vbytes area. That works but results in a bit of a dual identity where some coins have both an old-style id and a new-style id, which gunks up what

>  - serialization seems to be a bit verbose -- 100kB of serialized clvm
>    code from a random block gzips to 60kB; optimising the serialization
>    for small lists, and perhaps also for small literal numbers might be
>    a feasible improvement; though it's not clear to me how frequently
>    serialization size would be the limiting factor for cost versus
>    execution time or memory usage.

A lot of this is because there's a hook for doing compression at the consensus layer which isn't being used aggressively yet. That one has the downside that the combined cost of transactions can add up very nonlinearly, but when you have constantly repeated bits of large boilerplate it gets close and there isn't much of an alternative. That said, even with that form of compression maxed out it's likely that gzip could still do some compression, but that would be better done in the database and in wire protocol formats rather than by changing the format which is hashed at the consensus layer.

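A minimal sketch of that separation, assuming SHA-256 as the hash and a made-up byte string standing in for serialized program code (neither is meant to reflect what any chain actually does): the id is computed over the uncompressed bytes, and gzip is applied only on the way into storage or onto the wire.

```python
import gzip
import hashlib


def consensus_id(serialized: bytes) -> bytes:
    """Hash the canonical, uncompressed serialization."""
    return hashlib.sha256(serialized).digest()


def store(serialized: bytes) -> bytes:
    """Compress only for the database or the wire protocol."""
    return gzip.compress(serialized)


def load(stored: bytes) -> bytes:
    """Recover the canonical bytes before hashing or executing them."""
    return gzip.decompress(stored)


# Stand-in for serialized code with lots of repeated boilerplate.
program = b"(a (q . boilerplate) 1)" * 200
stored = store(program)
```

Because the hash never sees the compressed form, the compression scheme can be changed later without touching anything consensus-critical.
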
> Pretty much all the opcodes in the first section are directly from chia
> lisp, while all the rest are to complete the "bitcoin" functionality.
> The last two are extensions that are more food for thought than a real
> proposal.

Are you thinking of this as a completely alternative script format or an extension to bitcoin script? They're radically different approaches and it's hard to see how they mix. Everything in lisp is completely sandboxed, and that property is important to a lot of things, and it's really normal to be given a reveal of a scriptpubkey and be able to rely on your parsing of it.

> There's two ways to think about upgradability here; if someday we want
> to add new opcodes to the language -- perhaps something to validate zero
> knowledge proofs or calculate sha3 or use a different ECC curve, or some
> way to support cross-input signature aggregation, or perhaps it's just
> that some snippets are very widely used and we'd like to code them in
> C++ directly so they validate quicker and don't use up as much block
> weight. One approach is to just define a new version of the language
> via the tapleaf version, defining new opcodes however we like.

A nice side benefit of sticking with the UTXO model is that the soft fork hook can be that all unknown opcodes make the entire thing automatically pass.
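
Roughly what that hook looks like, as a toy stack evaluator. The opcode names and the evaluator itself are invented for illustration and don't model real Bitcoin script.

```python
from typing import List, Union

# Hypothetical toy opcode set; integers in a script are treated as pushes.
KNOWN_OPCODES = {"ADD", "EQUAL"}


def evaluate(script: List[Union[str, int]]) -> bool:
    """Return True if the spend is allowed under the currently known rules."""
    # Scan first: any unknown opcode makes the whole script pass, so a later
    # soft fork is free to give that opcode a stricter meaning.
    if any(isinstance(op, str) and op not in KNOWN_OPCODES for op in script):
        return True
    stack: List[int] = []
    for op in script:
        if isinstance(op, int):
            stack.append(op)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(int(a == b))
    # The script passes if it leaves a nonzero value on top of the stack.
    return bool(stack) and stack[-1] != 0
```

This only works because a script in the UTXO model has nothing to report beyond pass or fail, so "pass" is a well-defined answer for code that old nodes can't interpret.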

> The other is to use the "softfork" opcode -- chia defines it as:
>
>   (softfork cost code)
>
> though I think it would probably be better if it were
>
>   (softfork cost version code)

Since softfork has to date never been used, that second parameter is technically completely ignored and could be anything at all. Most likely a convention including some kind of version information will be created the first time it's used. Also, Chia shoves total cost into blocks at the consensus layer out of an abundance of caution, although that isn't technically necessary.
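
A toy model of that behaviour, not the actual CLVM implementation: the function name, the budget handling and the nil return value are all assumptions made for the sketch. The only thing it takes from the above is that the declared cost is charged and everything after it is ignored.

```python
from typing import Any, Tuple

NIL: Tuple = ()  # stand-in for lisp nil


class CostExceeded(Exception):
    """Raised when the declared cost doesn't fit in the remaining budget."""


def op_softfork(args: Tuple[Any, ...], remaining_cost: int) -> Tuple[Tuple, int]:
    """Charge the declared cost and ignore every argument after it.

    Assumption: the operator returns nil and never inspects the rest.
    """
    cost = args[0]
    if not isinstance(cost, int) or cost < 1:
        raise ValueError("cost must be a positive integer")
    if cost > remaining_cost:
        raise CostExceeded("declared cost exceeds the remaining budget")
    return NIL, remaining_cost - cost
```

Under this model `(softfork 50 code)` and `(softfork 50 version code)` are indistinguishable today, which is why a version convention can wait until the first real use.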

> [10] [9] The CREATE/ASSERT bundling stuff is interesting; and could be
>     used to achieve functionality like the "transaction sponsorship"
>     stuff. It doesn't magically solve the issues with maintaining the
>     mempool and using that to speed up block acceptance, though, and
>     the chia chain has apparently suffered from mempool-flooding attacks
>     recently [11] so I don't think they've solved the broader problem,

Chia's approach to transaction fees is essentially identical to Bitcoin's, although a lot fewer things in the ecosystem support fees due to not having needed them yet. I don't think mempool issues have much to do with the choice of scriptpubkey language, which is mostly about adding in covenants and capabilities.

That said, Ethereum does have trivial aggregation of unrelated transactions, at the expense of almost everything else. There are a bunch of ways automatic aggregation functionality could be added to coin set mempools by giving them some understanding of the semantics of some transactions, but that hasn't been implemented yet.

I previously posted some thoughts about this here: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
