summaryrefslogtreecommitdiff
path: root/2a/8a16a4991261a82562ec08a6dd074dd578afca
blob: 5209090eecf54e5a46f327dc1d871c5ddaf722a7 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
Return-Path: <ZmnSCPxj@protonmail.com>
Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138])
 by lists.linuxfoundation.org (Postfix) with ESMTP id F4137C000B
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  4 Mar 2022 23:11:07 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by smtp1.osuosl.org (Postfix) with ESMTP id D4B8481D5C
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  4 Mar 2022 23:11:07 +0000 (UTC)
X-Virus-Scanned: amavisd-new at osuosl.org
X-Spam-Flag: NO
X-Spam-Score: -1.602
X-Spam-Level: 
X-Spam-Status: No, score=-1.602 tagged_above=-999 required=5
 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
 DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001,
 FROM_LOCAL_NOVOWEL=0.5, RCVD_IN_MSPIKE_H2=-0.001,
 SPF_HELO_PASS=-0.001, SPF_PASS=-0.001]
 autolearn=ham autolearn_force=no
Authentication-Results: smtp1.osuosl.org (amavisd-new);
 dkim=pass (2048-bit key) header.d=protonmail.com
Received: from smtp1.osuosl.org ([127.0.0.1])
 by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id 7TW0MZ91gTUh
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  4 Mar 2022 23:11:02 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0
Received: from mail-40130.protonmail.ch (mail-40130.protonmail.ch
 [185.70.40.130])
 by smtp1.osuosl.org (Postfix) with ESMTPS id 1F279818D0
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  4 Mar 2022 23:11:02 +0000 (UTC)
Date: Fri, 04 Mar 2022 23:10:48 +0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com;
 s=protonmail3; t=1646435458;
 bh=LDZ/IaXPrOQBbKTzraFDE+g98KrqgYQdX1T4sUc4EiI=;
 h=Date:To:From:Reply-To:Subject:Message-ID:In-Reply-To:References:
 From:To:Cc:Date:Subject:Reply-To:Feedback-ID:Message-ID;
 b=YeLizwtsqiwXb7JLxK862qpGx4Hk7lilhPrlw3Yl/CLJnYHdYnkLxd4eZ1dBy94RR
 yHgubXh5VFR0PcjJWWdEg15b2TfJhYTD2gqp1jq8o39oHlnmuuyUEyL5C3mM4emBw5
 eRM5fthb1ssN6Q9Yn0V8bdR8HUfONLbIlJi0qOIus0vE646+fUn4qY0a2JJSYyj6/N
 odph60hJ1MLrAWKr9mZ+hP93vx7a8aBDNt9S7XKxUGWJPUxK3YUjE10MwMBA9CzJlU
 bZcsuFLTYuq78/4NlM1TwXz9duZtV7tMF7zAd+1GdVcfI7gwe85vp9i/PF6MUcNECS
 7N/KGQLTSNMfg==
To: Anthony Towns <aj@erisian.com.au>,
 Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
From: ZmnSCPxj <ZmnSCPxj@protonmail.com>
Reply-To: ZmnSCPxj <ZmnSCPxj@protonmail.com>
Message-ID: <0yCTRKhBa9IPPg5J4HfKxraWJ4w6gUS5LRAoCPk01NpbYk-9R5zxAOmJO1Z8voUiatUJugYB6Oa9t1wFLbhQSgDie8hBzr0Z1EJVm6XGvMI=@protonmail.com>
In-Reply-To: <20220304010442.GC3869@erisian.com.au>
References: <20220304010442.GC3869@erisian.com.au>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [bitcoin-dev] bitcoin scripting and lisp
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, 
 <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, 
 <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 04 Mar 2022 23:11:08 -0000


Good morning aj,

> On Sun, Feb 27, 2022 at 04:34:31PM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > In reaction to this, AJ Towns mailed me privately about some of his
> > thoughts on this insane `OP_EVICT` proposal.
> > He observed that we could generalize the `OP_EVICT` opcode by
> > decomposing it into smaller parts, including an operation congruent
> > to the Scheme/Haskell/Scala `map` operation.
>
> At much the same time Zman was thinking about OP_FOLD and in exactly the
> same context, I was wondering what the simplest possible language that
> had some sort of map construction was -- I mean simplest in a "practical
> engineering" sense; I think Simplicity already has the Euclidean/Peano
> "least axioms" sense covered.
>
> The thing that's most appealing to me about bitcoin script as it stands
> (beyond "it works") is that it's really pretty simple in an engineering
> sense: it's just a "forth" like system, where you put byte strings on a
> stack and have a few operators to manipulate them. The alt-stack, and
> supporting "IF" and "CODESEPARATOR" add a little additional complexity,
> but really not very much.
>
> To level-up from that, instead of putting byte strings on a stack, you
> could have some other data structure than a stack -- eg one that allows
> nesting. Simple ones that come to mind are lists of (lists of) byte
> strings, or a binary tree of byte strings [0]. Both those essentially
> give you a lisp-like language -- lisp is obviously all about lists,
> and a binary tree is just made of things or pairs of things, and pairs
> of things are just another way of saying "car" and "cdr".
>
> A particular advantage of lisp-like approaches is that they treat code
> and data exactly the same -- so if we're trying to leave the option open
> for a transaction to supply some unexpected code on the witness stack,
> then lisp handles that really naturally: you were going to include data
> on the stack anyway, and code and data are the same, so you don't have
> to do anything special at all. And while I've never really coded in
> lisp at all, my understanding is that its biggest problems are all about
> doing things efficiently at large scales -- but script's problem space
> is for very small scale things, so there's at least reason to hope that
> any problems lisp might have won't actually show up for this use case.

I heartily endorse LISP --- it has a trivial implementation of `eval` that =
is easily implementable once you have defined a proper data type in preferr=
ed-language-here to represent LISP datums.
Combine it with your idea of committing to a max-number-of-operations (whic=
h increases the weight of the transaction) and you may very well have somet=
hing viable.
(In particular, even though `eval` is traditionally (re-)implemented in LIS=
P itself, the limit on max-number-of-operations means any `eval` implementa=
tion within the same language is also forcibly made total.)

Of note is that the supposed "problem at scale" of LISP is, as I understand=
 it, due precisely to its code and data being homoiconic to each other.
This homoiconicity greatly tempts LISP programmers to use macros, i.e. prog=
rams that generate other programs from some input syntax.
Homoiconicity means that one can manipulate code just as easily as the data=
, and thus LISP macros are a trivial extension on the language.
This allows each LISP programmer to just code up a macro to expand common p=
atterns.
However, each LISP programmer then ends up implementing *different*, but *s=
imilar* macros from each other.
Unfortunately, programming at scale requires multiple programmers speaking =
the same language.
Then programming at scale is hampered because each LISP programmer has thei=
r own private dialect of LISP (formed from the common LISP language and fro=
m their own extensive set of private macros) and intercommunication between=
 them is hindered by the fact that each one speaks their own private dialec=
t.
Some LISP-like languages (e.g. Scheme) have classically targeted a "small" =
subset of absolutely-necessary operations, and each implementation of the l=
anguage immediately becomes a new dialect due to having slightly different =
forms for roughly the same convenience function or macro, and *then* indivi=
dual programmers build their own private dialect on top.
For Scheme specifically, R7RS has targeted providing a "large" standard as =
well, as did R6RS (which only *had* a "large" standard), but individual Sch=
eme implementations have not always liked to implement *all* the "large" st=
andard.

Otherwise, every big C program contains a half-assed implementation of half=
 of Common LISP, so ----


> -   I don't think execution costing takes into account how much memory
>     is used at any one time, just how much was allocated in total; so
>     the equivalent of (OP_DUP OP_DROP OP_DUP OP_DROP ..) only has the
>     allocations accounted for, with no discount given for the immediate
>     freeing, so it gets treated as having the same cost as (OP_DUP
>     OP_DUP .. OP_DROP OP_DROP ..). Doing it that way would be a worse
>     than how bitcoin script is currently costed, but doing better might
>     mean locking in an evaluation method at the consensus level. Seems
>     worth looking into, at least.

This may depend on the language that the interpreter is written in.

For example, on a typical GC language, both the N * `OP_DUP OP_DROP` and th=
e N * `OP_DUP` + N * `OP_DROP` will have similar behavior when allocated at=
 the nursery.
Since the GC nursery acts as a large buffer of potential allocations, the a=
mount of work done in both cases would be the same, at least until the numb=
er of allocs exceeds the nursery size.

Alternately, the implementation may use immutable byte vectors, in which ca=
se `OP_DUP` is just a pointer copy.
Or alternately the implementation may use copy-on-write byte vectors, in wh=
ich case `OP_DUP` is just a pointer copy plus refcount increment, and `OP_D=
ROP` is just a refcount decrement, and the amount of memory used remains sm=
all.




>     There's two ways to think about upgradability here; if someday we wan=
t
>     to add new opcodes to the language -- perhaps something to validate z=
ero
>     knowledge proofs or calculate sha3 or use a different ECC curve, or s=
ome
>     way to support cross-input signature aggregation, or perhaps it's jus=
t
>     that some snippets are very widely used and we'd like to code them in
>     C++ directly so they validate quicker and don't use up as much block
>     weight. One approach is to just define a new version of the language
>     via the tapleaf version, defining new opcodes however we like.
>
>     The other is to use the "softfork" opcode -- chia defines it as:
>
>     (softfork cost code)
>
>     though I think it would probably be better if it were
>
>     (softfork cost version code)
>
>     where the idea is that "code" will use the "x" opcode if there's a
>     problem, and anyone supporting the "version" softfork can verify that
>     there aren't any problems at a cost of "cost". However, whether you
>     do or don't support that softfork, as far as the rest of the script i=
s
>     concerned, the expression will either fail entirely or evaluate as ze=
ro;
>     so anyone who doesn't support the softfork can just replace it with z=
ero
>     and continue on, treating it as if it had costed "cost" units.
>
>     One thing worth noting: "softfork" behaves more like OP_NOP than
>     tapscript's OP_SUCCESS -- I think it's just not possible in general t=
o
>     have OP_SUCCESS-like behaviour if you're trying to allow accepting co=
de
>     from the witness data -- otherwise as soon as you reveal that your sc=
ript
>     does accept arbitrary code supplied by the spender, someone could sti=
ck
>     in an OP_SUCCESS code, and remove all the restrictions on spending an=
d
>     steal your funds.

Oh no `1 RETURN`!

Well, the advantage of chialisp here is that it enables the opcode for a *b=
lock* of code, so the opcode *itself* could return arbitrary data, it is ju=
st the entire block of code that is restricted to returning `0`.
So it would be something like an `OP_BEGINSOFTFORK .... OP_ENDSOFTFORK` whe=
re any reserved opcodes in the middle have arbitrary behavior, the entire b=
lock gets a *copy* of the current stack and alt stack, with any changes to =
the stack / alt stack disappear after the `OP_ENDSOFTFORK`.
Thus, the entire block as a whole would act as an `OP_NOP`.
(OG Bitcoin SCRIPT FTW!)

I think the `softfork` form would have to be a syntax though and not a proc=
edure, as I think you want `cost` to be statically determined, and very lik=
ely also `version`.

Regards,
ZmnSCPxj