Return-Path: Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) by lists.linuxfoundation.org (Postfix) with ESMTP id CEBF9C000A for ; Fri, 16 Apr 2021 18:12:28 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id BB29D84C86 for ; Fri, 16 Apr 2021 18:12:28 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org X-Spam-Flag: NO X-Spam-Score: -4.199 X-Spam-Level: X-Spam-Status: No, score=-4.199 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jrljF9F0xxCd for ; Fri, 16 Apr 2021 18:12:24 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by smtp1.osuosl.org (Postfix) with ESMTPS id 84EFA84C83 for ; Fri, 16 Apr 2021 18:12:24 +0000 (UTC) Received: from mail-il1-f170.google.com (mail-il1-f170.google.com [209.85.166.170]) (authenticated bits=0) (User authenticated as jlrubin@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 13GICMJZ005102 (version=TLSv1/SSLv3 cipher=AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 16 Apr 2021 14:12:23 -0400 Received: by mail-il1-f170.google.com with SMTP id b17so23852567ilh.6 for ; Fri, 16 Apr 2021 11:12:22 -0700 (PDT) X-Gm-Message-State: AOAM532W59ObkhgOxK9g5gIkBe9csuDWdSluU3t/6F7GnRvrkjQe6FjB mho8wDdJWWmMRGFnwQDOjaPLD996dMd/6Sw5EX4= X-Google-Smtp-Source: ABdhPJwKTkA/iuYF9EbTcdcsIPugUxBy3tjGjC7q1PLsbOKEiphxlM2nkaTpQE/bKNdmxKcLssnaCLwGmE8DXlamYrs= X-Received: by 2002:a05:6e02:156c:: with SMTP id k12mr5651195ilu.49.1618596742239; Fri, 16 Apr 2021 11:12:22 -0700 (PDT) MIME-Version: 1.0 References: <9jIB9EQTV4UY5OieVmn5P791NA3Oq3WJ2EXaG0fqscLxQx913zD6ds46YHXAie9JvKuZ0CpiRUUiXjHuosfyuzFdlQhgOUeiolZaca8_zQA=@protonmail.com> In-Reply-To: <9jIB9EQTV4UY5OieVmn5P791NA3Oq3WJ2EXaG0fqscLxQx913zD6ds46YHXAie9JvKuZ0CpiRUUiXjHuosfyuzFdlQhgOUeiolZaca8_zQA=@protonmail.com> From: Jeremy Date: Fri, 16 Apr 2021 11:12:10 -0700 X-Gmail-Original-Message-ID: Message-ID: To: ZmnSCPxj Content-Type: multipart/alternative; boundary="000000000000db1cda05c01aeb52" Cc: Bitcoin Protocol Discussion Subject: Re: [bitcoin-dev] Designing Bitcoin Smart Contracts with Sapio (available on Mainnet today) X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Apr 2021 18:12:28 -0000 --000000000000db1cda05c01aeb52 Content-Type: text/plain; charset="UTF-8" Hi ZmnSCPxj, Funny you mention BlueSpec -- I actually took 6.175[1] in my undergraduate studies with Arvind, BlueSpec's creator, and have often cited it as an inspiration for Sapio given that the target of program compilation is essentially a transaction circuit and I have a decent amount of experience working in BlueSpec. It's entirely the point that Sapio program expansion is Turing complete whereas the resultant program is "fixed". Via updatable finish clauses (which require some signature) it's also possible to track the logic for what states should be generated upon cooperation of parties, making Sapio turing complete description language for federated operators. I've previously described it as follows: *But how is Sapio different from Miniscript? From Solidity? A metaphor that I like is that tools like Miniscript operate at the physical-key level (house keys, car keys, train pass, office key). Sapio operates at the commute level. Miniscript answers, "how do I unlock my car? How do I pay for the train? How do I get in my office?"*. *Sapio lets you specify that, "my morning commute is to leave my house, go to my car, start it and drive to the office, and park in the lot; OR go to the train station, use the train pass to pay fare, take the train for an hour, then* *walk 3 blocks north to my office; and in either case unlock the office door, and enter". Sapio has plans to integrate Miniscript as it stabilizes *[currently integrated]* as the backend key description language. Solidity and Sapio are more similar. Sapio's metaprogramming language is Turing Complete and allows you to specify a rich set of constraints for the contract you're building, but can only ever produce a finite deterministic "binary" of transactions. Solidity on the other hand compiles deterministically, but the executed binary is Turing Complete. Further, Sapio contracts are "stateless", whereas Solidity has mutable state. Lastly, Solidity contracts are non-isolated from one another in the EVM. In Sapio, contracts execute only with the components you specify in scope. In sum, Sapio has a rich descriptive power for smart contract programming flows but a limited and safe execution semantics.* The DSL v.s. e-DSL is a great question, and while there are surely benefits to being a full-fledged standalone DSL, here's why the e-DSL approach is superior for everything Sapio cares about. Sapio is built as a shallow e-DSL in Rust. Everything in Sapio can be expressed (macro free -- they're relatively light conveniences) as pure rust. There's also a more "API Like" interface where Sapio objects can be built out dynamically by other rust code easily. This means that if you wanted to come up with a custom DSL for a subset of Sapio programs, you could still very easily target Sapio (at runtime) as the AST processor. Said "SapioScript" can be an entirely separate crate targeting this base. This is also nice because it keeps the Sapio core codebase relatively tight compared to what would be required were Sapio to be a "full language", and have tens of thousands of lines of custom lexing, parsing, type systems, etc etc. Yes, you have to learn Rust, but Rust is one of the most popular languages for systems programming. It means that there's a library for almost any functionality you could want. There's tooling for building for any platform -- Sapio targets WASM happily, which helps with compile-once run sandboxed anywhere (this really helps for using Sapio as a replicated state machine for channels). I agree with you that Haskell was previously a limiting factor for BlueSpec, but Rust is not Haskell. Rust is incredibly popular, and easier to learn (author's opinion) than Haskell, C++, or even Python (perhaps biased, but I always struggled with python for more than simple scripts knowing when objects were copied or referenced -- an initial version of Sapio was in python, but that codebase collapsed under its own complexity... and I had *great* MyPy coverage). It's certainly easier to learn Rust than to learn Sapio as a DSL -- Sapio requires learning an entirely novel way of thinking about structuring programs already, a DSL would require learning both the "Sapio Programming Model" and "The Sapio DSL". With embedded Rust, you can transfer all existing knowledge on Rust programming and add a veneer of Sapio. Further Sapio is designed to not just compile to smart contracts as Bitcoin addresses, but be able to be deeply integrated inside of an application. For example, suppose you wanted to fetch keys for a contract from a database, query a network oracle for a state resolution, or something else. A DSL would scope creep infinitely or require numerous hacks, and at that point we're largely better off benefitting from the larger Rust community's efforts at providing excellent APIs for any task. Therefore to make Sapio functional for building and deploying real bitcoin smart contract applications, a rust eDSL was not just the natural, it was the only choice. W.r.t. to succinctness and "extra concepts" I'll admit that there is some disadvantage to Rust. There's a borrow checker -- which can be mostly defeated if you don't care about performance with Arc / Clone. You have to manually impl some traits -- but the trait system ends up not being bloat, but central to making contract state machines https://learn.sapio-lang.org/ch08-01-state-machines.html. And if you do actually look at the Sapio programs themselves, they are still quite succinct comparatively. I can imagine them being a bit shorter, but I think optimizing for the shortest possible utterance is an anti-goal for safety -- I aim for clarity. And that's where I don't exactly hate the borrow checker, since it makes it easier to tell when sub-contracts are using the same or modified data. Best, Jeremy [1] http://csg.csail.mit.edu/6.175/archive/2014/index.html -- @JeremyRubin On Fri, Apr 16, 2021 at 7:36 AM ZmnSCPxj wrote: > Good morning Jeremy, et al., > > > > Bitcoin Developers, > > > > I'm very excited to introduce Sapio[0] formally to you all. > > This seems quite interesting to me as well! > > I broadly agree with the rant on monetary units. > In C-Lightning we always (except for some legacy fields that will > eventually be removed) output values as strings with an explicit `msat` > unit, even for onchain values (the smallest of which are satoshi, but for > consistency we always print as millisatoshi), and accept explicit `btc`, > `sat`, and `msat` units. > > -- > > Personally I would have used a non-embedded DSL. > > In practice an embedded DSL requires a user to learn two languages --- the > hosting language and the embedded language. > Whereas if you designed a non-embedded DSL, a new user would have to learn > only one language. > For instance, if an error is emitted, then the user has to know whether > the error comes from the hosting language compiler, or the embedded > language implementation. > > In a past career embedded DSLs for hardware description languages were > being pushed, and we found that one of the drawbacks was the need to learn > as well the hosting language --- at some point Haskell-embedded DSLs became > so unpopular that anything that was even Haskell-related had a negative > reaction in some hardware design shops. > For example BlueSpec originally was a Haskell-embedded DSL, and eventually > implemented a Verilog-like syntax that was not embedded in Haskell, > becoming BlueSpecSystemVerilog. > > Further, as per coding theory, the hosting language is often quite generic > and can talk about anything, including other embedded languages, thus we > expect (all other things being equal) that in general, an utterance in an > embedded DSL will be longer than an utterance in a non-embedded DSL (as > there is more things to talk about, more symbols are necessary, and thus we > expect things to be longer in the generic hosting language). > Whereas a non-embedded DSL can cut away most of the extra verbage needed > to introduce to the hosting language implementation, in order to indicate > the "entry" into the domain-specific language. > > -- > > If my understanding is correct, I seem, that the hosting language is a > full, general, Turing-complete language, that "builds up" a total > (non-Turing-complete) contract description. > > I have had (private) speculations before that it would be possible to > design a language with two layers: > > * A non-Turing-complete total "base language". > * A syntax meta-language similar to Scheme `syntax-rules`, which > constructs ASTs for the "base language". > > Note that Scheme `syntax-rules` is indeed Turing-complete, as a macro can > expand to a form with two lists that form two "ends" of a tape, and act as > a Turing machine on that tape, thus Turing-equivalent. > It is not a general language as it lacks many basic practicalities, but as > pure computation, indeed it is possible to compute anything in that > language. > > The advantage of this scheme is that the meta-language is executed at > language compile time, and the developer can see (by observing the > compilation process) whether the meta-program halts or not. > However, the end-user executing the program is assured that the program, > delivered as a compiled binary, will indeed terminate, as the base language > is total and non-Turing-complete (i.e. the halting problem is trivial for > the base language --- all programs halt). > > I even have started designing a syntax scheme that adds in infix notation > and indent-sensitivity to a Lisp-like syntax, at the cost of disallowing > typical Lisp-like names like `pair?`, e.g. > > foo x = value (bar x) > where > bar x = x > > is equivalent to: > > (`=` (foo x) > (value (bar x) > (where > (`=` (bar x) x)))) > > I can provide more details if interested. > > Note that the base language is not embedded in the meta-language, as the > meta-language is effectively only capable of talking about how the > utterance in the base language is constructed --- the meta-language is not > quite general enough (i.e. the meta-language cannot implement "Hello > World"). > Thus coding theory should imply that this should lead to more succinct > utterances (in general). > From this point of view, language design is about striking a balance > between the low input bandwidth of neurotypical human brains (thus > compression is needed, i.e. the language encourages succinct programs) and > the limited processing power of neurotypical human brains (thus > decompression speed is needed, i.e. it should be obvious what something > expands to). > > > Regards, > ZmnSCPxj > --000000000000db1cda05c01aeb52 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi ZmnSCPxj,

Funny = you mention BlueSpec -- I actually took 6.175[1] in my undergraduate studie= s with Arvind, BlueSpec's creator, and have often cited it as an inspir= ation for Sapio given that the target of program compilation is essentially= a transaction circuit and I have a decent amount of experience working in = BlueSpec.

It's entirely the point that Sapio program expansion is= Turing complete whereas the resultant program is "fixed". Via up= datable finish clauses (which require some signature) it's also possibl= e to track the logic for what states should be generated upon cooperation o= f parties, making Sapio turing complete description language for federated = operators. I've previously described it as follows:

But ho= w is Sapio different from Miniscript?=20 From Solidity? A metaphor that I like is that tools like Miniscript=20 operate at the physical-key level (house keys, car keys, train pass,=20 office key). Sapio operates at the commute=20 level. Miniscript answers, "how do I unlock my car? How do I pay for t= he train? How do I get in my office?". Sapio lets you specify that, "my morning commute is to leave my house, go t= o=20 my car, start it and drive to the office, and park in the lot; OR go to=20 the train station, use the train pass to pay fare, take the train for=20 an hour, then walk 3 blocks north to my=C2=A0 office; and in either = case unlock the office door, and enter". Sapio has plans = to integrate Miniscript as it stabilizes [currently integrated] as t= he backend key description language. Solidity and Sapio are mo= re similar. Sapio's metaprogramming language is Turi= ng Complete and allows you to specify a rich set of constraints for the contract=20 you're building, but can only ever produce a finite deterministic=20 "binary" of transactions. Solidity on the other hand compiles=20 deterministically, but the executed binary is Turing Com= plete. Further, Sapio contracts are "stateless", whereas Solidity has mutable state. L= astly,=20 Solidity contracts are non-isolated from one another in the EVM. In S= apio, contracts execute only with the components you specify in scop= e. In sum, Sapio has a rich descriptive power for smart contra= ct programming flows but a limited and safe execution semantics.
<= div class=3D"gmail_default" style=3D"font-family:arial,helvetica,sans-serif= ;font-size:small;color:#000000">
=
The DSL v.s. e-DSL is a great qu= estion, and while there are surely benefits to being a full-fledged standal= one DSL, here's why the e-DSL approach is superior for everything Sapio= cares about.

Sapio is built as a shallow e-DSL in Rust. Everything i= n Sapio can be expressed (macro free -- they're relatively light conven= iences) as pure rust. There's also a more "API Like" interfac= e where Sapio objects can be built out dynamically by other rust code easil= y. This means that if you wanted to come up with a custom DSL for a subset = of Sapio programs, you could still very easily target Sapio (at runtime) as= the AST processor. Said "SapioScript" can be an entirely separat= e crate targeting this base.
<= br>
This is also nice because it keep= s the Sapio core codebase relatively tight compared to what would be requir= ed were Sapio to be a "full language", and have tens of thousands= of lines of custom lexing, parsing, type systems, etc etc.

Yes, = you have to learn Rust, but Rust is one of the most popular languages for s= ystems programming. It means that there's a library for almost any func= tionality you could want. There's tooling for building for any platform= -- Sapio targets WASM happily, which helps with compile-once run sandboxed= anywhere (this really helps for using Sapio as a replicated state machine = for channels).

I agree with you that Haskell was previously a limit= ing factor for BlueSpec, but Rust is not Haskell. Rust is incredibly popula= r, and easier to learn (author's opinion) than Haskell, C++, or even Py= thon (perhaps biased, but I always struggled with python for more than simp= le scripts knowing when objects were copied or referenced -- an initial ver= sion of Sapio was in python, but that codebase collapsed under its own comp= lexity... and I had *great* MyPy coverage). It's certainly easier to le= arn Rust than to learn Sapio as a DSL -- Sapio requires learning an entirel= y novel way of thinking about structuring programs already, a DSL would req= uire learning both the "Sapio Programming Model" and "The Sa= pio DSL". With embedded Rust, you can transfer all existing knowledge = on Rust programming and add a veneer of Sapio.

Further Sapio is d= esigned to not just compile to smart contracts as Bitcoin addresses, but be= able to be deeply integrated inside of an application. For example, suppos= e you wanted to fetch keys for a contract from a database, query a network = oracle for a state resolution, or something else. A DSL would scope creep i= nfinitely or require numerous hacks, and at that point we're largely be= tter off benefitting from the larger Rust community's efforts at provid= ing excellent APIs for any task.

=
Therefore to make Sapio functional f= or building and deploying real bitcoin smart contract applications, a rust = eDSL was not just the natural, it was the only choice.

W.r.t. to succ= inctness and "extra concepts" I'll admit that there is some d= isadvantage to Rust. There's a borrow checker -- which can be mostly de= feated if you don't care about performance with Arc / Clone. You have t= o manually impl some traits -- but the trait system ends up not being bloat= , but central to making contract state machines https://learn.sapio-lang.org/ch08= -01-state-machines.html. And if you do actually look at the Sapio progr= ams themselves, they are still quite succinct comparatively. I can imagine = them being a bit shorter, but I think optimizing for the shortest possible = utterance is an anti-goal for safety -- I aim for clarity. And that's w= here I don't exactly hate the borrow checker, since it makes it easier = to tell when sub-contracts are using the same or modified data.

B= est,

Jeremy



On Fri, Apr 16, 2021 at= 7:36 AM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
Good morning Jeremy, et al.,


> Bitcoin Developers,
>
> I'm very excited to introduce Sapio[0] formally to you all.

This seems quite interesting to me as well!

I broadly agree with the rant on monetary units.
In C-Lightning we always (except for some legacy fields that will eventuall= y be removed) output values as strings with an explicit `msat` unit, even f= or onchain values (the smallest of which are satoshi, but for consistency w= e always print as millisatoshi), and accept explicit `btc`, `sat`, and `msa= t` units.

--

Personally I would have used a non-embedded DSL.

In practice an embedded DSL requires a user to learn two languages --- the = hosting language and the embedded language.
Whereas if you designed a non-embedded DSL, a new user would have to learn = only one language.
For instance, if an error is emitted, then the user has to know whether the= error comes from the hosting language compiler, or the embedded language i= mplementation.

In a past career embedded DSLs for hardware description languages were bein= g pushed, and we found that one of the drawbacks was the need to learn as w= ell the hosting language --- at some point Haskell-embedded DSLs became so = unpopular that anything that was even Haskell-related had a negative reacti= on in some hardware design shops.
For example BlueSpec originally was a Haskell-embedded DSL, and eventually = implemented a Verilog-like syntax that was not embedded in Haskell, becomin= g BlueSpecSystemVerilog.

Further, as per coding theory, the hosting language is often quite generic = and can talk about anything, including other embedded languages, thus we ex= pect (all other things being equal) that in general, an utterance in an emb= edded DSL will be longer than an utterance in a non-embedded DSL (as there = is more things to talk about, more symbols are necessary, and thus we expec= t things to be longer in the generic hosting language).
Whereas a non-embedded DSL can cut away most of the extra verbage needed to= introduce to the hosting language implementation, in order to indicate the= "entry" into the domain-specific language.

--

If my understanding is correct, I seem, that the hosting language is a full= , general, Turing-complete language, that "builds up" a total (no= n-Turing-complete) contract description.

I have had (private) speculations before that it would be possible to desig= n a language with two layers:

* A non-Turing-complete total "base language".
* A syntax meta-language similar to Scheme `syntax-rules`, which constructs= ASTs for the "base language".

Note that Scheme `syntax-rules` is indeed Turing-complete, as a macro can e= xpand to a form with two lists that form two "ends" of a tape, an= d act as a Turing machine on that tape, thus Turing-equivalent.
It is not a general language as it lacks many basic practicalities, but as = pure computation, indeed it is possible to compute anything in that languag= e.

The advantage of this scheme is that the meta-language is executed at langu= age compile time, and the developer can see (by observing the compilation p= rocess) whether the meta-program halts or not.
However, the end-user executing the program is assured that the program, de= livered as a compiled binary, will indeed terminate, as the base language i= s total and non-Turing-complete (i.e. the halting problem is trivial for th= e base language --- all programs halt).

I even have started designing a syntax scheme that adds in infix notation a= nd indent-sensitivity to a Lisp-like syntax, at the cost of disallowing typ= ical Lisp-like names like `pair?`, e.g.

=C2=A0 =C2=A0 foo x =3D value (bar x)
=C2=A0 =C2=A0 =C2=A0 where
=C2=A0 =C2=A0 =C2=A0 =C2=A0 bar x =3D x

is equivalent to:

=C2=A0 =C2=A0 (`=3D` (foo x)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(value (bar x)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (where
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (`=3D` (bar = x) x))))

I can provide more details if interested.

Note that the base language is not embedded in the meta-language, as the me= ta-language is effectively only capable of talking about how the utterance = in the base language is constructed --- the meta-language is not quite gene= ral enough (i.e. the meta-language cannot implement "Hello World"= ).
Thus coding theory should imply that this should lead to more succinct utte= rances (in general).
From this point of view, language design is about striking a balance betwee= n the low input bandwidth of neurotypical human brains (thus compression is= needed, i.e. the language encourages succinct programs) and the limited pr= ocessing power of neurotypical human brains (thus decompression speed is ne= eded, i.e. it should be obvious what something expands to).


Regards,
ZmnSCPxj
--000000000000db1cda05c01aeb52--