summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLuke Kenneth Casson Leighton <lkcl@lkcl.net>2021-02-03 13:24:13 +0000
committerbitcoindev <bitcoindev@gnusha.org>2021-02-03 13:24:40 +0000
commit11e02f727f534dbd800d91fd68f22c4a8b6a5db3 (patch)
tree42d99d3a3696bc4c4f0171ddac3eabff01295e30
parent7074ee4a79e89e6df375aadc89d89d08cc001633 (diff)
downloadpi-bitcoindev-11e02f727f534dbd800d91fd68f22c4a8b6a5db3.tar.gz
pi-bitcoindev-11e02f727f534dbd800d91fd68f22c4a8b6a5db3.zip
Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs
-rw-r--r--98/2cbc869992a253427d828be46cc31ee073b10e364
1 files changed, 364 insertions, 0 deletions
diff --git a/98/2cbc869992a253427d828be46cc31ee073b10e b/98/2cbc869992a253427d828be46cc31ee073b10e
new file mode 100644
index 000000000..004344e72
--- /dev/null
+++ b/98/2cbc869992a253427d828be46cc31ee073b10e
@@ -0,0 +1,364 @@
+Return-Path: <lkcl@lkcl.net>
+Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133])
+ by lists.linuxfoundation.org (Postfix) with ESMTP id 5CE92C013A
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 3 Feb 2021 13:24:40 +0000 (UTC)
+Received: from localhost (localhost [127.0.0.1])
+ by hemlock.osuosl.org (Postfix) with ESMTP id 56C3B8710A
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 3 Feb 2021 13:24:40 +0000 (UTC)
+X-Virus-Scanned: amavisd-new at osuosl.org
+Received: from hemlock.osuosl.org ([127.0.0.1])
+ by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id EmdT4pluxN5h
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 3 Feb 2021 13:24:38 +0000 (UTC)
+X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
+Received: from lkcl.net (lkcl.net [217.147.94.29])
+ by hemlock.osuosl.org (Postfix) with ESMTPS id 75C8D86FAD
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 3 Feb 2021 13:24:38 +0000 (UTC)
+DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lkcl.net;
+ s=201607131;
+ h=Content-Type:Cc:To:Subject:Message-ID:Date:From:References:In-Reply-To:MIME-Version;
+ bh=L3Docrkp8d32XEDGlVFtrKbkuYL/aC2gDKYaa1pep20=;
+ b=FBKppL1oitsFTHTq7I4FuSuD1iDkcOK/IwTSsUq8Fx+OHjbczmGhZcCTi0bFxqIFU0+7Ssc6cWTk0VMkUBWE1mpy1d1ETsoTLBV/WlGRsLZBO5wysHvs61HAkuM5mkgKHtFb71Q19Q2sxLenfHK1quJse+JbYXcKGWj9AtKncm4=;
+Received: from mail-lf1-f49.google.com ([209.85.167.49])
+ by lkcl.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
+ (Exim 4.84_2) (envelope-from <lkcl@lkcl.net>) id 1l7I95-0003Yt-G2
+ for bitcoin-dev@lists.linuxfoundation.org; Wed, 03 Feb 2021 13:24:35 +0000
+Received: by mail-lf1-f49.google.com with SMTP id e15so13113352lft.13
+ for <bitcoin-dev@lists.linuxfoundation.org>;
+ Wed, 03 Feb 2021 05:24:20 -0800 (PST)
+X-Gm-Message-State: AOAM530EJvrii88N5X+QMNIrq7VRp3g5X2atxxXegflftB2bYrPcG2QH
+ NWKBp8nK09zQuVKNIJ1lYd5BlnzCLDSPcm9XjTk=
+X-Google-Smtp-Source: ABdhPJzBT7lnAR7ONWQgH8zXdhnCMEkGHOeU8qWqBQulyPMw/qpNHKw6mdC0vsEmJuBvTHUTeuJZX1dEGTnmlggAP0M=
+X-Received: by 2002:a19:4191:: with SMTP id o139mr1773310lfa.224.1612358654527;
+ Wed, 03 Feb 2021 05:24:14 -0800 (PST)
+MIME-Version: 1.0
+Received: by 2002:a05:6520:2f95:b029:bc:bc2b:60bb with HTTP; Wed, 3 Feb 2021
+ 05:24:13 -0800 (PST)
+In-Reply-To: <oCNGbVElAQCJ1bEmwLXLzIVec0ZoOA2Ar3vkOc1a0GW12h78bhMi_W4n3pCdDt7hJyPFoMRb0U1T5Wx5uQl4oo6zeQtjKs0MdAXGtvLw1SQ=@protonmail.com>
+References: <CAPweEDx4wH_PG8=wqLgM_+RfTQEUSGfax=SOkgTZhe1FagXF9g@mail.gmail.com>
+ <oCNGbVElAQCJ1bEmwLXLzIVec0ZoOA2Ar3vkOc1a0GW12h78bhMi_W4n3pCdDt7hJyPFoMRb0U1T5Wx5uQl4oo6zeQtjKs0MdAXGtvLw1SQ=@protonmail.com>
+From: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
+Date: Wed, 3 Feb 2021 13:24:13 +0000
+X-Gmail-Original-Message-ID: <CAPweEDy7Xf3nD1mfyX5MmtsGX=1sd5=gsLosZ=bYavJ0BZyy3g@mail.gmail.com>
+Message-ID: <CAPweEDy7Xf3nD1mfyX5MmtsGX=1sd5=gsLosZ=bYavJ0BZyy3g@mail.gmail.com>
+To: ZmnSCPxj <ZmnSCPxj@protonmail.com>
+Content-Type: multipart/alternative; boundary="000000000000da8d8405ba6e8095"
+X-Mailman-Approved-At: Wed, 03 Feb 2021 14:14:10 +0000
+Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
+Subject: Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs
+X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
+X-Mailman-Version: 2.1.15
+Precedence: list
+List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
+List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
+ <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
+List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
+List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
+List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
+List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
+ <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
+X-List-Received-Date: Wed, 03 Feb 2021 13:24:40 -0000
+
+--000000000000da8d8405ba6e8095
+Content-Type: text/plain; charset="UTF-8"
+
+(hi folks do cc me, i am subscribed digest, thank you for doing that,
+ZmnSCPxj)
+
+On Wednesday, February 3, 2021, ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
+> Good morning Luke,
+>
+> I happen to have experience designing digital ASICs, mostly pipelined
+data processing.
+> However my experience is limited to larger geometries and in
+SystemVerilog.
+
+larger geometries for a hardware wallet ASIC is ok (as long as it is not
+retail based and trying to run e.g. RSA, taking so long to complete that
+the retail customer walks out)
+
+> On the technical side, as I understand it (I have been out of that
+industry for 4 years now, so my knowledge may be obsolete)
+
+not at all! still very valuable
+
+> as you approach lower geometries, you also start approaching analog
+design.
+
+yyeah i could intuitively tell/guess there might be something like this
+which would throw a spanner in the works, it is why the grant request i put
+in specifically excluded data-dependent constant time analysis and also
+power analysis.
+
+
+> In our case we were already manually laying out gates and flip-flops (or
+replacing flip-flops with level-triggered latches and being extra careful
+with clocks) to squeeze performance (and area) ...
+
+ya-howw :)
+
+
+> Many of the formal correctness proofs were really about the formal
+equivalence of the netlist to the RTL; the correctness of the RTL was
+"proved" by simulation testing.
+
+thanks to Symbiyosys we are using formal proofs much more extensively, as
+effectively a 100% coverage replacement for unit tests.
+
+an example is popcount. we did two versions. one is a recursive tree
+algorithm, almost impossible to read and understand what the hell it does.
+
+the other is a total braindead 1-liner "x = x + input[i]", rubbish
+performance though.
+
+running a formal proof on these gave us 100% confidence that the complex
+optimised version does the damn job.
+
+
+yes we still do unit tests, these are more "demo code".
+
+now, the caveat is that you have to have a model of the "dut" (device under
+test) against which to compare, and if the dut is ridiculously complex then
+the formal model variant, which has to do the same job, ends up equally as
+complex (or effectively a duplicate of the dut) and the exercise is a bit
+of a waste of time...
+
+...*unless*... there happens to be other implementations out there. then
+the proof can be run against those and everybody wins through collaboration.
+
+
+
+now, here's why i put in the NLnet Grant request to explore going back to
+the mathematics of crypto-primitives.
+
+many ISAs e.g. intel AVX2 have added GFMULT8 etc etc because that does
+S-Boxes for Rijndael. they have gone mad by analysing algorithms trying to
+fit them to standard ISAs.
+
+nobody does Rijndael S-Boxes any way other than 256-entry lookup tables
+because no standard ISA has general-purpose Galois Field Multiply.
+
+consequently implementations in assembler get completely divorced from the
+original mathematics on which the cryptographic algorithm was based.
+
+the approach i would like to take is, "hang on a minute: how far would you
+get if you actually added *general-purpose* instructions that *directly*
+provided the underlying mathematical principles, and then wrapped a
+Vector-Matrix Engine around them?".
+
+would this drastically simplify algorithms to the point where *READABLE* c
+code compiles directly to opcodes that run screamingly fast, outperforming
+hand-optimised SIMD code using standard ISAs?
+
+then, given the Formal Correctness approach above, can we verify that the
+mathematically-related opcodes do the job?
+
+
+> (to be fair, there were tools to force you to improve coverage by
+injecting faults to your RTL, e.g. it would virtually flip an `&&` to an
+`||` and if none of your tests signaled an error it would complain that
+your test coverage sucked.)
+
+nice!
+
+> Things might have changed.
+
+nah. this is such a complex area, run by few incumbent players, that
+innovation is rare. not least, innovation is different and cannot be
+trusted by the Foundries!
+
+
+> A good RTL would embed SystemVerilog Assertions or PSL Assertions as well.
+> Some formal verification tools can understand a subset of SystemVerilog
+Assertions / PSL assertions and validate that your RTL conformed to the
+assertions, which would probably help cut down on the need for RTL
+simulation.
+
+interesting.
+
+> Overall, my understanding is that smaller geometries are needed only if
+you want to target a really high performance / unit cost and performance /
+energy consumption ratios.
+> That is, you would target smaller geometries for mining.
+
+yes.
+
+> If you need a secure tr\*sted computing module that does not need to be
+fast or cheap, just very accurate to the required specification, the larger
+geometries should be fine and you would be able to live almost entirely in
+RTL-land without diving into netlist and layout specifications.
+
+hardware wallet ASICs.
+
+i concur.
+
+> A wrinkle here is that licenses for tools from tr\*sted vendors like
+Synopsys or Cadence are ***expensive***.
+
+yes they are :) we are currently working with Sorbonne University LIP6.fr
+and Staf Verhaegen from Chips4Makers, trying a different approach:
+coriolis2.
+
+this will do fine up to 130nm (skywater). beyond that, mmm, we need a few
+more years.
+
+> What is more, you should really buy two sets of licenses, e.g. do logic
+synthesis with Synopsys and then formal verification with Cadence, because
+you do not want to fully tr\*st just one vendor.
+
+interesting, good advice.
+
+> Synthesis in particular is a black box and each vendor keeps their
+particular implementations and tricks secret.
+
+sigh. i think that's partly because they have to insert diodes, and
+buffers, and generally mess with the netlist.
+
+i was stunned to learn that in a 28nm ASIC, 50% of it is repeater-buffers!
+
+plus, they make an awful lot of money, it is good business.
+
+> Pointing some funding at the open-source Icarus Verilog might also fit,
+as it lost its ability to do synthesis more than a decade ago due to
+inability to maintain.
+
+ah i didn't know it could do synthesis at all! i thought it was simulation
+only.
+
+> Note as well that I heard (at the time when I was in the industry) that
+some foundries will not even accept a netlist unless it was created by a
+synthesis tool from one of the major vendors (Synopsys, Cadence, Mentor
+Graphics, maybe more I have forgotten since).
+
+yes i heard this too, they don't want their time wasted: after all they
+only make money by selling wafers, and if they can't sell any they have to
+run empty wafers to keep the equipment at operating temperature.
+
+if you book a slot 18 months in advance and the RTL doesn't work during
+testing 3 months before the deadline they may not be able to find someone
+else in time.
+
+anything to reduce the risk there is good, so i totally get why.
+
+thank you for the insights and the discussion, really appreciated.
+
+l.
+
+
+--
+---
+crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
+
+--000000000000da8d8405ba6e8095
+Content-Type: text/html; charset="UTF-8"
+Content-Transfer-Encoding: quoted-printable
+
+(hi folks do cc me, i am subscribed digest, thank you for doing that, ZmnSC=
+Pxj)<br><br>On Wednesday, February 3, 2021, ZmnSCPxj &lt;<a href=3D"mailto:=
+ZmnSCPxj@protonmail.com">ZmnSCPxj@protonmail.com</a>&gt; wrote:<br>&gt; Goo=
+d morning Luke,<br>&gt;<br>&gt; I happen to have experience designing digit=
+al ASICs, mostly pipelined data processing.<br>&gt; However my experience i=
+s limited to larger geometries and in SystemVerilog.<br><br>larger geometri=
+es for a hardware wallet ASIC is ok (as long as it is not retail based and =
+trying to run e.g. RSA, taking so long to complete that the retail customer=
+ walks out)<br><br>&gt; On the technical side, as I understand it (I have b=
+een out of that industry for 4 years now, so my knowledge may be obsolete)<=
+br><br>not at all! still very valuable<br><br>&gt; as you approach lower ge=
+ometries, you also start approaching analog design.<br><br>yyeah i could in=
+tuitively tell/guess there might be something like this which would throw a=
+ spanner in the works, it is why the grant request i put in specifically ex=
+cluded data-dependent constant time analysis and also power analysis.<br><b=
+r><br>&gt; In our case we were already manually laying out gates and flip-f=
+lops (or replacing flip-flops with level-triggered latches and being extra =
+careful with clocks) to squeeze performance (and area) ...<br><br>ya-howw :=
+)<br><br><br>&gt; Many of the formal correctness proofs were really about t=
+he formal equivalence of the netlist to the RTL; the correctness of the RTL=
+ was &quot;proved&quot; by simulation testing.<br><br>thanks to Symbiyosys =
+we are using formal proofs much more extensively, as effectively a 100% cov=
+erage replacement for unit tests.<br><br>an example is popcount. =C2=A0we d=
+id two versions. =C2=A0one is a recursive tree algorithm, almost impossible=
+ to read and understand what the hell it does.<br><br>the other is a total =
+braindead 1-liner &quot;x =3D x + input[i]&quot;, rubbish performance thoug=
+h.<br><br>running a formal proof on these gave us 100% confidence that the =
+complex optimised version does the damn job.<br><br><br>yes we still do uni=
+t tests, these are more &quot;demo code&quot;.<br><br>now, the caveat is th=
+at you have to have a model of the &quot;dut&quot; (device under test) agai=
+nst which to compare, and if the dut is ridiculously complex then the forma=
+l model variant, which has to do the same job, ends up equally as complex (=
+or effectively a duplicate of the dut) and the exercise is a bit of a waste=
+ of time...<br><br>...*unless*... there happens to be other implementations=
+ out there. =C2=A0then the proof can be run against those and everybody win=
+s through collaboration.<br><br><br><br>now, here&#39;s why i put in the NL=
+net Grant request to explore going back to the mathematics of crypto-primit=
+ives.<br><br>many ISAs e.g. intel AVX2 have added GFMULT8 etc etc because t=
+hat does S-Boxes for Rijndael. =C2=A0they have gone mad by analysing algori=
+thms trying to fit them to standard ISAs.<br><br>nobody does Rijndael S-Box=
+es any way other than 256-entry lookup tables because no standard ISA has g=
+eneral-purpose Galois Field Multiply.<br><br>consequently implementations i=
+n assembler get completely divorced from the original mathematics on which =
+the cryptographic algorithm was based.<br><br>the approach i would like to =
+take is, &quot;hang on a minute: how far would you get if you actually adde=
+d *general-purpose* instructions that *directly* provided the underlying ma=
+thematical principles, and then wrapped a Vector-Matrix Engine around them?=
+&quot;.<br><br>would this drastically simplify algorithms to the point wher=
+e *READABLE* c code compiles directly to opcodes that run screamingly fast,=
+ outperforming hand-optimised SIMD code using standard ISAs? <br><br>then, =
+given the Formal Correctness approach above, can we verify that the mathema=
+tically-related opcodes do the job?<br><br><br>&gt; (to be fair, there were=
+ tools to force you to improve coverage by injecting faults to your RTL, e.=
+g. it would virtually flip an `&amp;&amp;` to an `||` and if none of your t=
+ests signaled an error it would complain that your test coverage sucked.)<b=
+r><br>nice!<br><br>&gt; Things might have changed.<br><br>nah. =C2=A0this i=
+s such a complex area, run by few incumbent players, that innovation is rar=
+e. =C2=A0not least, innovation is different and cannot be trusted by the Fo=
+undries!<br><br><br>&gt; A good RTL would embed SystemVerilog Assertions or=
+ PSL Assertions as well.<br>&gt; Some formal verification tools can underst=
+and a subset of SystemVerilog Assertions / PSL assertions and validate that=
+ your RTL conformed to the assertions, which would probably help cut down o=
+n the need for RTL simulation.<br><br>interesting.<br><br>&gt; Overall, my =
+understanding is that smaller geometries are needed only if you want to tar=
+get a really high performance / unit cost and performance / energy consumpt=
+ion ratios.<br>&gt; That is, you would target smaller geometries for mining=
+.<br><br>yes.<br><br>&gt; If you need a secure tr\*sted computing module th=
+at does not need to be fast or cheap, just very accurate to the required sp=
+ecification, the larger geometries should be fine and you would be able to =
+live almost entirely in RTL-land without diving into netlist and layout spe=
+cifications.<br><br>hardware wallet ASICs.<br><br>i concur.<br><br>&gt; A w=
+rinkle here is that licenses for tools from tr\*sted vendors like Synopsys =
+or Cadence are ***expensive***.<br><br>yes they are :) =C2=A0we are current=
+ly working with Sorbonne University LIP6.fr and Staf Verhaegen from Chips4M=
+akers, trying a different approach: coriolis2.<br><br>this will do fine up =
+to 130nm (skywater). =C2=A0beyond that, mmm, we need a few more years.<br><=
+br>&gt; What is more, you should really buy two sets of licenses, e.g. do l=
+ogic synthesis with Synopsys and then formal verification with Cadence, bec=
+ause you do not want to fully tr\*st just one vendor.<br><br>interesting, g=
+ood advice.<br><br>&gt; Synthesis in particular is a black box and each ven=
+dor keeps their particular implementations and tricks secret.<br><br>sigh. =
+=C2=A0i think that&#39;s partly because they have to insert diodes, and buf=
+fers, and generally mess with the netlist.<br><br>i was stunned to learn th=
+at in a 28nm ASIC, 50% of it is repeater-buffers!<br><br>plus, they make an=
+ awful lot of money, it is good business.<br><br>&gt; Pointing some funding=
+ at the open-source Icarus Verilog might also fit, as it lost its ability t=
+o do synthesis more than a decade ago due to inability to maintain.<br><br>=
+ah i didn&#39;t know it could do synthesis at all! i thought it was simulat=
+ion only.<br><br>&gt; Note as well that I heard (at the time when I was in =
+the industry) that some foundries will not even accept a netlist unless it =
+was created by a synthesis tool from one of the major vendors (Synopsys, Ca=
+dence, Mentor Graphics, maybe more I have forgotten since).<br><br>yes i he=
+ard this too, they don&#39;t want their time wasted: after all they only ma=
+ke money by selling wafers, and if they can&#39;t sell any they have to run=
+ empty wafers to keep the equipment at operating temperature.<br><br>if you=
+ book a slot 18 months in advance and the RTL doesn&#39;t work during testi=
+ng 3 months before the deadline they may not be able to find someone else i=
+n time.<br><br>anything to reduce the risk there is good, so i totally get =
+why.<br><br>thank you for the insights and the discussion, really appreciat=
+ed.<br><br>l.<br><br><br>-- <br>---<br>crowd-funded eco-conscious hardware:=
+ <a href=3D"https://www.crowdsupply.com/eoma68" target=3D"_blank">https://w=
+ww.crowdsupply.com/eoma68</a><br><br>
+
+--000000000000da8d8405ba6e8095--
+