From: Natanael
Date: Tue, 18 Apr 2017 12:34:04 +0200
To: Erik Aronesty
Cc: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Subject: [bitcoin-dev] Properties of an ideal PoW algorithm & implementation

To expand on this below;

On 18 Apr. 2017 00:34, "Natanael" wrote:

IMHO the best option, if we change PoW, is an algorithm that is moderately processing heavy (we still need reasonably fast verification) and that resists partial state reuse (i.e. not fast or fully "linear" in processing the way SHA256 is), just for the sake of invalidating ASICBOOST-style attacks. It should also have an existing reference implementation for hardware that is provably close in performance to the theoretical ideal implementation of the algorithm (in other words, one where we know there are no hidden optimizations).

[...] The competition would mostly be about packing similar gate designs closely, and about energy efficiency. (Now that I think about it, the proof MAY have to consider energy use too, since a larger and slower but more efficient chip is still competitive in mining...)

What matters to miners in terms of cost is primarily (correctly computed) hashes per joule (watt-seconds).
The most direct proxy for this in terms of algorithm execution is the number of transistor (gate) activations per computed hash (PoW unit).

To prove that an implementation is near optimal, you would show that there is a minimum number of necessary transistor activations per computed hash, and that your implementation is within a reasonable range of that number.

We also need to show that a practical implementation can't reuse much internal state (the easiest way is "whitening" the block header, either by pre-hashing it or by using a slow hash with an initial whitening step of its own). This is to kill any ASICBOOST-type optimization. Performance should be constant, not linear relative to input size.

The PoW step should always be the most expensive part of creating a complete block candidate! Otherwise it loses part of its meaning. It should, however, still be reasonably easy to verify.

Given that PoW ASIC optimizations have existed for years that use deliberately lossy hash computation, simply because those circuits can run faster (X% of hashes are computed wrong, but you get Y% more computed hashes in return, which more than makes up for the error rate), any proof that an implementation is near optimal (for mining) must also consider implementations that deliberately allow errors just to reduce the total count of transistor activations per N computed hashes. Yes, that means the reference implementation is allowed to be lossy.

So for a reasonably large N (the number of computed hashes, to take batch processing into consideration), the proof would show that there is a minimum average number of gate activations per correctly computed hash: a smallest ratio = X gate activations / (N * success rate), across all possible implementations of the algorithm. And you'd show that your implementation is close to that ratio.
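The lossy-miner tradeoff above can be sketched numerically. The gate counts and error rates below are invented purely for illustration, not measurements of any real chip:

```python
def cost_ratio(gate_activations_per_attempt, n_attempts, success_rate):
    """Average gate activations per CORRECTLY computed hash:
    X / (N * success_rate), the ratio described above."""
    total_activations = gate_activations_per_attempt * n_attempts
    correct_hashes = n_attempts * success_rate
    return total_activations / correct_hashes

# A "lossless" design: more gates per attempt, every hash correct.
lossless = cost_ratio(gate_activations_per_attempt=1000,
                      n_attempts=10**6, success_rate=1.0)

# A "lossy" design: fewer gates per attempt, but 2% of hashes are wrong.
lossy = cost_ratio(gate_activations_per_attempt=900,
                   n_attempts=10**6, success_rate=0.98)

print(lossless)  # 1000.0 gate activations per correct hash
print(lossy)     # ~918.4 -> the lossy design still wins on this metric
```

The point is that a proof of near-optimality has to minimize this ratio over lossy designs too, since a design that wastes a few hashes can still come out cheaper per correct hash.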
It would also have to consider a reasonable range of time-memory tradeoffs, including the potential of precomputation. Hopefully we could implement an algorithm that effectively makes such precomputation meaningless, by making the potential gain insignificant for any reasonable ASIC chip size and amount of precomputation resources.

A summary of important mining PoW algorithm properties;

* Constant verification speed, reasonably fast even on slow hardware
* As explained above, still slow / expensive enough to dominate the costs of block candidate creation
* Difficulty must be easy to adjust (no problem for simple hash-style algorithms like today's)
* Cryptographic strength, something like preimage resistance (the algorithm can't allow forcing a particular output; the chance must not be better than random within any achievable computational bounds)
* As explained above, no hidden shortcuts. Everybody has equal knowledge.
* Predictable and close to constant PoW computation performance, and not linear in performance relative to input size the way SHA256 is (lossy implementations will always make it not-quite-constant)
* As explained above, no significant reusable state or other reusable work (killing ASICBOOST)
* As explained above, no meaningful precomputation possible. No unfair headstarts.
* Should rely only on transistors for implementation; shouldn't rely on memory or other components, due to unknowable future engineering results and changes in cost
* Reasonably compact implementation, measured in memory use, CPU load and similar metrics
* Reasonably small inputs and outputs (in line with regular hashes)
* All mining PoW should be "embarrassingly parallel" (highly parallelizable) with minimal or no gain from batch computation; performance scaling should be linear with increased chip size & cycle speed.

What else is there? Did I miss anything important?
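One way to picture the header-whitening idea mentioned above: pre-hash the whole block header (nonce included) before the PoW hash, so that every nonce change scrambles the entire input to the outer hash and no midstate can be shared between attempts. This is a toy sketch only; SHA-256 stands in for both steps, and the field layout is invented:

```python
import hashlib

def whitened_pow_hash(header: bytes, nonce: int) -> bytes:
    """Toy sketch of header whitening. The header plus nonce is fully
    pre-hashed first, so the outer PoW hash always sees a completely
    different fixed-size input for each nonce, and no internal state
    can be reused across attempts. SHA-256 is only a stand-in."""
    # Fold the nonce into the header, then whiten the whole thing.
    whitened = hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
    # The (stand-in) PoW hash runs over the whitened input.
    return hashlib.sha256(whitened).digest()

# Flipping one bit of the nonce changes the entire whitened input:
h1 = whitened_pow_hash(b"example-header", 0)
h2 = whitened_pow_hash(b"example-header", 1)
print(h1 != h2)  # True
```

Contrast this with SHA256d over a raw 80-byte header, where the midstate of the first compression over the nonce-free prefix can be computed once and reused, which is exactly the kind of shortcut this property is meant to rule out.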