Date: Sat, 29 Jun 2024 13:40:39 -0700 (PDT)
From: Eric Voskuil <eric@voskuil.org>
To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
Message-Id: <3dceca4d-03a8-44f3-be64-396702247fadn@googlegroups.com>
In-Reply-To: <607a2233-ac12-4a80-ae4a-08341b3549b3n@googlegroups.com>
Subject: Re: [bitcoindev] Re: Great Consensus Cleanup Revival
Caching identity in the case of invalidity is a more interesting question
than it might seem.

Background: A fully-validated block has established identity in its block
hash. However, an invalid block message may include the same block header,
producing the same hash, but with any kind of nonsense following the
header. The purpose of the transaction and witness commitments is of course
to establish this identity, so these two checks are therefore necessary
even under checkpoint/milestone. And then of course the two Merkle tree
issues complicate the tx commitment (the integrity of the witness
commitment is assured by that of the tx commitment).

So what does it mean to speak of a block hash derived from:

(1) a block message with an unparseable header?
(2) a block message with parseable but invalid header?
(3) a block message with valid header but unparseable tx data?
(4) a block message with valid header but parseable invalid uncommitted tx
data?
(5) a block message with valid header but parseable invalid malleated
committed tx data?
(6) a block message with valid header but parseable invalid unmalleated
committed tx data?
(7) a block message with valid header but uncommitted valid tx data?
(8) a block message with valid header but malleated committed valid tx data?
(9) a block message with valid header but unmalleated committed valid tx
data?

Note that only the #9 p2p block message contains an actual Bitcoin block;
the others are bogus messages. In all cases the message can be sha256
hashed to establish the identity of the *message*. And if one's objective
is to reject repeating bogus messages, this might be a useful strategy.
It's already part of the p2p protocol, is orders of magnitude cheaper to
produce than a Merkle root, and has no identity issues.
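
To make the message-identity point concrete, here is a minimal Python sketch.
It relies on the fact that the p2p message header carries a checksum equal to
the first four bytes of the double SHA-256 of the payload. The function names
are hypothetical and the "identity" helper is only an illustration of the
idea, not something any implementation is known to do.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Return SHA256(SHA256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def message_checksum(payload: bytes) -> bytes:
    """First four bytes of the double SHA-256 of the payload,
    as carried in the p2p message header."""
    return double_sha256(payload)[:4]


def message_identity(payload: bytes) -> bytes:
    """Hypothetical identifier for the *message* (not the block).
    It is defined for any byte string, parseable or not."""
    return double_sha256(payload)


# Two bogus payloads sharing the same 80-byte header (all zeros here, purely
# for illustration) would share a block hash, yet differ as messages.
header = bytes(80)
bogus_a = header + b"\x01" + b"nonsense"
bogus_b = header + b"\x01" + b"other nonsense"

print(message_identity(bogus_a) != message_identity(bogus_b))
print(message_checksum(b"").hex())
```

The two payloads hash differently even though a block hash computed over the
shared header would be identical, which is the sense in which the message hash
"has no identity issues".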
The concept of the Bitcoin block hash as a unique identifier for invalid p2p
block messages is problematic. Apart from the malleation question, what is
the Bitcoin block hash for a message with unparseable data (#1 and #3)?
Such messages are trivial to produce and have no block hash. What is the
useful identifier for a block with malleated commitments (#5 and #8) or
invalid commitments (#4 and #7) - valid txs or otherwise?

The stated objective for a consensus rule to invalidate all 64 byte txs is:

> being able to cache the hash of a (non-malleated) invalid block as
> permanently invalid to avoid re-downloading and re-validating it.

This seems reasonable at first glance, but given the list of scenarios
above, which does it apply to? Presumably the invalid header (#2) doesn't
get this far because of headers-first. That leaves just invalid blocks with
useful block hash identifiers (#6). In all other cases the message is
simply discarded. In this case the attempt is to move category #5 into
category #6 by prohibiting 64 byte txs.

The requirement to "avoid re-downloading and re-validating it" is about
performance, presumably minimizing initial block download/catch-up time.
There is a computational cost to producing 64 byte malleations and none for
any of the other bogus block message categories above, including the other
form of malleation. Furthermore, 64 byte malleation has almost zero cost to
preclude. No hashing and not even true header or tx parsing are required.
Only a handful of bytes must be read from the raw message before it can be
discarded presently.
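
A rough Python sketch of such a "handful of bytes" check follows. It takes the
test to be the one named later in this message, a non-null coinbase point: the
first transaction in a block must spend the null point (32 zero bytes followed
by index 0xffffffff). The function names are hypothetical, and the payload is
assumed to begin directly with the 80-byte block header. This is an
illustration of the idea, not any implementation's actual code.

```python
from typing import Tuple

HEADER_SIZE = 80
NULL_POINT = bytes(32) + b"\xff\xff\xff\xff"  # null prevout hash + index


def read_compact_size(buf: bytes, offset: int) -> Tuple[int, int]:
    """Decode a CompactSize integer; return (value, next_offset).
    Raises IndexError if the buffer is too short."""
    first = buf[offset]
    if first < 0xFD:
        return first, offset + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    start = offset + 1
    if start + width > len(buf):
        raise IndexError("truncated CompactSize")
    return int.from_bytes(buf[start:start + width], "little"), start + width


def first_input_is_null_point(block_payload: bytes) -> bool:
    """Return True if the first tx has exactly one input spending the null
    point. No hashing is done, and only the leading bytes of the first
    transaction are read."""
    try:
        offset = HEADER_SIZE
        _tx_count, offset = read_compact_size(block_payload, offset)
        offset += 4  # skip the 4-byte tx version
        # Optional segwit marker (0x00) and flag (0x01).
        if block_payload[offset:offset + 2] == b"\x00\x01":
            offset += 2
        input_count, offset = read_compact_size(block_payload, offset)
    except IndexError:
        return False  # too short to contain a first input
    if input_count != 1:
        return False  # a coinbase has exactly one input
    return block_payload[offset:offset + 36] == NULL_POINT


# Synthetic payloads for illustration only (not real blocks).
prefix = bytes(HEADER_SIZE) + b"\x01" + (1).to_bytes(4, "little") + b"\x01"
good = prefix + NULL_POINT + b"\x00"
bad = prefix + b"\x11" * 32 + b"\xff\xff\xff\xff" + b"\x00"

print(first_input_is_null_point(good), first_input_is_null_point(bad))
```

A message that fails this check can be dropped before any transaction is
hashed, which is why the cost to preclude is so small compared with computing
a Merkle root.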
That's actually far cheaper than any of the other scenarios that, again,
have no cost to produce. The other type of malleation requires parsing all
of the txs in the block and hashing and comparing some or all of them. In
other words, if there is an attack scenario, that must be addressed before
this can be meaningful. In fact all of the other bogus message scenarios
(with tx data) will remain more expensive to discard than this one.
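
For contrast, the other form of malleation (duplicated tx hashes) can only be
noticed while building the Merkle tree, after every transaction has been
parsed and hashed. The Python sketch below shows one common way to flag it,
by comparing sibling nodes at each level. The function name is hypothetical,
and the leaf hashes are assumed to have been computed already.

```python
import hashlib
from typing import List, Tuple


def double_sha256(data: bytes) -> bytes:
    """Return SHA256(SHA256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_with_mutation_flag(tx_hashes: List[bytes]) -> Tuple[bytes, bool]:
    """Return (root, mutated). `mutated` is True when two sibling nodes at
    the same level are identical, the signature of duplicate malleation."""
    if not tx_hashes:
        return bytes(32), False
    level = list(tx_hashes)
    mutated = False
    while len(level) > 1:
        # Compare each complete sibling pair before any padding is added.
        for i in range(0, len(level) - 1, 2):
            if level[i] == level[i + 1]:
                mutated = True
        # An odd-length level is padded by repeating its last node.
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0], mutated


# Three made-up leaf hashes; appending a copy of the last one yields the
# same root, which is exactly the ambiguity being detected.
a, b, c = (double_sha256(bytes([n])) for n in range(3))
root_plain, flag_plain = merkle_root_with_mutation_flag([a, b, c])
root_dup, flag_dup = merkle_root_with_mutation_flag([a, b, c, c])

print(root_plain == root_dup, flag_plain, flag_dup)
```

Every leaf here had to be hashed before the comparison could happen, whereas
the 64 byte case can be dismissed from a few leading bytes.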
The problem arises from trying to optimize dismissal by storing an
identifier. Just *producing* the identifier is orders of magnitude more
costly than simply dismissing this bogus message. I can't imagine why any
implementation would want to compute and store and retrieve and recompute
and compare hashes when the alternative is just dismissing the bogus
messages with no hashing at all.

Bogus messages will arrive; they do not even have to be requested. The
simplest are dealt with by parse failure. What defines a parse is entirely
subjective. Generally it's "structural" but nothing precludes incorporating
a requirement for a necessary leading pattern in the stream, sort of like
how the witness pattern is identified. If we were going to prioritize early
dismissal this is where we would put it.

However, there is a tradeoff in terms of early dismissal. Looking up
invalid hashes is a costly tradeoff, which becomes multiplied by every
block validated. For example, expending 1 millisecond in hash/lookup to
save 1 second of validation time in the failure case seems like a
reasonable tradeoff, until you multiply across the whole chain. 1 ms
becomes 14 minutes across the chain (about 840,000 blocks), just to save a
second for each malleated block encountered. That means you need to have
encountered 840 such malleated blocks just to break even. Dismissing the
block early for a non-null coinbase point (without hashing anything) would
be on the order of 1000x faster than that (breakeven at 1 encounter). So why
the block hash cache requirement? It cannot be applied to many scenarios,
and cannot be optimal in this one.
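
The break-even arithmetic above can be checked with a few lines of Python.
The chain length of 840,000 blocks is an assumption, chosen because it is what
the 14 minute figure implies at 1 ms per block. The 1 microsecond cost for
the early check is likewise just the "1000x faster" figure applied to 1 ms,
and the function name is hypothetical.

```python
CHAIN_BLOCKS = 840_000       # assumed chain length implied by "14 minutes"
SAVED_PER_HIT_US = 1_000_000  # 1 second of validation saved per bogus block


def break_even_encounters(per_block_cost_us: int) -> float:
    """Number of bogus blocks that must be encountered before the per-block
    overhead, paid on every block in the chain, is recovered."""
    total_overhead_us = per_block_cost_us * CHAIN_BLOCKS
    return total_overhead_us / SAVED_PER_HIT_US


hash_lookup = break_even_encounters(1_000)  # 1 ms hash/lookup per block
early_check = break_even_encounters(1)      # ~1000x cheaper byte check

overhead_minutes = 1_000 * CHAIN_BLOCKS / 60_000_000

print(overhead_minutes, hash_lookup, early_check)
```

With these inputs the hash/lookup approach costs 14 minutes in total and
breaks even at 840 encounters, while the cheaper check breaks even at roughly
one encounter.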
Eric