author: Eric Voskuil <eric@voskuil.org>, 2024-06-29 13:40:39 -0700
committer: bitcoindev <bitcoindev@googlegroups.com>, 2024-06-29 13:42:30 -0700
commit: ca896f027207ab9a908a1e0d6ec2b6c3e3a18abf
tree: e04a852f807b6e915e3a88b0e709f2afea62a538
parent: d188db49e5679a265e0c60984ca78e06f3df977c
download: pi-bitcoindev-ca896f027207ab9a908a1e0d6ec2b6c3e3a18abf.tar.gz
download: pi-bitcoindev-ca896f027207ab9a908a1e0d6ec2b6c3e3a18abf.zip
Re: [bitcoindev] Re: Great Consensus Cleanup Revival
-rw-r--r-- d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7 308
1 files changed, 308 insertions, 0 deletions
diff --git a/d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7 b/d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7
new file mode 100644
index 000000000..15fea419f
--- /dev/null
+++ b/d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7
@@ -0,0 +1,308 @@
+Delivery-date: Sat, 29 Jun 2024 13:42:30 -0700
+Received: from mail-yw1-f188.google.com ([209.85.128.188])
+ by mail.fairlystable.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
+ (Exim 4.94.2)
+ (envelope-from <bitcoindev+bncBC5P5KEHZQLBBLPDQG2AMGQEFGJOEDQ@googlegroups.com>)
+ id 1sNetx-0006sd-7f
+ for bitcoindev@gnusha.org; Sat, 29 Jun 2024 13:42:30 -0700
+Received: by mail-yw1-f188.google.com with SMTP id 00721157ae682-64d2e2aaff0sf2173817b3.2
+ for <bitcoindev@gnusha.org>; Sat, 29 Jun 2024 13:42:28 -0700 (PDT)
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=googlegroups.com; s=20230601; t=1719693743; x=1720298543; darn=gnusha.org;
+ h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
+ :list-id:mailing-list:precedence:x-original-sender:mime-version
+ :subject:references:in-reply-to:message-id:to:from:date:sender:from
+ :to:cc:subject:date:message-id:reply-to;
+ bh=2nTudNywqu/iU/VvqVmygYWMKnh3m1Pp1+Id1EU8olU=;
+ b=dNwN7kzNHrLs7Vex16MlU+pvQhZhFklcteLMBBcBSGznOEP58Lgs8EouEDodg4/Wlq
+ 7BooBg+LJ2zuMEpvXHDo1fnFOlS3x83jJwcxiyZ5z4JWNspsy6forbk8He2tKnc2zOcN
+ e7vNMjnW0ksHAxrf9u0iLzivWdz15TnjSkgD1r2ejK1kc7ay7XFUCxmsMGVTVmPWjZG/
+ /1fAyzz+u2Ms9WWiMoOl/8AZwVMehrFQmXQ7nAkTKEqXl9iugA0PfH/liLTk6rGJdD9X
+ yiztvkZnfbUWFIObM6G7vmQLNMPxEzP3QlswFub8BFmM3XO0O2usajw203VUm3BNOGI9
+ dHHg==
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=googlegroups-com.20230601.gappssmtp.com; s=20230601; t=1719693743; x=1720298543; darn=gnusha.org;
+ h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
+ :list-id:mailing-list:precedence:x-original-sender:mime-version
+ :subject:references:in-reply-to:message-id:to:from:date:from:to:cc
+ :subject:date:message-id:reply-to;
+ bh=2nTudNywqu/iU/VvqVmygYWMKnh3m1Pp1+Id1EU8olU=;
+ b=dLz6V/b2+/IWWCLRrDpX+kFcgcdPTlKJAG1uR+79UowaQa+MAlyO1ZqqjdOcmnZxQ8
+ kNjsb3NpQgILN6GUeKJ/n261MGCjeI/hB2e+99+7CTlzCv2slVdhWrTGqKA5hcPchboe
+ 5XdO+O1J6SuiyjQfHUlW0CReumDglJpSoan+/p7D0sEnpxM7izYFOK/CxmAs1Ils94ja
+ 9VsMf/XuIUa+OAKmxmXMdL1SrsXtVt4UxB9CyBduvGAWuwiTOwQbSBqoxX492EvbzyYM
+ fwiMZZOp51M2+80MM3hqRfeBRVEMvnxqrgIoFLgMA3GlqJv1AruibiXmaSLurS/pEmdp
+ zVVg==
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=1e100.net; s=20230601; t=1719693743; x=1720298543;
+ h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
+ :list-id:mailing-list:precedence:x-original-sender:mime-version
+ :subject:references:in-reply-to:message-id:to:from:date:x-beenthere
+ :x-gm-message-state:sender:from:to:cc:subject:date:message-id
+ :reply-to;
+ bh=2nTudNywqu/iU/VvqVmygYWMKnh3m1Pp1+Id1EU8olU=;
+ b=wjmtFnfqI3J8gX84MTnjUpyGjdfs1gkpIV56z81l6R1SlyKzv6jvJxUSb6XP3V4Bzc
+ 09p/agCEul4GiT7W08ADxTDwGoEXkUS5DnxHBAGF9VadvOgLAt11Jzs/nsXL2sBTCHwR
+ yu7hxyjjI+dTogC8ZhqQJAGmfOZN1O5DPJdzxBkIMc6Xyp5/zeE2eNQXha/6YqRNzKwS
+ tcUv8E/bewLP0DXiSINdR2kvVT7n96T048EosGOxypoGqbca6IUjcRe2N6jbab7MqFBh
+ z1881YE3r9SoF5qfEoh/ZlF1C0nAXtE+6HGgmEyiZjRWd6twRJ5lEcQUerfbf6pg7IuZ
+ W4hw==
+Sender: bitcoindev@googlegroups.com
+X-Forwarded-Encrypted: i=1; AJvYcCV1+8RT9AMmZ+vJ2gLHOykkfCcnXjmg+L+6WbKPS4QtsWBzMrxSkozLPF1+NWm3ZbliJeaENW+H9uKFilypT45r878CzD4=
+X-Gm-Message-State: AOJu0YywMfY+GVbKZ4p3RLCRIWgAy9bArbE3JZa1etRF5LLRtULdLM2W
+ v6nX/ov1LVAgbIvzH5M/IcwvQNPhIXb3mMRuCE8zoipVGRstDO6P
+X-Google-Smtp-Source: AGHT+IGsGPXprUgIyiwYUdetOPWC3/T65mxFvMfhrpEfB/vX4q3z23gzc5UHK+wOs32wklAzyVSODA==
+X-Received: by 2002:a25:df50:0:b0:e03:4f45:2ef5 with SMTP id 3f1490d57ef6-e036ec32213mr1803892276.45.1719693742949;
+ Sat, 29 Jun 2024 13:42:22 -0700 (PDT)
+X-BeenThere: bitcoindev@googlegroups.com
+Received: by 2002:a05:6902:100e:b0:dfa:8028:8bc9 with SMTP id
+ 3f1490d57ef6-e0356251544ls2954120276.1.-pod-prod-06-us; Sat, 29 Jun 2024
+ 13:42:21 -0700 (PDT)
+X-Received: by 2002:a05:6902:1102:b0:e02:c619:73d with SMTP id 3f1490d57ef6-e036eb1b63dmr85765276.5.1719693741680;
+ Sat, 29 Jun 2024 13:42:21 -0700 (PDT)
+Received: by 2002:a81:ae02:0:b0:627:7f59:2eee with SMTP id 00721157ae682-64d36134a0ams7b3;
+ Sat, 29 Jun 2024 13:40:40 -0700 (PDT)
+X-Received: by 2002:a25:b297:0:b0:e02:bd2f:97f5 with SMTP id 3f1490d57ef6-e035c0454fdmr88080276.6.1719693640203;
+ Sat, 29 Jun 2024 13:40:40 -0700 (PDT)
+Date: Sat, 29 Jun 2024 13:40:39 -0700 (PDT)
+From: Eric Voskuil <eric@voskuil.org>
+To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
+Message-Id: <3dceca4d-03a8-44f3-be64-396702247fadn@googlegroups.com>
+In-Reply-To: <607a2233-ac12-4a80-ae4a-08341b3549b3n@googlegroups.com>
+References: <gnM89sIQ7MhDgI62JciQEGy63DassEv7YZAMhj0IEuIo0EdnafykF6RH4OqjTTHIHsIoZvC2MnTUzJI7EfET4o-UQoD-XAQRDcct994VarE=@protonmail.com>
+ <72e83c31-408f-4c13-bff5-bf0789302e23n@googlegroups.com>
+ <heKH68GFJr4Zuf6lBozPJrb-StyBJPMNvmZL0xvKFBnBGVA3fVSgTLdWc-_8igYWX8z3zCGvzflH-CsRv0QCJQcfwizNyYXlBJa_Kteb2zg=@protonmail.com>
+ <5b0331a5-4e94-465d-a51d-02166e2c1937n@googlegroups.com>
+ <yt1O1F7NiVj-WkmnYeta1fSqCYNFx8h6OiJaTBmwhmJ2MWAZkmmjPlUST6FM7t6_-2NwWKdglWh77vcnEKA8swiAnQCZJY2SSCAh4DOKt2I=@protonmail.com>
+ <be78e733-6e9f-4f4e-8dc2-67b79ddbf677n@googlegroups.com>
+ <jJLDrYTXvTgoslhl1n7Fk9-pL1mMC-0k6gtoniQINmioJpzgtqrJ_WqyFZkLltsCUusnQ4jZ6HbvRC-mGuaUlDi3kcqcFHALd10-JQl-FMY=@protonmail.com>
+ <9a4c4151-36ed-425a-a535-aa2837919a04n@googlegroups.com>
+ <3f0064f9-54bd-46a7-9d9a-c54b99aca7b2n@googlegroups.com>
+ <26b7321b-cc64-44b9-bc95-a4d8feb701e5n@googlegroups.com>
+ <CALZpt+EwVyaz1=A6hOOycqFGJs+zxyYYocZixTJgVmzZezUs9Q@mail.gmail.com>
+ <607a2233-ac12-4a80-ae4a-08341b3549b3n@googlegroups.com>
+Subject: Re: [bitcoindev] Re: Great Consensus Cleanup Revival
+MIME-Version: 1.0
+Content-Type: multipart/mixed;
+ boundary="----=_Part_336776_1807486589.1719693639951"
+X-Original-Sender: eric@voskuil.org
+Precedence: list
+Mailing-list: list bitcoindev@googlegroups.com; contact bitcoindev+owners@googlegroups.com
+List-ID: <bitcoindev.googlegroups.com>
+X-Google-Group-Id: 786775582512
+List-Post: <https://groups.google.com/group/bitcoindev/post>, <mailto:bitcoindev@googlegroups.com>
+List-Help: <https://groups.google.com/support/>, <mailto:bitcoindev+help@googlegroups.com>
+List-Archive: <https://groups.google.com/group/bitcoindev>
+List-Subscribe: <https://groups.google.com/group/bitcoindev/subscribe>, <mailto:bitcoindev+subscribe@googlegroups.com>
+List-Unsubscribe: <mailto:googlegroups-manage+786775582512+unsubscribe@googlegroups.com>,
+ <https://groups.google.com/group/bitcoindev/subscribe>
+X-Spam-Score: -0.7 (/)
+
+------=_Part_336776_1807486589.1719693639951
+Content-Type: multipart/alternative;
+ boundary="----=_Part_336777_264697585.1719693639951"
+
+------=_Part_336777_264697585.1719693639951
+Content-Type: text/plain; charset="UTF-8"
+
+Caching identity in the case of invalidity is a more interesting question
+than it might seem.
+
+Background: A fully-validated block has established identity in its block
+hash. However, an invalid block message may include the same block header,
+producing the same hash, but with any kind of nonsense following the
+header. The purpose of the transaction and witness commitments is of course
+to establish this identity, so these two checks are necessary
+even under checkpoint/milestone. And then of course the two Merkle tree
+issues complicate the tx commitment (the integrity of the witness
+commitment is assured by that of the tx commitment).
+
+So what does it mean to speak of a block hash derived from:
+
+(1) a block message with an unparseable header?
+(2) a block message with parseable but invalid header?
+(3) a block message with valid header but unparseable tx data?
+(4) a block message with valid header but parseable invalid uncommitted tx
+data?
+(5) a block message with valid header but parseable invalid malleated
+committed tx data?
+(6) a block message with valid header but parseable invalid unmalleated
+committed tx data?
+(7) a block message with valid header but uncommitted valid tx data?
+(8) a block message with valid header but malleated committed valid tx data?
+(9) a block message with valid header but unmalleated committed valid tx
+data?
+
+Note that only the #9 p2p block message contains an actual Bitcoin block;
+the others are bogus messages. In all cases the message can be sha256
+hashed to establish the identity of the *message*. And if one's objective
+is to reject repeating bogus messages, this might be a useful strategy.
+It's already part of the p2p protocol, is orders of magnitude cheaper to
+produce than a Merkle root, and has no identity issues.
+
+The concept of Bitcoin block hash as unique identifier for invalid p2p
+block messages is problematic. Apart from the malleation question, what is
+the Bitcoin block hash for a message with unparseable data (#1 and #3)?
+Such messages are trivial to produce and have no block hash. What is the
+useful identifier for a block with malleated commitments (#5 and #8) or
+invalid commitments (#4 and #7) - valid txs or otherwise?
+
+The stated objective for a consensus rule to invalidate all 64 byte txs is:
+
+> being able to cache the hash of a (non-malleated) invalid block as
+> permanently invalid to avoid re-downloading and re-validating it.
+
+This seems reasonable at first glance, but given the list of scenarios
+above, which does it apply to? Presumably the invalid header (#2) doesn't
+get this far because of headers-first. That leaves just invalid blocks with
+useful block hash identifiers (#6). In all other cases the message is
+simply discarded. In this case the attempt is to move category #5 into
+category #6 by prohibiting 64 byte txs.
+
+The requirement to "avoid re-downloading and re-validating it" is about
+performance, presumably minimizing initial block download/catch-up time.
+There is a computational cost to producing 64 byte malleations and none for
+any of the other bogus block message categories above, including the other
+form of malleation. Furthermore, 64 byte malleation has almost zero cost to
+preclude. No hashing and not even true header or tx parsing are required.
+At present, only a handful of bytes must be read from the raw message
+before it can be discarded.
+
+That's actually far cheaper than any of the other scenarios that, again,
+have no cost to produce. The other type of malleation requires parsing all
+of the txs in the block and hashing and comparing some or all of them. In
+other words, if there is an attack scenario, that must be addressed before
+this can be meaningful. In fact all of the other bogus message scenarios
+(with tx data) will remain more expensive to discard than this one.
+
+The problem arises from trying to optimize dismissal by storing an
+identifier. Just *producing* the identifier is orders of magnitude more
+costly than simply dismissing this bogus message. I can't imagine why any
+implementation would want to compute and store and retrieve and recompute
+and compare hashes when the alternative is just dismissing the bogus
+messages with no hashing at all.
+
+Bogus messages will arrive; they do not even have to be requested. The
+simplest are dealt with by parse failure. What defines a parse is entirely
+subjective. Generally it's "structural" but nothing precludes incorporating
+a requirement for a necessary leading pattern in the stream, sort of like
+how the witness pattern is identified. If we were going to prioritize early
+dismissal this is where we would put it.
+
+However, there is a tradeoff in terms of early dismissal. Looking up
+invalid hashes is a costly tradeoff, which becomes multiplied by every
+block validated. For example, expending 1 millisecond in hash/lookup to
+save 1 second of validation time in the failure case seems like a
+reasonable tradeoff, until you multiply across the whole chain. 1 ms
+becomes 14 minutes over roughly 840,000 blocks, just to save a second for
+each malleated block encountered. You would need to encounter 840 such
+blocks to break even. Dismissing the block early for a non-null coinbase
+point (without hashing anything) would be on the order of 1000x faster than
+that (breakeven at 1 encounter). So why the block hash cache requirement?
+It cannot be applied to many scenarios, and cannot be optimal in this one.
+
+Eric
+
+--
+You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
+To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups.com.
+To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/3dceca4d-03a8-44f3-be64-396702247fadn%40googlegroups.com.
+
+------=_Part_336777_264697585.1719693639951
+Content-Type: text/html; charset="UTF-8"
+Content-Transfer-Encoding: quoted-printable
+
+Caching identity in the case of invalidity is more interesting question tha=
+n it might seem.<br /><br />Background: A fully-validated block has establi=
+shed identity in its block hash. However an invalid block message may inclu=
+de the same block header, producing the same hash, but with any kind of non=
+sense following the header. The purpose of the transaction and witness comm=
+itments is of course to establish this identity, so these two checks are th=
+erefore necessary even under checkpoint/milestone. And then of course the t=
+wo Merkle tree issues complicate the tx commitment (the integrity of the wi=
+tness commitment is assured by that of the tx commitment).<br /><br />So wh=
+at does it mean to speak of a block hash derived from:<br /><br />(1) a blo=
+ck message with an unparseable header?<br />(2) a block message with parsea=
+ble but invalid header?<br />(3) a block message with valid header but unpa=
+rseable tx data?<br />(4) a block message with valid header but parseable i=
+nvalid uncommitted tx data?<br />(5) a block message with valid header but =
+parseable invalid malleated committed tx data?<br />(6) a block message wit=
+h valid header but parseable invalid unmalleated committed tx data?<br />(7=
+) a block message with valid header but uncommitted valid tx data?<br />(8)=
+ a block message with valid header but malleated committed valid tx data?<b=
+r />(9) a block message with valid header but unmalleated committed valid t=
+x data?<br /><br />Note that only the #9 p2p block message contains an actu=
+al Bitcoin block, the others are bogus messages. In all cases the message c=
+an be sha256 hashed to establish the identity of the *message*. And if one'=
+s objective is to reject repeating bogus messages, this might be a useful s=
+trategy. It's already part of the p2p protocol, is orders of magnitude chea=
+per to produce than a Merkle root, and has no identity issues.<br /><br />T=
+he concept of Bitcoin block hash as unique identifier for invalid p2p block=
+ messages is problematic. Apart from the malleation question, what is the B=
+itcoin block hash for a message with unparseable data (#1 and #3)? Such mes=
+sages are trivial to produce and have no block hash. What is the useful ide=
+ntifier for a block with malleated commitments (#5 and #8) or invalid commi=
+tments (#4 and #7) - valid txs or otherwise?<br /><br />The stated objectiv=
+e for a consensus rule to invalidate all 64 byte txs is:<br /><br />&gt; be=
+ing able to cache the hash of a (non-malleated) invalid block as permanentl=
+y invalid to avoid re-downloading and re-validating it.<br /><br />This see=
+ms reasonable at first glance, but given the list of scenarios above, which=
+ does it apply to? Presumably the invalid header (#2) doesn't get this far =
+because of headers-first. That leaves just invalid blocks with useful block=
+ hash identifiers (#6). In all other cases the message is simply discarded.=
+ In this case the attempt is to move category #5 into category #6 by prohib=
+iting 64 byte txs.<br /><br />The requirement to "avoid re-downloading and =
+re-validating it" is about performance, presumably minimizing initial block=
+ download/catch-up time. There is a computational cost to producing 64 byte=
+ malleations and none for any of the other bogus block message categories a=
+bove, including the other form of malleation. Furthermore, 64 byte malleati=
+on has almost zero cost to preclude. No hashing and not even true header or=
+ tx parsing are required. Only a handful of bytes must be read from the raw=
+ message before it can be discarded presently.<br /><br />That's actually f=
+ar cheaper than any of the other scenarios that again, have no cost to prod=
+uce. The other type of malleation requires parsing all of the txs in the bl=
+ock and hashing and comparing some or all of them. In other words, if there=
+ is an attack scenario, that must be addressed before this can be meaningfu=
+l. In fact all of the other bogus message scenarios (with tx data) will rem=
+ain more expensive to discard than this one.<br /><br />The problem arises =
+from trying to optimize dismissal by storing an identifier. Just *producing=
+* the identifier is orders of magnitude more costly than simply dismissing =
+this bogus message. I can't imagine why any implementation would want to co=
+mpute and store and retrieve and recompute and compare hashes when the alte=
+rative is just dismissing the bogus messages with no hashing at all.<br /><=
+br />Bogus messages will arrive, they do not even have to be requested. The=
+ simplest are dealt with by parse failure. What defines a parse is entirely=
+ subjective. Generally it's "structural" but nothing precludes incorporatin=
+g a requirement for a necessary leading pattern in the stream, sort of like=
+ how the witness pattern is identified. If we were going to prioritize earl=
+y dismissal this is where we would put it.<br /><br />However, there is a t=
+radeoff in terms of early dismissal. Looking up invalid hashes is a costly =
+tradeoff, which becomes multiplied by every block validated. For example, e=
+xpending 1 millisecond in hash/lookup to save 1 second of validation time i=
+n the failure case seems like a reasonable tradeoff, until you multiply acr=
+oss the whole chain. 1 ms becomes 14 minutes across the chain, just to save=
+ a second for each mallied block encountered. That means you need to have e=
+ncountered 840 such mallied blocks just to break even. Early dismissing the=
+ block for non-null coinbase point (without hashing anything) would be on t=
+he order of 1000x faster than that (breakeven at 1 encounter). So why the b=
+lock hash cache requirement? It cannot be applied to many scenarios, and ca=
+nnot be optimal in this one.<br /><br />Eric<br />
+
+<p></p>
+
+-- <br />
+You received this message because you are subscribed to the Google Groups &=
+quot;Bitcoin Development Mailing List&quot; group.<br />
+To unsubscribe from this group and stop receiving emails from it, send an e=
+mail to <a href=3D"mailto:bitcoindev+unsubscribe@googlegroups.com">bitcoind=
+ev+unsubscribe@googlegroups.com</a>.<br />
+To view this discussion on the web visit <a href=3D"https://groups.google.c=
+om/d/msgid/bitcoindev/3dceca4d-03a8-44f3-be64-396702247fadn%40googlegroups.=
+com?utm_medium=3Demail&utm_source=3Dfooter">https://groups.google.com/d/msg=
+id/bitcoindev/3dceca4d-03a8-44f3-be64-396702247fadn%40googlegroups.com</a>.=
+<br />
+
+------=_Part_336777_264697585.1719693639951--
+
+------=_Part_336776_1807486589.1719693639951--
+
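
Note on the message above: its closing paragraphs argue that a block message can be dismissed for a non-null coinbase point after reading only a handful of bytes, and that a 1 ms per-block hash lookup needs 840 encounters to break even. The Python sketch below is one reading of that argument, not code from the thread or from any node implementation. The function names, the 840,000-block chain length (inferred from "1 ms becomes 14 minutes"), and the choice to report truncated input as non-null are all assumptions made for illustration.

```python
import struct

HEADER_SIZE = 80                              # serialized block header length
NULL_POINT = bytes(32) + b"\xff\xff\xff\xff"  # null hash + index 0xffffffff


def read_compact_size(buf: bytes, offset: int) -> tuple[int, int]:
    """Decode a Bitcoin CompactSize integer; return (value, next_offset)."""
    first = buf[offset]
    if first < 0xFD:
        return first, offset + 1
    fmt, width = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[first]
    return struct.unpack_from(fmt, buf, offset + 1)[0], offset + 1 + width


def first_point_is_null(block_message: bytes) -> bool:
    """Hypothetical pre-check: is the first input of the first tx the null point?

    Reads only a few bytes after the header and hashes nothing.
    Truncated or empty input is reported as False (an assumption of this sketch).
    """
    try:
        tx_count, pos = read_compact_size(block_message, HEADER_SIZE)
        if tx_count == 0:
            return False
        pos += 4                                   # skip the 4-byte tx version
        if block_message[pos:pos + 2] == b"\x00\x01":
            pos += 2                               # skip segwit marker and flag
        input_count, pos = read_compact_size(block_message, pos)
        if input_count == 0:
            return False
    except (IndexError, struct.error):
        return False
    return block_message[pos:pos + 36] == NULL_POINT


def break_even_encounters(chain_blocks: int, lookup_ms: int, saved_ms: int) -> int:
    """Invalid blocks that must be skipped before a per-block lookup pays off."""
    total_lookup_ms = chain_blocks * lookup_ms
    return -(-total_lookup_ms // saved_ms)         # ceiling division


print(break_even_encounters(840_000, 1, 1_000))    # 840
```

Since the first transaction of a block must be a coinbase, and a coinbase input must reference the null point, a message for which `first_point_is_null` returns False can be dropped before any hashing. The break-even figure reproduces the message's own numbers: 840,000 lookups at 1 ms each cost 840 seconds, which equals 840 avoided validations at 1 second each.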