author | Eric Voskuil <eric@voskuil.org> | 2024-06-29 13:40:39 -0700 |
---|---|---|
committer | bitcoindev <bitcoindev@googlegroups.com> | 2024-06-29 13:42:30 -0700 |
commit | ca896f027207ab9a908a1e0d6ec2b6c3e3a18abf (patch) | |
tree | e04a852f807b6e915e3a88b0e709f2afea62a538 | |
parent | d188db49e5679a265e0c60984ca78e06f3df977c (diff) | |
Re: [bitcoindev] Re: Great Consensus Cleanup Revival
-rw-r--r-- | d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7 | 308 |
1 files changed, 308 insertions, 0 deletions
diff --git a/d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7 b/d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7
new file mode 100644
index 000000000..15fea419f
--- /dev/null
+++ b/d9/9a6fd2a69f092b1dbf28f4209224f0c600bed7
@@ -0,0 +1,308 @@
+Delivery-date: Sat, 29 Jun 2024 13:42:30 -0700
+Date: Sat, 29 Jun 2024 13:40:39 -0700 (PDT)
+From: Eric Voskuil <eric@voskuil.org>
+To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
+Message-Id: <3dceca4d-03a8-44f3-be64-396702247fadn@googlegroups.com>
+In-Reply-To: <607a2233-ac12-4a80-ae4a-08341b3549b3n@googlegroups.com>
+References: <gnM89sIQ7MhDgI62JciQEGy63DassEv7YZAMhj0IEuIo0EdnafykF6RH4OqjTTHIHsIoZvC2MnTUzJI7EfET4o-UQoD-XAQRDcct994VarE=@protonmail.com>
+ <72e83c31-408f-4c13-bff5-bf0789302e23n@googlegroups.com>
+ <heKH68GFJr4Zuf6lBozPJrb-StyBJPMNvmZL0xvKFBnBGVA3fVSgTLdWc-_8igYWX8z3zCGvzflH-CsRv0QCJQcfwizNyYXlBJa_Kteb2zg=@protonmail.com>
+ <5b0331a5-4e94-465d-a51d-02166e2c1937n@googlegroups.com>
+ <yt1O1F7NiVj-WkmnYeta1fSqCYNFx8h6OiJaTBmwhmJ2MWAZkmmjPlUST6FM7t6_-2NwWKdglWh77vcnEKA8swiAnQCZJY2SSCAh4DOKt2I=@protonmail.com>
+ <be78e733-6e9f-4f4e-8dc2-67b79ddbf677n@googlegroups.com>
+ <jJLDrYTXvTgoslhl1n7Fk9-pL1mMC-0k6gtoniQINmioJpzgtqrJ_WqyFZkLltsCUusnQ4jZ6HbvRC-mGuaUlDi3kcqcFHALd10-JQl-FMY=@protonmail.com>
+ <9a4c4151-36ed-425a-a535-aa2837919a04n@googlegroups.com>
+ <3f0064f9-54bd-46a7-9d9a-c54b99aca7b2n@googlegroups.com>
+ <26b7321b-cc64-44b9-bc95-a4d8feb701e5n@googlegroups.com>
+ <CALZpt+EwVyaz1=A6hOOycqFGJs+zxyYYocZixTJgVmzZezUs9Q@mail.gmail.com>
+ <607a2233-ac12-4a80-ae4a-08341b3549b3n@googlegroups.com>
+Subject: Re: [bitcoindev] Re: Great Consensus Cleanup Revival
+X-Original-Sender: eric@voskuil.org
+List-ID: <bitcoindev.googlegroups.com>
+
+Caching identity in the case of invalidity is a more interesting question
+than it might seem.
+
+Background: A fully-validated block has established identity in its block
+hash. However, an invalid block message may include the same block header,
+producing the same hash, but with any kind of nonsense following the
+header. The purpose of the transaction and witness commitments is of course
+to establish this identity, so these two checks are therefore necessary
+even under checkpoint/milestone. And then of course the two Merkle tree
+issues complicate the tx commitment (the integrity of the witness
+commitment is assured by that of the tx commitment).
+
+So what does it mean to speak of a block hash derived from:
+
+(1) a block message with an unparseable header?
+(2) a block message with parseable but invalid header?
+(3) a block message with valid header but unparseable tx data?
+(4) a block message with valid header but parseable invalid uncommitted tx
+data?
+(5) a block message with valid header but parseable invalid malleated
+committed tx data?
+(6) a block message with valid header but parseable invalid unmalleated
+committed tx data?
+(7) a block message with valid header but uncommitted valid tx data?
+(8) a block message with valid header but malleated committed valid tx data?
+(9) a block message with valid header but unmalleated committed valid tx
+data?
+
+Note that only the #9 p2p block message contains an actual Bitcoin block;
+the others are bogus messages. In all cases the message can be sha256
+hashed to establish the identity of the *message*. And if one's objective
+is to reject repeating bogus messages, this might be a useful strategy.
+It's already part of the p2p protocol, is orders of magnitude cheaper to
+produce than a Merkle root, and has no identity issues.
+
+The concept of Bitcoin block hash as unique identifier for invalid p2p
+block messages is problematic. Apart from the malleation question, what is
+the Bitcoin block hash for a message with unparseable data (#1 and #3)?
+Such messages are trivial to produce and have no block hash. What is the
+useful identifier for a block with malleated commitments (#5 and #8) or
+invalid commitments (#4 and #7) - valid txs or otherwise?
+
+The stated objective for a consensus rule to invalidate all 64-byte txs is:
+
+> being able to cache the hash of a (non-malleated) invalid block as
+permanently invalid to avoid re-downloading and re-validating it.
+
+This seems reasonable at first glance, but given the list of scenarios
+above, which does it apply to? Presumably the invalid header (#2) doesn't
+get this far because of headers-first. That leaves just invalid blocks with
+useful block hash identifiers (#6). In all other cases the message is
+simply discarded. In this case the attempt is to move category #5 into
+category #6 by prohibiting 64-byte txs.
+
+The requirement to "avoid re-downloading and re-validating it" is about
+performance, presumably minimizing initial block download/catch-up time.
+There is a computational cost to producing 64-byte malleations and none for
+any of the other bogus block message categories above, including the other
+form of malleation. Furthermore, 64-byte malleation has almost zero cost to
+preclude. No hashing and not even true header or tx parsing are required.
+Only a handful of bytes must be read from the raw message before it can be
+discarded presently.
+
+That's actually far cheaper than any of the other scenarios that, again,
+have no cost to produce. The other type of malleation requires parsing all
+of the txs in the block and hashing and comparing some or all of them. In
+other words, if there is an attack scenario, that must be addressed before
+this can be meaningful. In fact all of the other bogus message scenarios
+(with tx data) will remain more expensive to discard than this one.
+
+The problem arises from trying to optimize dismissal by storing an
+identifier. Just *producing* the identifier is orders of magnitude more
+costly than simply dismissing this bogus message. I can't imagine why any
+implementation would want to compute and store and retrieve and recompute
+and compare hashes when the alternative is just dismissing the bogus
+messages with no hashing at all.
+
+Bogus messages will arrive; they do not even have to be requested. The
+simplest are dealt with by parse failure. What defines a parse is entirely
+subjective. Generally it's "structural" but nothing precludes incorporating
+a requirement for a necessary leading pattern in the stream, sort of like
+how the witness pattern is identified. If we were going to prioritize early
+dismissal this is where we would put it.
+
+However, there is a tradeoff in terms of early dismissal. Looking up
+invalid hashes is a costly tradeoff, which becomes multiplied by every
+block validated.
+For example, expending 1 millisecond in hash/lookup to
+save 1 second of validation time in the failure case seems like a
+reasonable tradeoff, until you multiply across the whole chain. 1 ms
+becomes 14 minutes across the chain, just to save a second for each
+malleated block encountered. That means you need to have encountered 840
+such malleated blocks just to break even. Early dismissing the block for
+non-null coinbase point (without hashing anything) would be on the order of
+1000x faster than that (breakeven at 1 encounter). So why the block hash
+cache requirement? It cannot be applied to many scenarios, and cannot be
+optimal in this one.
+
+Eric
+
+--
+You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
+To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups.com.
+To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/3dceca4d-03a8-44f3-be64-396702247fadn%40googlegroups.com.
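
The two claims in the message's closing paragraph can be sketched in Python, under stated assumptions. `coinbase_point_is_null` reads only the few bytes needed to reach the first input of the first transaction in a raw p2p `block` payload and compares its previous-output point with the null point, with no hashing; this is one possible reading of the "non-null coinbase point" dismissal the message mentions. `break_even_encounters` reproduces the break-even arithmetic. All function names are hypothetical, and the chain length of 840,000 blocks is an assumption chosen because it matches the 14 minutes and 840 encounters quoted in the message; none of this is taken from an existing implementation.

```python
from typing import Tuple

HEADER_SIZE = 80  # serialized block header length in bytes
# Null previous-output point: 32 zero bytes (hash) + index 0xffffffff.
NULL_POINT = b"\x00" * 32 + b"\xff" * 4


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a Bitcoin compact-size integer; return (value, next position)."""
    first = buf[pos]  # raises IndexError if the buffer is too short
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(buf[pos + 1:pos + 1 + width], "little")
    return value, pos + 1 + width


def coinbase_point_is_null(block_payload: bytes) -> bool:
    """True only if the first tx has exactly one input spending the null point.

    Hypothetical early-dismissal check: no hashing, no full tx parse.
    """
    try:
        _, pos = read_varint(block_payload, HEADER_SIZE)  # tx count (unused)
        pos += 4                                          # skip tx version
        if block_payload[pos:pos + 2] == b"\x00\x01":     # witness marker+flag
            pos += 2
        n_inputs, pos = read_varint(block_payload, pos)
    except IndexError:
        return False  # truncated message: dismiss
    return n_inputs == 1 and block_payload[pos:pos + 36] == NULL_POINT


def total_overhead_minutes(lookup_ms: int, blocks: int) -> float:
    """Cumulative cost of one cache lookup per validated block, in minutes."""
    return lookup_ms * blocks / 60_000


def break_even_encounters(lookup_ms: int, blocks: int, saved_ms: int) -> float:
    """Cached invalid blocks needed before the lookups pay for themselves."""
    return lookup_ms * blocks / saved_ms


ASSUMED_BLOCKS = 840_000  # assumption, not stated in the message
print(total_overhead_minutes(1, ASSUMED_BLOCKS))       # 1 ms lookup per block
print(break_even_encounters(1, ASSUMED_BLOCKS, 1000))  # 1 s saved per hit
```

With a 1 ms lookup and 1 s saved per hit, the sketch yields the message's figures of 14 minutes and 840 encounters for the assumed chain length. The check itself touches roughly 125 bytes at most, regardless of block size.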