From: Johnson Lau
Date: Fri, 18 Nov 2016 22:43:03 +0800
To: Eric Voskuil
Cc: bitcoin-dev
Subject: Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
Message-Id: <169CC80A-3B63-4D58-8E8C-0D1D9489E891@xbt.hk>
In-Reply-To: <11B3C69E-5F1B-4D25-86CE-E5F3B603266F@voskuil.org>

In this case I don’t understand how your implementation won’t
be DoS-ed. An attacker could keep sending you inv for the same block / transaction. Since you don’t assume the hash is unique, each time you have to download the block/tx again before you can tell whether it is the same one you already know. Otherwise, you are implementing the “first seen” rule.

Also, you can’t ban a peer just because you get an invalid tx from him, because he might be referring to a hash-colliding UTXO that you don’t know. In that case you need to request the parent tx to verify. I wonder if you are really doing that.

> On 18 Nov 2016, at 11:20, Eric Voskuil wrote:
>
> You are suggesting that, since a node implements a denial of service policy that actually denies itself otherwise valid blocks, those blocks are conditionally invalid. And that, since the validity condition is based on order of arrival and therefore independently unverifiable, Bitcoin consensus is broken in the face of a hash collision.
>
> I am aware of two other hash collision scenarios that cause Core to declare blocks invalid based on ordering. The block hash duplicate check (it's not fork-point relative) and signature verification caching. Like the "block banning" issue above, the latter is related to an internal optimization. I would categorize the former as a simple oversight that presumably goes way back.
>
> What then is the consequence of validity that is unverifiable? You believe this means that Bitcoin consensus is broken. This is incorrect. First understand that it is not possible for consensus rules to invalidate blocks based on order of arrival. As such any *implementation* that invalidates blocks based on order of arrival is broken. It is an error to claim that these behaviors are part of consensus, despite being implemented in the satoshi node(s).
>
> Validity must be verifiable independent of the state of other nodes.
> Consensus is a function of block history and time alone. Time is presumed to be universally consistent. To be a consensus rule all nodes must be able to independently reach the same validity conclusion, given the same set of blocks, independent of order. If this is not the case the behavior is not a consensus rule, it is simply a bug.
>
> Deviating from such bugs is not a break with consensus, since such non-rules cannot be part of consensus. One node implementation can behave deterministically while others are behaving non-deterministically, with the two nodes remaining consistent from a consensus standpoint (deterministic produces a subset of non-deterministic results). But, unlike arbitrary nodes, deterministic nodes will not cause disruption on the network.
>
> You imply that these determinism bugs are necessary, that there is no fix. This is also incorrect.
>
> The block banning hash collision bug is avoided by not using non-chain/clock state to determine validity. Doing otherwise is clearly a bug. The hash of a block is not the block itself, a logically-correct ban would be to compare the wire serialization of the block as opposed to the hash, or not maintain the feature at all.
>
> The signature verification caching hash collision bug is the same problem, an optimization based on an invalid assumption. A full serialization comparison (true identity), or elimination of the feature resolves the bug.
>
> The block hash check collision bug is trivially resolved by checking at the fork point as opposed to the tip. This prevents arbitrary (and irrational) invalidity based on conflict with irrelevant blocks that may or may not exist above the fork point.
>
> Libbitcoin is deterministic in all three cases (although the third issue is not made consistent until v3). I am not aware of any other non-determinism in Core, but I don't spend a lot of time there.
> There is no need to study other implementations to ensure determinism, as that can be verified independently.
>
> Any situation in which a node cannot provide deterministic validation of unordered blocks constitutes a non-consensus bug, as the behavior is not consistently verifiable by others under any conditions. Fixing/preventing these bugs is responsible development behavior, and does not require forks or BIPs, since Bitcoin doesn't inherently contain any such bugs. They are the consequence of incorrect implementation, and in two of the three cases above have resulted from supposed optimizations. But any code that creates non-determinism in exchange for speed, etc. is not an optimization, it's a bug. A node must implement its optimizations in a manner that does not alter consensus.
>
> The BIP30 regression hard fork is not a case of non-determinism. This will produce deterministic results (apart from the impact of unrelated bugs). However the results are both a clear break from previous (and documented) consensus but also produce a very undesirable outcome - destruction of all unspent outputs in the "replaced" transaction for starters. So this is a distinct category, not a determinism bug but a hard fork that produces undesired consequences.
>
> The BIP30 regression hard fork actually enables the various pathological scenarios that you were describing, where no such issues existed in Bitcoin consensus previously. It is now possible to produce a block that mutates another arbitrarily deep block, and forces a reorg all the way back to the mutated block. This was done to save microseconds per block. Despite the improbability of hash collisions, I find this deplorable and the lack of public discussion on the decision concerning.
>
> With respect to the original post, the point at issue is the introduction of another hard fork, with some odd behaviors, but without any justification apart from tidying up the small amount of necessary code. These issues are related in that they are both consensus forks that have been introduced as supposed optimizations, with no public discussion prior to release (or at least merging to master with the presumption of shipping in the latter case). Two of the three hash collision issues above are also related in that they are bugs introduced by a desire to optimize internals.
>
> The engineering lesson here should be clear - watch out for developers bearing optimizations. A trade against correctness is not an optimization, it's a break. Satoshi was clearly a fan of the premature optimization. FindAndDelete is a howler. So this is a tradition in Bitcoin. My intent is not to sling mud but to improve the situation.
>
> It is very possible to produce straightforward and deterministic code that abides consensus and materially outperforms Core, without any of the above optimization breaks, even avoiding the utxo set optimization. Even the tx (memory) and block (orphan) pools are complex store denormalizations implemented as optimizations. Optimizing before producing a clean conceptual model architecture and design is a software development anti-pattern (premature optimization). The proposed fork is a premature optimization. There are much more significant opportunities to better organize code (and improve performance). I cannot support the decision to advance it.
>
> I was unaware Core had regressed BIP30. Given that the behavior is catastrophic and that it introduces the *only* hash-collision consensus misbehavior (unless we consider a deep reorg sans the otherwise necessary proof of work desirable behavior), I strongly recommend it be reverted, with a post-mortem BIP.
>
> Finally I recommend people contemplate the difference between unlikely and impossible. The chance of random collision is very small, but not zero. Colliding hashes is extremely difficult, but not impossible. But Bitcoin does not rely on impossibility for correct behavior. It relies on difficulty. This is a subtle but important distinction that people are missing.
>
> Difficulty is a knowable quantity - a function of computing power. If hash operations remain difficult, Bitcoin is undeterred. Collisions will have no impact, even if they happen with unexpected frequency (which would still be vanishingly infrequent). If the difficulty of producing a collision is reduced to the point where people cannot rely on addresses (for example), then Bitcoin has a problem, as it has become a leaky ship (and then there's mining). But with the unnecessary problems described above, a single hash collision can be catastrophic. Unlike difficulty, which is known, nobody can know when a single collision will show up. Betting Bitcoin, and potentially the world's money, on the unknowable is poor reasoning, especially given that the cost of not doing so is so very low.
>
> e
>
>> On Nov 17, 2016, at 10:08 AM, Johnson Lau wrote:
>>
>> The fact that some implementations ban an invalid block hash and some do not, suggests that it’s not a pure p2p protocol issue. A pure p2p split should be unified by a bridge node. However, a bridge node is not helpful in this case. Banning an invalid block hash is an implicit “first seen” consensus rule.
>>
>> jl2012
>>
>>> On 18 Nov 2016, at 01:49, Eric Voskuil wrote:
>>>
>>> Actually both possibilities were specifically covered in my description. Sorry if it wasn't clear.
>>>
>>> If you create a new valid block out of an old one it has the potential to cause a reorg.
>>> The blocks that previously built on the original are still able to do so but presumably cannot build forever on the *new* block as it has a different tx. But other new blocks can. There is no chain split due to a different interpretation of valid, there are simply two valid competing chains.
>>>
>>> Note that this scenario requires not only block and tx validity with a tx hash collision, but also that the tx be valid within the block. Pretty far to reach to not even get a chain split, but it could produce a deep reorg with a very low chance of success. As I keep telling people, deep reorgs can happen, they are just unlikely, as is this scenario.
>>>
>>> If you create a new invalid block it is discarded by everyone. That does not invalidate the hash of that block. Permanent blocking as you describe it would be a p2p protocol design choice, having nothing to do with consensus. Libbitcoin for example does not ban invalidated hashes at all. It just discards the block and drops the peer.
>>>
>>> e
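
The ban-list fix described in the quoted message (compare the wire serialization of the block rather than its hash) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Libbitcoin's or Core's code: `weak_hash`, `HashBanList`, `SerializationBanList`, and `find_collision` are hypothetical names, blocks are modelled as plain byte strings, and the hash is deliberately truncated to one byte so that a collision is easy to produce.

```python
import hashlib


def weak_hash(data: bytes) -> bytes:
    """Toy hash (assumption): first byte of SHA-256, so collisions are cheap to find."""
    return hashlib.sha256(data).digest()[:1]


class HashBanList:
    """Bans by hash. A different block with a colliding hash is treated as banned too."""

    def __init__(self) -> None:
        self._banned = set()

    def ban(self, block: bytes) -> None:
        self._banned.add(weak_hash(block))

    def is_banned(self, block: bytes) -> bool:
        return weak_hash(block) in self._banned


class SerializationBanList:
    """Bans by full serialization, so only the exact same bytes are treated as banned."""

    def __init__(self) -> None:
        self._banned = set()

    def ban(self, block: bytes) -> None:
        self._banned.add(bytes(block))

    def is_banned(self, block: bytes) -> bool:
        return bytes(block) in self._banned


def find_collision(original: bytes) -> bytes:
    """Search for a different byte string whose weak_hash equals that of `original`."""
    target = weak_hash(original)
    counter = 0
    while True:
        candidate = b"block-" + str(counter).encode()
        if candidate != original and weak_hash(candidate) == target:
            return candidate
        counter += 1


# An invalid block arrives first and is banned; a colliding, distinct block arrives later.
invalid_block = b"invalid-block"
colliding_block = find_collision(invalid_block)

hash_bans = HashBanList()
hash_bans.ban(invalid_block)

serialization_bans = SerializationBanList()
serialization_bans.ban(invalid_block)

# The hash-keyed list rejects the colliding block purely because of arrival order.
rejected_by_hash = hash_bans.is_banned(colliding_block)
# The serialization-keyed list rejects only the exact bytes that were banned.
rejected_by_serialization = serialization_bans.is_banned(colliding_block)
```

Keying the ban on the full serialization means the node must hold the block bytes before it can decide, which is the download cost raised at the top of this message.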