Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 0BC06902 for ; Tue, 17 May 2016 14:25:30 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-wm0-f49.google.com (mail-wm0-f49.google.com [74.125.82.49]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id BA60414E for ; Tue, 17 May 2016 14:25:27 +0000 (UTC) Received: by mail-wm0-f49.google.com with SMTP id a17so34189664wme.0 for ; Tue, 17 May 2016 07:25:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:subject:from:in-reply-to:date :content-transfer-encoding:message-id:references:to; bh=4RpX1dt/fipFEy5tRrodTP3GGA+YkQMrrA+ufcgAgjQ=; b=LoooPVBRn4X8dZM9MdSAWLkW1mKDxk1PNnHZP8zuMMpztEL/j4DcWWoR74Y9y5JfyC U6KA/J2lTUp5VQD1mpp2poYFbfPNGvTHNbXZYxWsfY5Uf2YOez6q7v9nhlj/oYLVTl1G NtFWZoBuoQoW8FDyWYIhZfRuvtDsDOWUrez/Vww3AbzITn9u60+oKaauNAnFlxiYdjH2 nZBhUNJOkNMVAc+LEQwjkkU36PMjg2l0uxqkQLVnJ8F8za6GppPVJYi20TdOOG7JDlmV +GfWoJXO0mD3WHPukCtK9WvrZg2BgK0CLHv6hBgW9Tpe3DrpKcfOt2O8diK1zrPaigaX HcMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date :content-transfer-encoding:message-id:references:to; bh=4RpX1dt/fipFEy5tRrodTP3GGA+YkQMrrA+ufcgAgjQ=; b=hSliwUREZDw1t6QtnnNCfmRl44AWmI7tYwQSESX0HxpZBhJiEgg8nYAhz19nT9nV13 QbJv7NMhTlNgrprBxW/B3SzBpV+8J6Sg/t6fcP1hsPN+pKjz4J1HUpfNn1E7QfuS4hQm 0Ar/97ll5NI9gQtidToqjCJo/sVZLgx50aOO+1eARObwAnYuHznJumocSo0hYhn+oX0w 7OWAEGc2OJz0CEXiiGZDNgze8sKSzrUmuI5Am9d/2wGHxPOkrYVuXR1uQackRpNDRIth PKxKjHKA7pfBW/h/3fi2PbEDZmo8UhqgLg0La8+Z1qyTs9WOXBf9RNIBXvcrITu1gM6a xybQ== X-Gm-Message-State: AOPr4FUrFQVbWd+vy+cH+4Ul2HqhTZ0RTUdqBFEMIM1TWDFugPBjsSf11LTI7ofvCwRN6A== X-Received: by 10.28.62.15 with SMTP id l15mr24222808wma.30.1463495125694; Tue, 17 May 2016 07:25:25 -0700 (PDT) Received: from [192.168.1.9] (109-92-230-17.dynamic.isp.telekom.rs. [109.92.230.17]) by smtp.gmail.com with ESMTPSA id b124sm25145642wmb.1.2016.05.17.07.25.24 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 17 May 2016 07:25:25 -0700 (PDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 9.2 \(3112\)) From: Eric Lombrozo In-Reply-To: <20160517132311.GA21656@fedora-21-dvm> Date: Tue, 17 May 2016 16:25:22 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <17436700-3F7F-406B-AA09-51C20FFD7675@gmail.com> References: <20160517132311.GA21656@fedora-21-dvm> To: Peter Todd , Bitcoin Protocol Discussion X-Mailer: Apple Mail (2.3112) X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: Re: [bitcoin-dev] Making UTXO Set Growth Irrelevant With Low-Latency Delayed TXO Commitments X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2016 14:25:30 -0000 Nice! We=E2=80=99ve been talking about doing this forever and it=E2=80=99s so = desperately needed. > On May 17, 2016, at 3:23 PM, Peter Todd via bitcoin-dev = wrote: >=20 > # Motivation >=20 > UTXO growth is a serious concern for Bitcoin's long-term = decentralization. To > run a competitive mining operation potentially the entire UTXO set = must be in > RAM to achieve competitive latency; your larger, more centralized, = competitors > will have the UTXO set in RAM. Mining is a zero-sum game, so the extra = latency > of not doing so if they do directly impacts your profit margin. = Secondly, > having possession of the UTXO set is one of the minimum requirements = to run a > full node; the larger the set the harder it is to run a full node. >=20 > Currently the maximum size of the UTXO set is unbounded as there is no > consensus rule that limits growth, other than the block-size limit = itself; as > of writing the UTXO set is 1.3GB in the on-disk, compressed = serialization, > which expands to significantly more in memory. UTXO growth is driven = by a > number of factors, including the fact that there is little incentive = to merge > inputs, lost coins, dust outputs that can't be economically spent, and > non-btc-value-transfer "blockchain" use-cases such as anti-replay = oracles and > timestamping. >=20 > We don't have good tools to combat UTXO growth. Segregated Witness = proposes to > give witness space a 75% discount, in part of make reducing the UTXO = set size > by spending txouts cheaper. While this may change wallets to more = often spend > dust, it's hard to imagine an incentive sufficiently strong to = discourage most, > let alone all, UTXO growing behavior. >=20 > For example, timestamping applications often create unspendable = outputs due to > ease of implementation, and because doing so is an easy way to make = sure that > the data required to reconstruct the timestamp proof won't get lost - = all > Bitcoin full nodes are forced to keep a copy of it. Similarly = anti-replay > use-cases like using the UTXO set for key rotation piggyback on the = uniquely > strong security and decentralization guarantee that Bitcoin provides; = it's very > difficult - perhaps impossible - to provide these applications with > alternatives that are equally secure. These non-btc-value-transfer = use-cases > can often afford to pay far higher fees per UTXO created than = competing > btc-value-transfer use-cases; many users could afford to spend $50 to = register > a new PGP key, yet would rather not spend $50 in fees to create a = standard two > output transaction. Effective techniques to resist miner censorship = exist, so > without resorting to whitelists blocking non-btc-value-transfer = use-cases as > "spam" is not a long-term, incentive compatible, solution. >=20 > A hard upper limit on UTXO set size could create a more level playing = field in > the form of fixed minimum requirements to run a performant Bitcoin = node, and > make the issue of UTXO "spam" less important. However, making any = coins > unspendable, regardless of age or value, is a politically untenable = economic > change. >=20 >=20 > # TXO Commitments >=20 > A merkle tree committing to the state of all transaction outputs, both = spent > and unspent, we can provide a method of compactly proving the current = state of > an output. This lets us "archive" less frequently accessed parts of = the UTXO > set, allowing full nodes to discard the associated data, still = providing a > mechanism to spend those archived outputs by proving to those nodes = that the > outputs are in fact unspent. >=20 > Specifically TXO commitments proposes a Merkle Mountain Range=C2=B9 = (MMR), a > type of deterministic, indexable, insertion ordered merkle tree, which = allows > new items to be cheaply appended to the tree with minimal storage = requirements, > just log2(n) "mountain tips". Once an output is added to the TXO MMR = it is > never removed; if an output is spent its status is updated in place. = Both the > state of a specific item in the MMR, as well the validity of changes = to items > in the MMR, can be proven with log2(n) sized proofs consisting of a = merkle path > to the tip of the tree. >=20 > At an extreme, with TXO commitments we could even have no UTXO set at = all, > entirely eliminating the UTXO growth problem. Transactions would = simply be > accompanied by TXO commitment proofs showing that the outputs they = wanted to > spend were still unspent; nodes could update the state of the TXO MMR = purely > from TXO commitment proofs. However, the log2(n) bandwidth overhead = per txin is > substantial, so a more realistic implementation is be to have a UTXO = cache for > recent transactions, with TXO commitments acting as a alternate for = the (rare) > event that an old txout needs to be spent. >=20 > Proofs can be generated and added to transactions without the = involvement of > the signers, even after the fact; there's no need for the proof itself = to > signed and the proof is not part of the transaction hash. Anyone with = access to > TXO MMR data can (re)generate missing proofs, so minimal, if any, = changes are > required to wallet software to make use of TXO commitments. >=20 >=20 > ## Delayed Commitments >=20 > TXO commitments aren't a new idea - the author proposed them years ago = in > response to UTXO commitments. However it's critical for small miners' = orphan > rates that block validation be fast, and so far it has proven = difficult to > create (U)TXO implementations with acceptable performance; updating = and > recalculating cryptographicly hashed merkelized datasets is inherently = more > work than not doing so. Fortunately if we maintain a UTXO set for = recent > outputs, TXO commitments are only needed when spending old, archived, = outputs. > We can take advantage of this by delaying the commitment, allowing it = to be > calculated well in advance of it actually being used, thus changing a > latency-critical task into a much easier average throughput problem. >=20 > Concretely each block B_i commits to the TXO set state as of block = B_{i-n}, in > other words what the TXO commitment would have been n blocks ago, if = not for > the n block delay. Since that commitment only depends on the contents = of the > blockchain up until block B_{i-n}, the contents of any block after are > irrelevant to the calculation. >=20 >=20 > ## Implementation >=20 > Our proposed high-performance/low-latency delayed commitment full-node > implementation needs to store the following data: >=20 > 1) UTXO set >=20 > Low-latency K:V map of txouts definitely known to be unspent. = Similar to > existing UTXO implementation, but with the key difference that old, > unspent, outputs may be pruned from the UTXO set. >=20 >=20 > 2) STXO set >=20 > Low-latency set of transaction outputs known to have been spent by > transactions after the most recent TXO commitment, but created = prior to the > TXO commitment. >=20 >=20 > 3) TXO journal >=20 > FIFO of outputs that need to be marked as spent in the TXO MMR. = Appends > must be low-latency; removals can be high-latency. >=20 >=20 > 4) TXO MMR list >=20 > Prunable, ordered list of TXO MMR's, mainly the highest pending = commitment, > backed by a reference counted, cryptographically hashed object = store > indexed by digest (similar to how git repos work). High-latency ok. = We'll > cover this in more in detail later. >=20 >=20 > ### Fast-Path: Verifying a Txout Spend In a Block >=20 > When a transaction output is spent by a transaction in a block we have = two > cases: >=20 > 1) Recently created output >=20 > Output created after the most recent TXO commitment, so it should = be in the > UTXO set; the transaction spending it does not need a TXO = commitment proof. > Remove the output from the UTXO set and append it to the TXO = journal. >=20 > 2) Archived output >=20 > Output created prior to the most recent TXO commitment, so there's = no > guarantee it's in the UTXO set; transaction will have a TXO = commitment > proof for the most recent TXO commitment showing that it was = unspent. > Check that the output isn't already in the STXO set (double-spent), = and if > not add it. Append the output and TXO commitment proof to the TXO = journal. >=20 > In both cases recording an output as spent requires no more than two = key:value > updates, and one journal append. The existing UTXO set requires one = key:value > update per spend, so we can expect new block validation latency to be = within 2x > of the status quo even in the worst case of 100% archived output = spends. >=20 >=20 > ### Slow-Path: Calculating Pending TXO Commitments >=20 > In a low-priority background task we flush the TXO journal, recording = the > outputs spent by each block in the TXO MMR, and hashing MMR data to = obtain the > TXO commitment digest. Additionally this background task removes = STXO's that > have been recorded in TXO commitments, and prunes TXO commitment data = no longer > needed. >=20 > Throughput for the TXO commitment calculation will be worse than the = existing > UTXO only scheme. This impacts bulk verification, e.g. initial block = download. > That said, TXO commitments provides other possible tradeoffs that can = mitigate > impact of slower validation throughput, such as skipping validation of = old > history, as well as fraud proof approaches. >=20 >=20 > ### TXO MMR Implementation Details >=20 > Each TXO MMR state is a modification of the previous one with most = information > shared, so we an space-efficiently store a large number of TXO = commitments > states, where each state is a small delta of the previous state, by = sharing > unchanged data between each state; cycles are impossible in merkelized = data > structures, so simple reference counting is sufficient for garbage = collection. > Data no longer needed can be pruned by dropping it from the database, = and > unpruned by adding it again. Since everything is committed to via = cryptographic > hash, we're guaranteed that regardless of where we get the data, after > unpruning we'll have the right data. >=20 > Let's look at how the TXO MMR works in detail. Consider the following = TXO MMR > with two txouts, which we'll call state #0: >=20 > 0 > / \ > a b >=20 > If we add another entry we get state #1: >=20 > 1 > / \ > 0 \ > / \ \ > a b c >=20 > Note how it 100% of the state #0 data was reused in commitment #1. = Let's > add two more entries to get state #2: >=20 > 2 > / \ > 2 \ > / \ \ > / \ \ > / \ \ > 0 2 \ > / \ / \ \ > a b c d e >=20 > This time part of state #1 wasn't reused - it's wasn't a perfect = binary > tree - but we've still got a lot of re-use. >=20 > Now suppose state #2 is committed into the blockchain by the most = recent block. > Future transactions attempting to spend outputs created as of state #2 = are > obliged to prove that they are unspent; essentially they're forced to = provide > part of the state #2 MMR data. This lets us prune that data, = discarding it, > leaving us with only the bare minimum data we need to append new = txouts to the > TXO MMR, the tips of the perfect binary trees ("mountains") within the = MMR: >=20 > 2 > / \ > 2 \ > \ > \ > \ > \ > \ > e >=20 > Note that we're glossing over some nuance here about exactly what data = needs to > be kept; depending on the details of the implementation the only data = we need > for nodes "2" and "e" may be their hash digest. >=20 > Adding another three more txouts results in state #3: >=20 > 3 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 2 3 > / \ > / \ > / \ > 3 3 > / \ / \ > e f g h >=20 > Suppose recently created txout f is spent. We have all the data = required to > update the MMR, giving us state #4. It modifies two inner nodes and = one leaf > node: >=20 > 4 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 2 4 > / \ > / \ > / \ > 4 3 > / \ / \ > e (f) g h >=20 > If an archived txout is spent requires the transaction to provide the = merkle > path to the most recently committed TXO, in our case state #2. If = txout b is > spent that means the transaction must provide the following data from = state #2: >=20 > 2 > / > 2 > / > / > / > 0 > \ > b >=20 > We can add that data to our local knowledge of the TXO MMR, unpruning = part of > it: >=20 > 4 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 2 4 > / / \ > / / \ > / / \ > 0 4 3 > \ / \ / \ > b e (f) g h >=20 > Remember, we haven't _modified_ state #4 yet; we just have more data = about it. > When we mark txout b as spent we get state #5: >=20 > 5 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 5 4 > / / \ > / / \ > / / \ > 5 4 3 > \ / \ / \ > (b) e (f) g h >=20 > Secondly by now state #3 has been committed into the chain, and = transactions > that want to spend txouts created as of state #3 must provide a TXO = proof > consisting of state #3 data. The leaf nodes for outputs g and h, and = the inner > node above them, are part of state #3, so we prune them: >=20 > 5 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 5 4 > / / > / / > / / > 5 4 > \ / \ > (b) e (f) >=20 > Finally, lets put this all together, by spending txouts a, c, and g, = and > creating three new txouts i, j, and k. State #3 was the most recently = committed > state, so the transactions spending a and g are providing merkle paths = up to > it. This includes part of the state #2 data: >=20 > 3 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 2 3 > / \ \ > / \ \ > / \ \ > 0 2 3 > / / / > a c g >=20 > After unpruning we have the following data for state #5: >=20 > 5 > / \ > / \ > / \ > / \ > / \ > / \ > / \ > 5 4 > / \ / \ > / \ / \ > / \ / \ > 5 2 4 3 > / \ / / \ / > a (b) c e (f) g >=20 > That's sufficient to mark the three outputs as spent and add the three = new > txouts, resulting in state #6: >=20 > 6 > / \ > / \ > / \ > / \ > / \ > 6 \ > / \ \ > / \ \ > / \ \ > / \ \ > / \ \ > / \ \ > / \ \ > 6 6 \ > / \ / \ \ > / \ / \ 6 > / \ / \ / \ > 6 6 4 6 6 \ > / \ / / \ / / \ \ > (a) (b) (c) e (f) (g) i j k >=20 > Again, state #4 related data can be pruned. In addition, depending on = how the > STXO set is implemented may also be able to prune data related to = spent txouts > after that state, including inner nodes where all txouts under them = have been > spent (more on pruning spent inner nodes later). >=20 >=20 > ### Consensus and Pruning >=20 > It's important to note that pruning behavior is consensus critical: a = full node > that is missing data due to pruning it too soon will fall out of = consensus, and > a miner that fails to include a merkle proof that is required by the = consensus > is creating an invalid block. At the same time many full nodes will = have > significantly more data on hand than the bare minimum so they can help = wallets > make transactions spending old coins; implementations should strongly = consider > separating the data that is, and isn't, strictly required for = consensus. >=20 > A reasonable approach for the low-level cryptography may be to = actually treat > the two cases differently, with the TXO commitments committing too = what data > does and does not need to be kept on hand by the UTXO expiration = rules. On the > other hand, leaving that uncommitted allows for certain types of = soft-forks > where the protocol is changed to require more data than it previously = did. >=20 >=20 > ### Consensus Critical Storage Overheads >=20 > Only the UTXO and STXO sets need to be kept on fast random access = storage. > Since STXO set entries can only be created by spending a UTXO - and = are smaller > than a UTXO entry - we can guarantee that the peak size of the UTXO = and STXO > sets combined will always be less than the peak size of the UTXO set = alone in > the existing UTXO-only scheme (though the combined size can be = temporarily > higher than what the UTXO set size alone would be when large numbers = of > archived txouts are spent). >=20 > TXO journal entries and unpruned entries in the TXO MMR have log2(n) = maximum > overhead per entry: a unique merkle path to a TXO commitment (by = "unique" we > mean that no other entry shares data with it). On a reasonably fast = system the > TXO journal will be flushed quickly, converting it into TXO MMR data; = the TXO > journal will never be more than a few blocks in size. >=20 > Transactions spending non-archived txouts are not required to provide = any TXO > commitment data; we must have that data on hand in the form of one TXO = MMR > entry per UTXO. Once spent however the TXO MMR leaf node associated = with that > non-archived txout can be immediately pruned - it's no longer in the = UTXO set > so any attempt to spend it will fail; the data is now immutable and = we'll never > need it again. Inner nodes in the TXO MMR can also be pruned if all = leafs under > them are fully spent; detecting this is easy the TXO MMR is a = merkle-sum tree, > with each inner node committing to the sum of the unspent txouts under = it. >=20 > When a archived txout is spent the transaction is required to provide = a merkle > path to the most recent TXO commitment. As shown above that path is = sufficient > information to unprune the necessary nodes in the TXO MMR and apply = the spend > immediately, reducing this case to the TXO journal size question = (non-consensus > critical overhead is a different question, which we'll address in the = next > section). >=20 > Taking all this into account the only significant storage overhead of = our TXO > commitments scheme when compared to the status quo is the log2(n) = merkle path > overhead; as long as less than 1/log2(n) of the UTXO set is active, > non-archived, UTXO's we've come out ahead, even in the unrealistic = case where > all storage available is equally fast. In the real world that isn't = yet the > case - even SSD's significantly slower than RAM. >=20 >=20 > ### Non-Consensus Critical Storage Overheads >=20 > Transactions spending archived txouts pose two challenges: >=20 > 1) Obtaining up-to-date TXO commitment proofs >=20 > 2) Updating those proofs as blocks are mined >=20 > The first challenge can be handled by specialized archival nodes, not = unlike > how some nodes make transaction data available to wallets via bloom = filters or > the Electrum protocol. There's a whole variety of options available, = and the > the data can be easily sharded to scale horizontally; the data is > self-validating allowing horizontal scaling without trust. >=20 > While miners and relay nodes don't need to be concerned about the = initial > commitment proof, updating that proof is another matter. If a node = aggressively > prunes old versions of the TXO MMR as it calculates pending TXO = commitments, it > won't have the data available to update the TXO commitment proof to be = against > the next block, when that block is found; the child nodes of the TXO = MMR tip > are guaranteed to have changed, yet aggressive pruning would have = discarded that > data. >=20 > Relay nodes could ignore this problem if they simply accept the fact = that > they'll only be able to fully relay the transaction once, when it is = initially > broadcast, and won't be able to provide mempool functionality after = the initial > relay. Modulo high-latency mixnets, this is probably acceptable; the = author has > previously argued that relay nodes don't need a mempool=C2=B2 at all. >=20 > For a miner though not having the data necessary to update the proofs = as blocks > are found means potentially losing out on transactions fees. So how = much extra > data is necessary to make this a non-issue? >=20 > Since the TXO MMR is insertion ordered, spending a non-archived txout = can only > invalidate the upper nodes in of the archived txout's TXO MMR proof = (if this > isn't clear, imagine a two-level scheme, with a per-block TXO MMRs, = committed > by a master MMR for all blocks). The maximum number of relevant inner = nodes > changed is log2(n) per block, so if there are n non-archival blocks = between the > most recent TXO commitment and the pending TXO MMR tip, we have to = store > log2(n)*n inner nodes - on the order of a few dozen MB even when n is = a > (seemingly ridiculously high) year worth of blocks. >=20 > Archived txout spends on the other hand can invalidate TXO MMR proofs = at any > level - consider the case of two adjacent txouts being spent. To = guarantee > success requires storing full proofs. However, they're limited by the = blocksize > limit, and additionally are expected to be relatively uncommon. For = example, if > 1% of 1MB blocks was archival spends, our hypothetical year long TXO = commitment > delay is only a few hundred MB of data with low-IO-performance = requirements. >=20 >=20 > ## Security Model >=20 > Of course, a TXO commitment delay of a year sounds ridiculous. Even = the slowest > imaginable computer isn't going to need more than a few blocks of TXO > commitment delay to keep up ~100% of the time, and there's no reason = why we > can't have the UTXO archive delay be significantly longer than the TXO > commitment delay. >=20 > However, as with UTXO commitments, TXO commitments raise issues with = Bitcoin's > security model by allowing relatively miners to profitably mine = transactions > without bothering to validate prior history. At the extreme, if there = was no > commitment delay at all at the cost of a bit of some extra network = bandwidth > "full" nodes could operate and even mine blocks completely statelessly = by > expecting all transactions to include "proof" that their inputs are = unspent; a > TXO commitment proof for a commitment you haven't verified isn't a = proof that a > transaction output is unspent, it's a proof that some miners claimed = the txout > was unspent. >=20 > At one extreme, we could simply implement TXO commitments in a = "virtual" > fashion, without miners actually including the TXO commitment digest = in their > blocks at all. Full nodes would be forced to compute the commitment = from > scratch, in the same way they are forced to compute the UTXO state, or = total > work. Of course a full node operator who doesn't want to verify old = history can > get a copy of the TXO state from a trusted source - no different from = how you > could get a copy of the UTXO set from a trusted source. >=20 > A more pragmatic approach is to accept that people will do that = anyway, and > instead assume that sufficiently old blocks are valid. But how old is > "sufficiently old"? First of all, if your full node implementation = comes "from > the factory" with a reasonably up-to-date minimum accepted total-work > threshold=E2=81=B1 - in other words it won't accept a chain with less = than that amount > of total work - it may be reasonable to assume any Sybil attacker with > sufficient hashing power to make a forked chain meeting that threshold = with, > say, six months worth of blocks has enough hashing power to threaten = the main > chain as well. >=20 > That leaves public attempts to falsify TXO commitments, done out in = the open by > the majority of hashing power. In this circumstance the "assumed = valid" > threshold determines how long the attack would have to go on before = full nodes > start accepting the invalid chain, or at least, newly = installed/recently reset > full nodes. The minimum age that we can "assume valid" is tradeoff = between > political/social/technical concerns; we probably want at least a few = weeks to > guarantee the defenders a chance to organise themselves. >=20 > With this in mind, a longer-than-technically-necessary TXO commitment = delay=CA=B2 > may help ensure that full node software actually validates some = minimum number > of blocks out-of-the-box, without taking shortcuts. However this can = be > achieved in a wide variety of ways, such as the author's = prev-block-proof > proposal=C2=B3, fraud proofs, or even a PoW with an inner loop = dependent on > blockchain data. Like UTXO commitments, TXO commitments are also = potentially > very useful in reducing the need for SPV wallet software to trust = third parties > providing them with transaction data. >=20 > i) Checkpoints that reject any chain without a specific block are a = more > common, if uglier, way of achieving this protection. >=20 > j) A good homework problem is to figure out how the TXO commitment = could be > designed such that the delay could be reduced in a soft-fork. >=20 >=20 > ## Further Work >=20 > While we've shown that TXO commitments certainly could be implemented = without > increasing peak IO bandwidth/block validation latency significantly = with the > delayed commitment approach, we're far from being certain that they = should be > implemented this way (or at all). >=20 > 1) Can a TXO commitment scheme be optimized sufficiently to be used = directly > without a commitment delay? Obviously it'd be preferable to avoid all = the above > complexity entirely. >=20 > 2) Is it possible to use a metric other than age, e.g. priority? While = this > complicates the pruning logic, it could use the UTXO set space more > efficiently, especially if your goal is to prioritise bitcoin = value-transfer > over other uses (though if "normal" wallets nearly never need to use = TXO > commitments proofs to spend outputs, the infrastructure to actually do = this may > rot). >=20 > 3) Should UTXO archiving be based on a fixed size UTXO set, rather = than an > age/priority/etc. threshold? >=20 > 4) By fixing the problem (or possibly just "fixing" the problem) are = we > encouraging/legitimising blockchain use-cases other than BTC value = transfer? > Should we? >=20 > 5) Instead of TXO commitment proofs counting towards the blocksize = limit, can > we use a different miner fairness/decentralization metric/incentive? = For > instance it might be reasonable for the TXO commitment proof size to = be > discounted, or ignored entirely, if a proof-of-propagation scheme = (e.g. > thinblocks) is used to ensure all miners have received the proof in = advance. >=20 > 6) How does this interact with fraud proofs? Obviously furthering = dependency on > non-cryptographically-committed STXO/UTXO databases is incompatible = with the > modularized validation approach to implementing fraud proofs. >=20 >=20 > # References >=20 > 1) "Merkle Mountain Ranges", > Peter Todd, OpenTimestamps, Mar 18 2013, > = https://github.com/opentimestamps/opentimestamps-server/blob/master/doc/me= rkle-mountain-range.md >=20 > 2) "Do we really need a mempool? (for relay nodes)", > Peter Todd, bitcoin-dev mailing list, Jul 18th 2015, > = https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-July/009479.h= tml >=20 > 3) "Segregated witnesses and validationless mining", > Peter Todd, bitcoin-dev mailing list, Dec 23rd 2015, > = https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-December/0121= 03.html >=20 > --=20 > https://petertodd.org 'peter'[:-1]@petertodd.org > _______________________________________________ > bitcoin-dev mailing list > bitcoin-dev@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev