diff options
author | Jim Posen <jim.posen@gmail.com> | 2018-03-29 17:50:30 -0700 |
---|---|---|
committer | bitcoindev <bitcoindev@gnusha.org> | 2018-03-30 00:50:35 +0000 |
commit | 756f9be4c5e5bf60a194f85e4012aa5fec090faf (patch) | |
tree | 0d4c2990ee0f76b52ecd2d989c03038f948601ba | |
parent | c951a57b4e4de9908fb845f63f4d7bcaa2822a2d (diff) | |
download | pi-bitcoindev-756f9be4c5e5bf60a194f85e4012aa5fec090faf.tar.gz pi-bitcoindev-756f9be4c5e5bf60a194f85e4012aa5fec090faf.zip |
Re: [bitcoin-dev] Optimized Header Sync
-rw-r--r-- | 4c/2a8a23bd28231456cff0adfa8953c89470ee19 | 1388 |
1 files changed, 1388 insertions, 0 deletions
diff --git a/4c/2a8a23bd28231456cff0adfa8953c89470ee19 b/4c/2a8a23bd28231456cff0adfa8953c89470ee19 new file mode 100644 index 000000000..a92705c5b --- /dev/null +++ b/4c/2a8a23bd28231456cff0adfa8953c89470ee19 @@ -0,0 +1,1388 @@ +Return-Path: <jim.posen@gmail.com> +Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org + [172.17.192.35]) + by mail.linuxfoundation.org (Postfix) with ESMTPS id B6B503EE + for <bitcoin-dev@lists.linuxfoundation.org>; + Fri, 30 Mar 2018 00:50:35 +0000 (UTC) +X-Greylist: whitelisted by SQLgrey-1.7.6 +Received: from mail-io0-f170.google.com (mail-io0-f170.google.com + [209.85.223.170]) + by smtp1.linuxfoundation.org (Postfix) with ESMTPS id C715B5F5 + for <bitcoin-dev@lists.linuxfoundation.org>; + Fri, 30 Mar 2018 00:50:32 +0000 (UTC) +Received: by mail-io0-f170.google.com with SMTP id m83so9615955ioi.8 + for <bitcoin-dev@lists.linuxfoundation.org>; + Thu, 29 Mar 2018 17:50:32 -0700 (PDT) +DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; + h=mime-version:in-reply-to:references:from:date:message-id:subject:to + :cc; bh=Mk4+t64L7IAkJbALVJhVCJAE3rNiFdP2LrSXTVdzm9o=; + b=XZkVE0eUh3PSL/OV5K2RZ7OCn7/12md9lcAzwdwG1sEUQ0aRgPnxPqiyONoZNDz3Ep + i3+buqJgUYsXKzFpPKpgRgfBHGuDPhdg9oc+Am7VCs82tq0VDeCCEZHcezYyB9WymHj8 + MjwtVyK8WI3G0hQ6pC5OKKTcL3YIFYeB1c8Yd4v+cFKZlrbgXwozijSCwFFaMs67xj3+ + 2lFErZvFUTF7hhOWlWPvlqxBUQZS7/7B1zbqaVyuwpJC+HGIDd3LLE51443sF4thljy1 + mZgI3xOCyDlsjbXCTAhEhOpk8UvruoD6kCvSQ/FaMgl//RuuLKuWHBzD9anF3qfK8LpK + yE9Q== +X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; + d=1e100.net; s=20161025; + h=x-gm-message-state:mime-version:in-reply-to:references:from:date + :message-id:subject:to:cc; + bh=Mk4+t64L7IAkJbALVJhVCJAE3rNiFdP2LrSXTVdzm9o=; + b=QsNcZ28RBh6IUQzOd4FD0w2VK2b5Gy65LeDcXYinIgt6Y2mf2YXSt32UeGXOUdlPxe + wUjm8fEws1GnshKYhB1ct1IOieALewTPEp8WoGzYhFwooKm+vtAFULNkX6EsVMcV7QMT + upKO5IBI1fo7dsT6eRq/cGDWtDOnesyXGgyihHUaYf63baagzJs6n2gl+U9gTf/x8ncb + mz8JkJ8nZKyPmZNpCzQsiTXSBEmd6No0HDd6By8vH10ZXrZdF9mfx6w2Revt8IIhxYcJ + 0OisCyLe40tPBFv0kj3p4VNioTquJERnfLah3Yo39C2IrOqRYTff/lNyPjRIqqfo7D3U + NPPw== +X-Gm-Message-State: AElRT7H0w4/gSGeb0L94oxhJ+5TR7QCjFZi1e0x/W8mSkjADJvSXfMCM + NMEQU2G8MLfY7P1lsUgHmyMAD5wcDrFQy34KmzQD7db+ +X-Google-Smtp-Source: AIpwx4+ToPnEtx6Rvc4sskxxADrJn4Mgd+Lod/ngX7mqaYug8fXWXradlOB5eGhwN1aSQOwqFdk412a1I8YHTFVXICI= +X-Received: by 10.107.186.139 with SMTP id k133mr23074985iof.95.1522371031652; + Thu, 29 Mar 2018 17:50:31 -0700 (PDT) +MIME-Version: 1.0 +Received: by 10.107.52.80 with HTTP; Thu, 29 Mar 2018 17:50:30 -0700 (PDT) +In-Reply-To: <CADabwBAjTRdVqsL+V=YdQ+kr8+LtSPOeSXUJOzKoPNdKEbAOWQ@mail.gmail.com> +References: <CADZtCSg7+x-sg-ysgacXobRexOVwT+k9fr6a9S-6xU2w8f8m3A@mail.gmail.com> + <CADabwBAjTRdVqsL+V=YdQ+kr8+LtSPOeSXUJOzKoPNdKEbAOWQ@mail.gmail.com> +From: Jim Posen <jim.posen@gmail.com> +Date: Thu, 29 Mar 2018 17:50:30 -0700 +Message-ID: <CADZtCSjmQfBZoaO=MCyRoEn-AYe4A=1kDhxSVxVMw+O4k7YJfQ@mail.gmail.com> +To: Riccardo Casatta <riccardo.casatta@gmail.com> +Content-Type: multipart/alternative; boundary="94eb2c07077a8edf03056896a015" +X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, + DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, + HTML_OBFUSCATE_05_10, + RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 +X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on + smtp1.linux-foundation.org +X-Mailman-Approved-At: Fri, 30 Mar 2018 00:51:46 +0000 +Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org> +Subject: Re: [bitcoin-dev] Optimized Header Sync +X-BeenThere: bitcoin-dev@lists.linuxfoundation.org +X-Mailman-Version: 2.1.12 +Precedence: list +List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org> +List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, + <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe> +List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/> +List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org> +List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help> +List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, + <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe> +X-List-Received-Date: Fri, 30 Mar 2018 00:50:35 -0000 + +--94eb2c07077a8edf03056896a015 +Content-Type: text/plain; charset="UTF-8" +Content-Transfer-Encoding: quoted-printable + +Thanks for giving it a read and for sparking the discussion with your +observation about the 40% savings from dropping prev_hash! + + +> Thought this wasn't effective in case overt asic boost get widely adopted= +, +> but then I understood that at the moment only two bits of version get +> scrambled by that technique so this looks fine, maybe add a comment about +> this so the reader doesn't get the same initial doubt I got. +> + +I still need to compute for historical blocks how many could have an +omitted version. Will post back with that when I get results. If overt ASIC +Boost made this less effective, that would be unfortunate, but so be it. + + +> My feeling is that encoding of the headers and checkpoints/parallel +> download are separate subjects for two BIPS. +> About the checkpoints I don't grasp why they are useful since an attacker +> could lie about them but maybe I am missing something... +> + +Yeah, I guess the background wasn't explained in the BIP itself. After your +original post on the mailing list, there were suggestions that instead of +modifying the format of existing messages, it would be better do create a +new headers message. And as long as we're designing a new headers message, +we should change the semantics to allow parallel download. But if you want +to download from peers in parallel, you need to get a summary of the blocks +that they have. Hence the checkpoints message. So that is why both of these +messages are in the same BIP -- only together can they perform an efficient +sync. + +Regarding the reliability of the checkpoints, I think it's strictly better +than what we have now. Let's say a node is connected to 6 honest peers and +2 malicious peers. Even if the node does not know which ones are good or +bad until it validates the headers, it sees that 6 of the peers are on the +same chain, and can download those headers in parallel from 6 different +sources. So that's already a win. + +Taken a step further though, I'm really interested in treating the +checkpoints as commitments to chain work and using random sampling to +detect lying peers before downloading all of their headers. So imagine you +are connected to two peers, one good one bad, where the good one claims a +chain with X total work and the bad one claims a chain with Y total work. +To determine quickly which is correct, you can randomly sample ranges of +headers and check the proofs of work to see whether it matches what the +peer claimed. So basically you pick a checkpoint at random (weighted by the +work delta) which commits to a total amount of work from the last +checkpoint, then request all headers in between. If the peer responds with +headers with the correct start hash, end hash, and start height (from the +coinbase tx of the first header), then you can be somewhat more confident +their total PoW matches the claimed amount. + +How many times do you need to sample? I don't know yet, but I've heard +Benedikt Bunz is exploring this question with his research on FlyClients +[1], which was an inspiration for this. + + +> Bitfield allows great savings, however the encoding depends on the header= +s +> height a client ask for, this cause a little computational burden on the +> node and the undesirable side effect of difficult caching. Variable lengt= +h +> encoding cause caching difficulties too... +> A simpler approach could be to encode the headers in groups of 2016 +> headers (the difficulty period) where the first header is complete and th= +e +> others 2015 are missing the previous hash and the difficulty, this achiev= +e +> comparable savings ~45%, allows better caching and has fixed length +> encoding. This could be useful for the node by caching headers on a singl= +e +> file on disk and simply stream out the relative range when requested or t= +o +> serve the same encoded headers format in other context like http, +> leveraging http caching infrastructure. +> + +I don't see too much of a problem with caching. Most node implementations I +know of keep all headers in memory anyway, often in contiguous segments of +RAM for historical headers, so it should be fairly inexpensive to serve +queries. Beyond that, the response for a particular query (start_height, +end_hash, encoding version) can be cached, so if some service wants to +precompute max size responses for all start_height multiples of 1,000, they +could cache those. + +-Jim + +[1] https://www.youtube.com/watch?time_continue=3D8400&v=3DBPNs9EVxWrA + + +> 2018-03-28 1:31 GMT+02:00 Jim Posen via bitcoin-dev <bitcoin-dev@lists. +> linuxfoundation.org>: +> +>> Based on some ideas that were thrown around in this thread ( +>> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/ +>> 2017-December/015385.html), I have been working on a P2P extension that +>> will allow faster header sync mechanisms. The one-sentence summary is th= +at +>> by encoding headers more efficiently (eg. omitting prev_hash) and +>> downloading evenly spaced checkpoints throughout history (say every +>> 1,000th) from all peers first, we could speed up header sync, which woul= +d +>> be a huge improvement for light clients. Here is a draft of the BIP: +>> https://github.com/jimpo/bips/blob/headers-sync/headersv2.mediawiki. The +>> full text is below as well. +>> +>> I'd love to hear any feedback people have. +>> +>> ---------------------------------------------------------- +>> +>> =3D=3D Abstract =3D=3D +>> +>> This BIP describes a P2P network extension enabling faster, more reliabl= +e methods for syncing the block header chain. New P2P messages are proposed= + as more efficient replacements for <code>getheaders</code> and <code>heade= +rs</code> during initial block download. The proposed header download proto= +col reduces bandwidth usage by ~40%-50% and supports downloading headers ra= +nges from multiple peers in parallel, which is not possible with the curren= +t mechanism. This also enables sync strategies with better resistance to de= +nial-of-service attacks. +>> +>> =3D=3D Motivation =3D=3D +>> +>> Since 2015, optimized Bitcoin clients fetch all block headers before blo= +cks themselves in order to avoid downloading ones that are not part of the = +most work chain. The protocol currently in use for fetching headers leaves = +room for further optimization, specifically by compressing header data and = +downloading more headers simulaneously<ref>https://lists.linuxfoundation.or= +g/pipermail/bitcoin-dev/2017-December/015385.html</ref>. Any savings here s= +hould have a large impact given that both full nodes and light clients must= + sync the header chain as a first step, and that the time to validate and i= +ndex the headers is negligible compared to the time spent downloading them = +from the network. Furthermore, some current implementations of headers sync= +ing rely on preconfigured checkpoints to discourage attackers attempting to= + fill up a victim's disk space with low-work headers. The proposed messages= + enable sync strategies that are resilient against these types of attacks. = +The P2P messages are designed to be flexible, supporting multiple header sy= +nc strategies and leaving room for future innovations, while also compact. +>> +>> =3D=3D Definitions =3D=3D +>> +>> ''double-SHA256'' is a hash algorithm defined by two invocations of SHA-= +256: <code>double-SHA256(x) =3D SHA256(SHA256(x))</code>. +>> +>> =3D=3D Specification =3D=3D +>> +>> The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHO= +ULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document a= +re to be interpreted as described in RFC 2119. +>> +>> =3D=3D=3D New Structures =3D=3D=3D +>> +>> =3D=3D=3D=3D Compressed Headers =3D=3D=3D=3D +>> +>> Bitcoin headers are serialized by default in 80 bytes as follows: +>> +>> {| class=3D"wikitable" +>> ! Field Name +>> ! Data Type +>> ! Byte Size +>> ! Description +>> |- +>> | version +>> | int32_t +>> | 4 +>> | Block version information +>> |- +>> | prev_block +>> | uint256 +>> | 32 +>> | The hash of the previous block +>> |- +>> | merkle_root +>> | uint256 +>> | 32 +>> | The root hash of the transaction Merkle tree +>> |- +>> | timestamp +>> | uint32_t +>> | 4 +>> | A Unix timestamp of the block creation time, as reported by the miner +>> |- +>> | bits +>> | uint32_t +>> | 4 +>> | The calculated difficulty target for this block +>> |- +>> | nonce +>> | uint32_t +>> | 4 +>> | A nonce that is set such that the header's hash matches the difficulty= + target +>> |} +>> +>> When deserializing a correctly-formed sequence of block headers encoded = +in this way, it can be noted that: +>> +>> * The prev_block field should always match the double-SHA256 hash of the= + previous header, making it redundant +>> * According to Bitcoin consensus rules, the bits field only changes ever= +y 2016 blocks +>> * The version often matches that of a recent ancestor block +>> * The timestamp is often a small delta from the preceding header's times= +tamp +>> +>> To take advantage of these possible savings, this document defines a var= +iable-sized ''compressed encoding'' of block headers that occur in a range.= + Note that no savings are possible when serializing a single header; it sho= +uld only be used for vectors of sequential headers. The full headers are re= +constructed using data from previous headers in the range. The serializatio= +n begins with an ''encoding indicator'', which is a bitfield specifying how= + each field is serialized. The bits of the indicator have the following sem= +antics: +>> +>> {| class=3D"wikitable" +>> ! Bit Index +>> ! Reconstruction +>> ! Description +>> |- +>> | 0 +>> | <code>prev_block[i] =3D DSHA256(header[i-1])</code> +>> | The prev_block field is ommitted and assigned to the double-SHA256 has= +h of the previous uncompressed header. +>> |- +>> | 1 +>> | <code>nbits[i] =3D nbits[i-1]</code> +>> | The nbits field is omitted and matches that of the previous header. +>> |- +>> | 2 +>> | <code>timestamp[i] =3D timestamp[i-1] + value</code> +>> | The timestamp is replaced by a 2-byte signed short int, representing a= +n offset from the previous block's timestamp +>> |- +>> | 3 +>> | +>> | Interpreted along with bits 4 & 5. +>> |- +>> | 4 +>> | +>> | Interpreted along with bits 3 & 5. +>> |- +>> | 5 +>> | <code>version[i] =3D version[i - ((bit[3] << 2) + (bit[4] << 1) + bit[= +5])]</code> +>> | Bits 3, 4, and 5 are first interpreted as a 3-bit offset, with bit ind= +ex 3 as the most significant and bit index 5 as the least significant. If t= +he offset is non-zero, the version field is omitted and assigned to the ver= +sion of the block at the offset number of blocks prior. +>> |- +>> | 6 +>> | +>> | Reserved. +>> |- +>> | 7 +>> | +>> | Reserved. May be used in a future encoding version to signal another i= +ndicator byte. +>> |} +>> +>> The compressed header format is versioned by a 256-bit unsigned integer.= + This document defines version 0. +>> +>> =3D=3D=3D=3D VarInt =3D=3D=3D=3D +>> +>> ''VarInt'' is a variable-length unsigned integer encoding that supports = +a greater range of numbers than the standard ''CompactSize''. This encoding= + was introduced at the database layer in Bitcoin Core<ref>https://github.co= +m/bitcoin/bitcoin/commit/4d6144f97faf9d2a6c89f41d7d2360f21f0b71e2</ref> in = +2012, but is new to the Bitcoin P2P layer. +>> +>> This definition is per the code comments in Bitcoin Core written by Piet= +er Wuille: +>> +>> <pre> +>> Variable-length integers: bytes are a MSB base-128 encoding of the numbe= +r. +>> The high bit in each byte signifies whether another digit follows. To ma= +ke +>> the encoding is one-to-one, one is subtracted from all but the last digi= +t. +>> Thus, the byte sequence a[] with length len, where all but the last byte +>> has bit 128 set, encodes the number: +>> +>> (a[len-1] & 0x7F) + sum(i=3D1..len-1, 128^i*((a[len-i-1] & 0x7F)+1)) +>> +>> Properties: +>> * Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes) +>> * Every integer has exactly one encoding +>> * Encoding does not depend on size of original integer type +>> * No redundancy: every (infinite) byte sequence corresponds to a list +>> of encoded integers. +>> +>> 0: [0x00] 256: [0x81 0x00] +>> 1: [0x01] 16383: [0xFE 0x7F] +>> 127: [0x7F] 16384: [0xFF 0x00] +>> 128: [0x80 0x00] 16511: [0x80 0xFF 0x7F] +>> 255: [0x80 0x7F] 65535: [0x82 0xFD 0x7F] +>> 2^32: [0x8E 0xFE 0xFE 0xFF 0x00] +>> </pre> +>> +>> =3D=3D=3D=3D Checkpoints =3D=3D=3D=3D +>> +>> A ''checkpoint'' is defined for a block as a tuple of its hash and the c= +hain work: +>> +>> {| class=3D"wikitable" +>> ! Field Name +>> ! Data Type +>> ! Byte Size +>> ! Description +>> |- +>> | block_hash +>> | uint256 +>> | 32 +>> | The hash of the block +>> |- +>> | chain_work +>> | VarInt +>> | Variable(1-20) +>> | A delta between the total work in the chain at the checkpoint block an= +d a previous checkpoint, determined by context +>> |} +>> +>> =3D=3D=3D Service Bit =3D=3D=3D +>> +>> This BIP allocates a new service bit: +>> +>> {| class=3D"wikitable" +>> |- +>> | NODE_HEADERS_V2 +>> | <code>1 << ?</code> +>> | If enabled, the node MUST respond to <code>getcheckpts</code> and <cod= +e>getheaders2</code> queries +>> |} +>> +>> =3D=3D=3D New Messages =3D=3D=3D +>> +>> =3D=3D=3D=3D getcheckpts =3D=3D=3D=3D +>> <code>getcheckpts</code> is used to request block headers at a specified= + distance from each other which serve as checkpoints during parallel header= + download. The message contains the following fields: +>> +>> {| class=3D"wikitable" +>> ! Field Name +>> ! Data Type +>> ! Byte Size +>> ! Description +>> |- +>> | block_locator +>> | uint256[] +>> | Variable +>> | A vector of block hashes in descending order by height used to identif= +y the header chain of the requesting node +>> |- +>> | interval +>> | uint32_t +>> | 4 +>> | The distance in block height between requested block hashes +>> |} +>> +>> # Nodes SHOULD NOT send <code>getcheckpts</code> unless the peer has set= + the <code>NODE_HEADERS_V2</code> service bit +>> # The hashes in <code>block_locator</code> MUST be in descending order b= +y block height +>> # The block locator SHOULD be generated as it is in <code>getheaders</co= +de> requests +>> # The receiving node MUST respond to valid requests with a <code>checkpt= +s</code> response where the interval is the same as in the request and the = +first checkpoint hash matches the first common block hash in the block loca= +tor +>> +>> =3D=3D=3D=3D checkpts =3D=3D=3D=3D +>> <code>checkpts</code> is sent in response to <code>getcheckpts</code>, l= +isting block hashes at the specified interval. The message contains the fol= +lowing fields: +>> +>> {| class=3D"wikitable" +>> ! Field Name +>> ! Data Type +>> ! Byte Size +>> ! Description +>> |- +>> | start_height +>> | uint32_t +>> | 4 +>> | The height of the first block in the active chain matching the request= +'s block locator +>> |- +>> | end_height +>> | uint32_t +>> | 4 +>> | The height of the last block in the active chain +>> |- +>> | start_checkpoint +>> | Checkpoint +>> | 48 +>> | The checkpoint structure for the block in the active chain at height s= +tart_height +>> |- +>> | end_checkpoint +>> | Checkpoint +>> | 48 +>> | The checkpoint structure for the block in the active chain at height e= +nd_height +>> |- +>> | interval +>> | uint32_t +>> | 4 +>> | The distance in block height between checkpoints +>> |- +>> | checkpoints_length +>> | CompactSize +>> | Variable(1-5) +>> | The number of checkpoints to follow +>> |- +>> | checkpoints +>> | Checkpoint[] +>> | checkoints_length * Variable(33-52) +>> | The checkpoints as specified below +>> |} +>> +>> # The interval SHOULD match the field in the <code>getcheckpts</code> re= +quest +>> # The start_checkpoint SHOULD correspond to the first block hash in the = +locator from the <code>getcheckpts</code> request that is part of the activ= +e chain +>> # The end_checkpoint SHOULD correspond to the tip of the node's active c= +hain +>> # The start_height MOST be set to the block height of the start_checkpoi= +nt +>> # The end_height MOST be set to the block height of the end_checkpoint +>> # If the interval is zero, the checkpoints vector MUST be empty +>> # If the interval is non-zero, checkpoints MUST correspond to blocks on = +the active chain between the start_checkpoint and the end_checkpoint (exclu= +sive), where the difference in block height between each entry and the prev= +ious one is equal to the interval +>> # The checkpoints_length MUST be less than or equal to 2,000 +>> # The node SHOULD include as many checkpoints on its active chain as are= + available, up to the limit of 2,000 +>> # The chain_work field in the first checkpoint MUST be the total work in= + the chain ending at that block +>> # The chain_work field in each subsequent checkpoint MUST be the differe= +nce in chain work between that block and the previous checkpoint +>> # The chain_work field in each checkpoint MUST be a properly-encoded Var= +Int, not exceeding 20 bytes +>> +>> =3D=3D=3D=3D getheaders2 =3D=3D=3D=3D +>> <code>getheaders2</code> is used to request compressed headers for a ran= +ge of blocks. The message contains the following fields: +>> +>> {| class=3D"wikitable" +>> ! Field Name +>> ! Data Type +>> ! Byte Size +>> ! Description +>> |- +>> | max_version +>> | uint8_t +>> | 1 +>> | The maximum supported encoding version of the headers +>> |- +>> | flags +>> | uint8_t +>> | 1 +>> | A bitfield of message encoding flags +>> |- +>> | start_height +>> | uint32_t +>> | 4 +>> | The height of the first block header in the requested range +>> |- +>> | end_hash +>> | uint256 +>> | 32 +>> | The hash of the last block header in the requested range +>> |} +>> +>> # Nodes SHOULD NOT send <code>getheaders2</code> unless the peer has set= + the <code>NODE_HEADERS_V2</code> service bit +>> # The height of the block with hash end_hash MUST be greater than or equ= +al to start_height, and the difference MUST be strictly less than 3,000 +>> # The end_hash SHOULD match one in a previously received <code>checkpts<= +/code> message, otherwise the receiving node MAY disconnect +>> # The 0th bit (least significant order) of the flags field MAY be set to= + request the coinbase transaction and merkle branch for the block at height= + start_height +>> +>> =3D=3D=3D=3D headers2 =3D=3D=3D=3D +>> <code>headers2</code> is sent in response to <code>getheaders2</code>, l= +isting the compressed headers in the requested range. The message contains = +the following fields: +>> +>> {| class=3D"wikitable" +>> ! Field Name +>> ! Data Type +>> ! Byte Size +>> ! Description +>> |- +>> | version +>> | uint8_t +>> | 1 +>> | The encoding version of the headers +>> |- +>> | flags +>> | uint8_t +>> | 1 +>> | A bitfield of message encoding flags +>> |- +>> | start_height +>> | uint32_t +>> | 4 +>> | The height of the first block header returned +>> |- +>> | headers_length +>> | CompactSize +>> | 1-3 +>> | The number of block headers to follow +>> |- +>> | headers +>> | CompressedHeader[] +>> | Variable +>> | The compressed block headers +>> |- +>> | start_block_coinbase_tx +>> | CTransaction +>> | Variable +>> | The coinbase transaction in the block at start_height +>> |- +>> | start_block_coinbase_branch +>> | uint256[] +>> | Variable +>> | A merkle branch linking the coinbase transaction in the block at start= +_height to its header +>> |} +>> +>> # The version MUST be less than or equal to the max_version field of the= + <code>getheaders2</code> request +>> # Any bits set in the flags field of the <code>getheaders2</code> reques= +t MAY be set in the response field +>> # Any bits not set in the flags field of the <code>getheaders2</code> re= +quest MUST NOT be set in the response field +>> # The first header MUST be encoded with a 0-byte indicator (ie. the head= +er is uncompressed) +>> # start_height MUST be set to the block height of the first header +>> # The hash of the last block SHOULD equal the end_hash of the <code>geth= +eaders2</code> request, ''even if the block is no longer part of the active= + chain'' +>> # The length of the headers vector MUST be less than or equal to 3,000 +>> # The headers MUST be sequential in order of height, with each header a = +successor of the previous one +>> # Each header SHOULD be optimally compressed +>> # The start_block_coinbase_tx should be the serialized coinbase transact= +ion in the block corresponding to the first header +>> # The start_block_coinbase_branch should be a vector of right-hand-side = +hashes in the merkle branch linking the coinbase transaction to the first h= +eader, in order from bottom of the tree to top +>> # If the 0th bit (least significant order) of the flags field is unset, = +the start_block_coinbase_tx and start_block_coinbase_branch fields MUST be = +omitted +>> +>> =3D=3D=3D Sync Strategies =3D=3D=3D +>> +>> The general header sync protocol for clients now is to first request che= +ckpoints from all peers with <code>getcheckpts</code>, then decide which pe= +ers to fetch ranges of headers from and download them with <code>getheaders= +2</code>. +>> +>> =3D=3D=3D=3D Forward Sequential Syncing =3D=3D=3D=3D +>> +>> Similar to the current sync protocol, a client may choose one peer to do= +wnload headers from, then fetch them in forward sequential order. Once this= + peer is out of headers, the client performs the same routine with any peer= +s offering more headers. +>> +>> With this strategy, the client is able to fully validate the block heade= +rs in order and abort if the peer serves an invalid one. On the other hand,= + the peer may be able to serve a longer, lower-work chain than the global a= +ctive chain, wasting the client's time, memory, and storage space. +>> +>> =3D=3D=3D=3D Parallel Header Download =3D=3D=3D=3D +>> +>> In order to increase the throughput of header downloads, a node may down= +load multiple header ranges in parallel from all peers serving the same che= +ckpoints, then validate them in sequential order. +>> +>> =3D=3D=3D=3D Random Sampling Proof-of-Work =3D=3D=3D=3D +>> +>> Similar the FlyClient<ref>https://www.youtube.com/watch?time_continue=3D= +8400&v=3DBPNs9EVxWrA</ref> header download protocol, clients can select the= + peer claiming the greatest total work chain and use random sampling to eff= +iciently determine if the peer is likely to be reporting its chain work hon= +estly. +>> +>> The client treats the checkpoint message as a commitment to chain work o= +f intermediate ranges of headers, the client then randomly samples ranges o= +f headers weighted by total work to determine whether the total chain work = +is valid before downloading all headers. To defend against malicious peers = +attempting to reuse earlier headers later in the chain to fake greater tota= +l work, the client should check the block height in the coinbase transactio= +n for all headers after the BIP 34 activation height. If the peer is found = +to be dishonest, they can be banned before the client downloads too many he= +aders, otherwise the client chooses this as the primary sync peer for forwa= +rd sequential sync or parallel download. +>> +>> =3D=3D Rationale =3D=3D +>> +>> * '''Why include the coinbase transaction in the headers messages?''' Th= +e primary reason is that after BIP 34<ref>https://github.com/bitcoin/bips/b= +lob/master/bip-0034.mediawiki</ref> activation at block height 227,835, coi= +nbase transactions constitute cryptographic commitments to a block's height= + in the chain, which mitigates certain attacks during header sync. Furtherm= +ore, the <code>getheaders2</code> message can be used as a simple way of re= +questing a coinbase transaction for a single header, which may be independe= +ntly useful. +>> +>> * '''Why not omit nBits entirely?''' The compression is designed to perm= +it full decompression of all headers in a <code>headers2</code> message ''w= +ithout'' requiring any other chain context. This is desirable so that proof= +s of work may be validated for arbitrary header ranges. While nBits can be = +computed knowing previous headers, this requires block headers that may not= + be sent in the same message. +>> +>> =3D=3D Compatibility =3D=3D +>> +>> This is backwards compatible, as it defines new P2P messages which are a= +vailable if a service bit is signaled. There are no changes to consensus ru= +les. +>> +>> =3D=3D Acknowledgements =3D=3D +>> +>> Thanks to Gregory Maxwell for suggestions on the compressed header encod= +ing and the DOS-resistant sync strategies. Thanks to Suhas Daftuar for help= +ful discussions. +>> +>> Credit for the VarInt encoding goes to Pieter Wuille. +>> +>> +>> _______________________________________________ +>> bitcoin-dev mailing list +>> bitcoin-dev@lists.linuxfoundation.org +>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev +>> +>> +> +> +> -- +> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta> +> + +--94eb2c07077a8edf03056896a015 +Content-Type: text/html; charset="UTF-8" +Content-Transfer-Encoding: quoted-printable + +<div dir=3D"ltr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><div= +>Thanks for giving it a read and for sparking the discussion with your obse= +rvation about the 40% savings from dropping prev_hash!</div><div>=C2=A0</di= +v><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;borde= +r-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div>T= +hought this wasn't effective in case overt asic boost get widely adopte= +d, but then I understood that at the moment only two bits of version get sc= +rambled by that technique so this looks fine, maybe add a comment about thi= +s so the reader doesn't get the same initial doubt I got.<br></div></di= +v></blockquote><div><br></div><div><span style=3D"color:rgb(34,34,34);font-= +family:arial,sans-serif;font-size:small;font-style:normal;font-variant-liga= +tures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal= +;text-align:start;text-indent:0px;text-transform:none;white-space:normal;wo= +rd-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:init= +ial;text-decoration-color:initial;float:none;display:inline">I still need t= +o compute for historical blocks how many could have an omitted version. Wil= +l post back with that when I get results. If overt ASIC Boost made this les= +s effective, that would be unfortunate, but so be it.</span><br></div><div>= +=C2=A0</div><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0= +.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"l= +tr"><div>My feeling is that encoding of the headers and checkpoints/paralle= +l download are separate subjects for two BIPS.</div><div>About the checkpoi= +nts I don't grasp why they are useful since an attacker could lie about= + them but maybe I am missing something...</div></div></blockquote><div><br>= +</div><div>Yeah, I guess the background wasn't explained in the BIP its= +elf. After your original post on the mailing list, there were suggestions t= +hat instead of modifying the format of existing messages, it would be bette= +r do create a new headers message. And as long as we're designing a new= + headers message, we should change the semantics to allow parallel download= +. But if you want to download from peers in parallel, you need to get a sum= +mary of the blocks that they have. Hence the checkpoints message. So that i= +s why both of these messages are in the same BIP -- only together can they = +perform an efficient sync.</div><div><br></div><div>Regarding the reliabili= +ty of the checkpoints, I think it's strictly better than what we have n= +ow. Let's say a node is connected to 6 honest peers and 2 malicious pee= +rs. Even if the node does not know which ones are good or bad until it vali= +dates the headers, it sees that 6 of the peers are on the same chain, and c= +an download those headers in parallel from 6 different sources. So that'= +;s already a win.</div><div><br></div><div>Taken a step further though, I&#= +39;m really interested in treating the checkpoints as commitments to chain = +work and using random sampling to detect lying peers before downloading all= + of their headers. So imagine you are connected to two peers, one good one = +bad, where the good one claims a chain with X total work and the bad one cl= +aims a chain with Y total work. To determine quickly which is correct, you = +can randomly sample ranges of headers and check the proofs of work to see w= +hether it matches what the peer claimed. So basically you pick a checkpoint= + at random (weighted by the work delta) which commits to a total amount of = +work from the last checkpoint, then request all headers in between. If the = +peer responds with headers with the correct start hash, end hash, and start= + height (from the coinbase tx of the first header), then you can be somewha= +t more confident their total PoW matches the claimed amount.</div><div><br>= +</div><div>How many times do you need to sample? I don't know yet, but = +I've heard Benedikt Bunz is exploring this question with his research o= +n FlyClients [1], which was an inspiration for this.</div><div>=C2=A0</div>= +<blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-= +left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><span cl= +ass=3D"gmail-"><div>Bitfield allows great savings, however the encoding dep= +ends on the headers height a client ask for, this cause a little computatio= +nal burden on the node and the undesirable side effect of difficult caching= +. Variable length encoding cause caching difficulties too...<br></div></spa= +n><div>A simpler approach could be to encode the headers in groups of 2016 = +headers (the difficulty period) where the first header is complete and the = +others 2015 are missing the previous hash and the difficulty, this achieve = +comparable savings ~45%, allows better caching and has fixed length encodin= +g. This could be useful for the node by caching headers on a single file on= + disk and simply stream out the relative range when requested or to serve t= +he same encoded headers=C2=A0<span style=3D"color:rgb(34,34,34);font-family= +:arial,sans-serif;font-size:small;font-style:normal;font-variant-ligatures:= +normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-= +align:start;text-indent:0px;text-transform:none;white-space:normal;word-spa= +cing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;te= +xt-decoration-color:initial;float:none;display:inline">format</span> in oth= +er context like http, leveraging http caching infrastructure.</div></div></= +blockquote><div><br></div><div>I don't see too much of a problem with c= +aching. Most node implementations I know of keep all headers in memory anyw= +ay, often in contiguous segments of RAM for historical headers, so it shoul= +d be fairly inexpensive to serve queries. Beyond that, the response for a p= +articular query (start_height, end_hash, encoding version) can be cached, s= +o if some service wants to precompute max size responses for all start_heig= +ht multiples of 1,000, they could cache those.</div><div><br></div><div>-Ji= +m</div><div><br></div><div>[1] <a href=3D"https://www.youtube.com/watch?tim= +e_continue=3D8400&v=3DBPNs9EVxWrA">https://www.youtube.com/watch?time_c= +ontinue=3D8400&v=3DBPNs9EVxWrA</a></div><div><br></div><blockquote clas= +s=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid r= +gb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div class=3D"gmail_extr= +a"><br><div class=3D"gmail_quote"><span class=3D"gmail-">2018-03-28 1:31 GM= +T+02:00 Jim Posen via bitcoin-dev <span dir=3D"ltr"><<a href=3D"mailto:b= +itcoin-dev@lists.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.<= +wbr>linuxfoundation.org</a>></span>:<br></span><blockquote class=3D"gmai= +l_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,20= +4,204);padding-left:1ex"><div><div class=3D"gmail-h5"><div dir=3D"ltr">Base= +d on some ideas that were thrown around in this thread (<a href=3D"https://= +lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-December/015385.html" = +target=3D"_blank">https://lists.linuxfoundation<wbr>.org/pipermail/bitcoin-= +dev/<wbr>2017-December/015385.html</a>), I have been working on a P2P exten= +sion that will allow faster header sync mechanisms. The one-sentence summar= +y is that by encoding headers more efficiently (eg. omitting prev_hash) and= + downloading evenly spaced checkpoints throughout history (say every 1,000t= +h) from all peers first, we could speed up header sync, which would be a hu= +ge improvement for light clients.=C2=A0Here is a draft of the BIP:=C2=A0<a = +href=3D"https://github.com/jimpo/bips/blob/headers-sync/headersv2.mediawiki= +" target=3D"_blank">https://github.com/jimpo/<wbr>bips/blob/headers-sync/he= +aders<wbr>v2.mediawiki</a>. The full text is below as well.<div><div><br></= +div><div>I'd love to hear any feedback people have.</div></div><div><br= +></div><div>------------------------------<wbr>----------------------------= +</div><div><br></div><div><pre style=3D"color:rgb(0,0,0);font-style:normal;= +font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;lett= +er-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word= +-spacing:0px;text-decoration-style:initial;text-decoration-color:initial;wo= +rd-wrap:break-word;white-space:pre-wrap">=3D=3D Abstract =3D=3D + +This BIP describes a P2P network extension enabling faster, more reliable m= +ethods for syncing the block header chain. New P2P messages are proposed as= + more efficient replacements for <code>getheaders</code> and &l= +t;code>headers</code> during initial block download. The proposed = +header download protocol reduces bandwidth usage by ~40%-50% and supports d= +ownloading headers ranges from multiple peers in parallel, which is not pos= +sible with the current mechanism. This also enables sync strategies with be= +tter resistance to denial-of-service attacks. + +=3D=3D Motivation =3D=3D + +Since 2015, optimized Bitcoin clients fetch all block headers before blocks= + themselves in order to avoid downloading ones that are not part of the mos= +t work chain. The protocol currently in use for fetching headers leaves roo= +m for further optimization, specifically by compressing header data and dow= +nloading more headers simulaneously<ref><a href=3D"https://lists.linu= +xfoundation.org/pipermail/bitcoin-dev/2017-December/015385.html" target=3D"= +_blank">https://list<wbr>s.linuxfoundation.org/pipermai<wbr>l/bitcoin-dev/2= +017-December/<wbr>015385.html</a></ref>. Any savings here should have= + a large impact given that both full nodes and light clients must sync the = +header chain as a first step, and that the time to validate and index the h= +eaders is negligible compared to the time spent downloading them from the n= +etwork. Furthermore, some current implementations of headers syncing rely o= +n preconfigured checkpoints to discourage attackers attempting to fill up a= + victim's disk space with low-work headers. The proposed messages enabl= +e sync strategies that are resilient against these types of attacks. The P2= +P messages are designed to be flexible, supporting multiple header sync str= +ategies and leaving room for future innovations, while also compact. + +=3D=3D Definitions =3D=3D + +''double-SHA256'' is a hash algorithm defined by two invoca= +tions of SHA-256: <code>double-SHA256(x) =3D SHA256(SHA256(x))</co= +de>. + +=3D=3D Specification =3D=3D + +The key words "MUST", "MUST NOT", "REQUIRED",= + "SHALL", "SHALL NOT", "SHOULD", "SHOULD= + NOT", "RECOMMENDED", "MAY", and "OPTIONAL&qu= +ot; in this document are to be interpreted as described in RFC 2119. + +=3D=3D=3D New Structures =3D=3D=3D + +=3D=3D=3D=3D Compressed Headers =3D=3D=3D=3D + +Bitcoin headers are serialized by default in 80 bytes as follows: + +{| class=3D"wikitable" +! Field Name +! Data Type +! Byte Size +! Description +|- +| version +| int32_t +| 4 +| Block version information +|- +| prev_block +| uint256 +| 32 +| The hash of the previous block +|- +| merkle_root +| uint256 +| 32 +| The root hash of the transaction Merkle tree +|- +| timestamp +| uint32_t +| 4 +| A Unix timestamp of the block creation time, as reported by the miner +|- +| bits +| uint32_t +| 4 +| The calculated difficulty target for this block +|- +| nonce +| uint32_t +| 4 +| A nonce that is set such that the header's hash matches the difficult= +y target +|} + +When deserializing a correctly-formed sequence of block headers encoded in = +this way, it can be noted that: + +* The prev_block field should always match the double-SHA256 hash of the pr= +evious header, making it redundant +* According to Bitcoin consensus rules, the bits field only changes every 2= +016 blocks +* The version often matches that of a recent ancestor block +* The timestamp is often a small delta from the preceding header's time= +stamp + +To take advantage of these possible savings, this document defines a variab= +le-sized ''compressed encoding'' of block headers that occu= +r in a range. Note that no savings are possible when serializing a single h= +eader; it should only be used for vectors of sequential headers. The full h= +eaders are reconstructed using data from previous headers in the range. The= + serialization begins with an ''encoding indicator'', which= + is a bitfield specifying how each field is serialized. The bits of the ind= +icator have the following semantics: + +{| class=3D"wikitable" +! Bit Index +! Reconstruction +! Description +|- +| 0 +| <code>prev_block[i] =3D DSHA256(header[i-1])</code> +| The prev_block field is ommitted and assigned to the double-SHA256 hash o= +f the previous uncompressed header. +|- +| 1 +| <code>nbits[i] =3D nbits[i-1]</code> +| The nbits field is omitted and matches that of the previous header. +|- +| 2 +| <code>timestamp[i] =3D timestamp[i-1] + value</code> +| The timestamp is replaced by a 2-byte signed short int, representing an o= +ffset from the previous block's timestamp +|- +| 3 +| +| Interpreted along with bits 4 & 5. +|- +| 4 +| +| Interpreted along with bits 3 & 5. +|- +| 5 +| <code>version[i] =3D version[i - ((bit[3] << 2) + (bit[4] <= +;< 1) + bit[5])]</code> +| Bits 3, 4, and 5 are first interpreted as a 3-bit offset, with bit index = +3 as the most significant and bit index 5 as the least significant. If the = +offset is non-zero, the version field is omitted and assigned to the versio= +n of the block at the offset number of blocks prior. +|- +| 6 +| +| Reserved. +|- +| 7 +| +| Reserved. May be used in a future encoding version to signal another indi= +cator byte. +|} + +The compressed header format is versioned by a 256-bit unsigned integer. Th= +is document defines version 0. + +=3D=3D=3D=3D VarInt =3D=3D=3D=3D + +''VarInt'' is a variable-length unsigned integer encoding t= +hat supports a greater range of numbers than the standard ''Compact= +Size''. This encoding was introduced at the database layer in Bitco= +in Core<ref><a href=3D"https://github.com/bitcoin/bitcoin/commit/4d61= +44f97faf9d2a6c89f41d7d2360f21f0b71e2" target=3D"_blank">https://github.com/= +bi<wbr>tcoin/bitcoin/commit/4d6144f97<wbr>faf9d2a6c89f41d7d2360f21f0b71e<wb= +r>2</a></ref> in 2012, but is new to the Bitcoin P2P layer. + +This definition is per the code comments in Bitcoin Core written by Pieter = +Wuille: + +<pre> +Variable-length integers: bytes are a MSB base-128 encoding of the number. +The high bit in each byte signifies whether another digit follows. To make +the encoding is one-to-one, one is subtracted from all but the last digit. +Thus, the byte sequence a[] with length len, where all but the last byte +has bit 128 set, encodes the number: + + (a[len-1] & 0x7F) + sum(i=3D1..len-1, 128^i*((a[len-i-1] & 0x7F)+= +1)) + +Properties: +* Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes) +* Every integer has exactly one encoding +* Encoding does not depend on size of original integer type +* No redundancy: every (infinite) byte sequence corresponds to a list + of encoded integers. + +0: [0x00] 256: [0x81 0x00] +1: [0x01] 16383: [0xFE 0x7F] +127: [0x7F] 16384: [0xFF 0x00] +128: [0x80 0x00] 16511: [0x80 0xFF 0x7F] +255: [0x80 0x7F] 65535: [0x82 0xFD 0x7F] +2^32: [0x8E 0xFE 0xFE 0xFF 0x00] +</pre> + +=3D=3D=3D=3D Checkpoints =3D=3D=3D=3D + +A ''checkpoint'' is defined for a block as a tuple of its h= +ash and the chain work: + +{| class=3D"wikitable" +! Field Name +! Data Type +! Byte Size +! Description +|- +| block_hash +| uint256 +| 32 +| The hash of the block +|- +| chain_work +| VarInt +| Variable(1-20) +| A delta between the total work in the chain at the checkpoint block and a= + previous checkpoint, determined by context +|} + +=3D=3D=3D Service Bit =3D=3D=3D + +This BIP allocates a new service bit: + +{| class=3D"wikitable" +|- +| NODE_HEADERS_V2 +| <code>1 << ?</code> +| If enabled, the node MUST respond to <code>getcheckpts</code>= + and <code>getheaders2</code> queries +|} + +=3D=3D=3D New Messages =3D=3D=3D + +=3D=3D=3D=3D getcheckpts =3D=3D=3D=3D +<code>getcheckpts</code> is used to request block headers at a = +specified distance from each other which serve as checkpoints during parall= +el header download. The message contains the following fields: + +{| class=3D"wikitable" +! Field Name +! Data Type +! Byte Size +! Description +|- +| block_locator +| uint256[] +| Variable +| A vector of block hashes in descending order by height used to identify t= +he header chain of the requesting node +|- +| interval +| uint32_t +| 4 +| The distance in block height between requested block hashes +|} + +# Nodes SHOULD NOT send <code>getcheckpts</code> unless the pee= +r has set the <code>NODE_HEADERS_V2</code> service bit +# The hashes in <code>block_locator</code> MUST be in descendin= +g order by block height +# The block locator SHOULD be generated as it is in <code>getheaders&= +lt;/code> requests +# The receiving node MUST respond to valid requests with a <code>chec= +kpts</code> response where the interval is the same as in the request= + and the first checkpoint hash matches the first common block hash in the b= +lock locator + +=3D=3D=3D=3D checkpts =3D=3D=3D=3D +<code>checkpts</code> is sent in response to <code>getche= +ckpts</code>, listing block hashes at the specified interval. The mes= +sage contains the following fields: + +{| class=3D"wikitable" +! Field Name +! Data Type +! Byte Size +! Description +|- +| start_height +| uint32_t +| 4 +| The height of the first block in the active chain matching the request= +9;s block locator +|- +| end_height +| uint32_t +| 4 +| The height of the last block in the active chain +|- +| start_checkpoint +| Checkpoint +| 48 +| The checkpoint structure for the block in the active chain at height star= +t_height +|- +| end_checkpoint +| Checkpoint +| 48 +| The checkpoint structure for the block in the active chain at height end_= +height +|- +| interval +| uint32_t +| 4 +| The distance in block height between checkpoints +|- +| checkpoints_length +| CompactSize +| Variable(1-5) +| The number of checkpoints to follow +|- +| checkpoints +| Checkpoint[] +| checkoints_length * Variable(33-52) +| The checkpoints as specified below +|} + +# The interval SHOULD match the field in the <code>getcheckpts</co= +de> request +# The start_checkpoint SHOULD correspond to the first block hash in the loc= +ator from the <code>getcheckpts</code> request that is part of = +the active chain +# The end_checkpoint SHOULD correspond to the tip of the node's active = +chain +# The start_height MOST be set to the block height of the start_checkpoint +# The end_height MOST be set to the block height of the end_checkpoint +# If the interval is zero, the checkpoints vector MUST be empty +# If the interval is non-zero, checkpoints MUST correspond to blocks on the= + active chain between the start_checkpoint and the end_checkpoint (exclusiv= +e), where the difference in block height between each entry and the previou= +s one is equal to the interval +# The checkpoints_length MUST be less than or equal to 2,000 +# The node SHOULD include as many checkpoints on its active chain as are av= +ailable, up to the limit of 2,000 +# The chain_work field in the first checkpoint MUST be the total work in th= +e chain ending at that block +# The chain_work field in each subsequent checkpoint MUST be the difference= + in chain work between that block and the previous checkpoint +# The chain_work field in each checkpoint MUST be a properly-encoded VarInt= +, not exceeding 20 bytes + +=3D=3D=3D=3D getheaders2 =3D=3D=3D=3D +<code>getheaders2</code> is used to request compressed headers = +for a range of blocks. The message contains the following fields: + +{| class=3D"wikitable" +! Field Name +! Data Type +! Byte Size +! Description +|- +| max_version +| uint8_t +| 1 +| The maximum supported encoding version of the headers +|- +| flags +| uint8_t +| 1 +| A bitfield of message encoding flags +|- +| start_height +| uint32_t +| 4 +| The height of the first block header in the requested range +|- +| end_hash +| uint256 +| 32 +| The hash of the last block header in the requested range +|} + +# Nodes SHOULD NOT send <code>getheaders2</code> unless the pee= +r has set the <code>NODE_HEADERS_V2</code> service bit +# The height of the block with hash end_hash MUST be greater than or equal = +to start_height, and the difference MUST be strictly less than 3,000 +# The end_hash SHOULD match one in a previously received <code>checkp= +ts</code> message, otherwise the receiving node MAY disconnect +# The 0th bit (least significant order) of the flags field MAY be set to re= +quest the coinbase transaction and merkle branch for the block at height st= +art_height + +=3D=3D=3D=3D headers2 =3D=3D=3D=3D +<code>headers2</code> is sent in response to <code>gethea= +ders2</code>, listing the compressed headers in the requested range. = +The message contains the following fields: + +{| class=3D"wikitable" +! Field Name +! Data Type +! Byte Size +! Description +|- +| version +| uint8_t +| 1 +| The encoding version of the headers +|- +| flags +| uint8_t +| 1 +| A bitfield of message encoding flags +|- +| start_height +| uint32_t +| 4 +| The height of the first block header returned +|- +| headers_length +| CompactSize +| 1-3 +| The number of block headers to follow +|- +| headers +| CompressedHeader[] +| Variable +| The compressed block headers +|- +| start_block_coinbase_tx +| CTransaction +| Variable +| The coinbase transaction in the block at start_height +|- +| start_block_coinbase_branch +| uint256[] +| Variable +| A merkle branch linking the coinbase transaction in the block at start_he= +ight to its header +|} + +# The version MUST be less than or equal to the max_version field of the &l= +t;code>getheaders2</code> request +# Any bits set in the flags field of the <code>getheaders2</code&g= +t; request MAY be set in the response field +# Any bits not set in the flags field of the <code>getheaders2</co= +de> request MUST NOT be set in the response field +# The first header MUST be encoded with a 0-byte indicator (ie. the header = +is uncompressed) +# start_height MUST be set to the block height of the first header +# The hash of the last block SHOULD equal the end_hash of the <code>g= +etheaders2</code> request, ''even if the block is no longer p= +art of the active chain'' +# The length of the headers vector MUST be less than or equal to 3,000 +# The headers MUST be sequential in order of height, with each header a suc= +cessor of the previous one +# Each header SHOULD be optimally compressed +# The start_block_coinbase_tx should be the serialized coinbase transaction= + in the block corresponding to the first header +# The start_block_coinbase_branch should be a vector of right-hand-side has= +hes in the merkle branch linking the coinbase transaction to the first head= +er, in order from bottom of the tree to top +# If the 0th bit (least significant order) of the flags field is unset, the= + start_block_coinbase_tx and start_block_coinbase_branch fields MUST be omi= +tted + +=3D=3D=3D Sync Strategies =3D=3D=3D + +The general header sync protocol for clients now is to first request checkp= +oints from all peers with <code>getcheckpts</code>, then decide= + which peers to fetch ranges of headers from and download them with <cod= +e>getheaders2</code>. + +=3D=3D=3D=3D Forward Sequential Syncing =3D=3D=3D=3D + +Similar to the current sync protocol, a client may choose one peer to downl= +oad headers from, then fetch them in forward sequential order. Once this pe= +er is out of headers, the client performs the same routine with any peers o= +ffering more headers. + +With this strategy, the client is able to fully validate the block headers = +in order and abort if the peer serves an invalid one. On the other hand, th= +e peer may be able to serve a longer, lower-work chain than the global acti= +ve chain, wasting the client's time, memory, and storage space. + +=3D=3D=3D=3D Parallel Header Download =3D=3D=3D=3D + +In order to increase the throughput of header downloads, a node may downloa= +d multiple header ranges in parallel from all peers serving the same checkp= +oints, then validate them in sequential order. + +=3D=3D=3D=3D Random Sampling Proof-of-Work =3D=3D=3D=3D + +Similar the FlyClient<ref><a href=3D"https://www.youtube.com/watch?ti= +me_continue=3D8400&v=3DBPNs9EVxWrA" target=3D"_blank">https://www.yout<= +wbr>ube.com/watch?time_continue=3D<wbr>8400&v=3DBPNs9EVxWrA</a></ref= +> header download protocol, clients can select the peer claiming the gre= +atest total work chain and use random sampling to efficiently determine if = +the peer is likely to be reporting its chain work honestly. + +The client treats the checkpoint message as a commitment to chain work of i= +ntermediate ranges of headers, the client then randomly samples ranges of h= +eaders weighted by total work to determine whether the total chain work is = +valid before downloading all headers. To defend against malicious peers att= +empting to reuse earlier headers later in the chain to fake greater total w= +ork, the client should check the block height in the coinbase transaction f= +or all headers after the BIP 34 activation height. If the peer is found to = +be dishonest, they can be banned before the client downloads too many heade= +rs, otherwise the client chooses this as the primary sync peer for forward = +sequential sync or parallel download. + +=3D=3D Rationale =3D=3D + +* '''Why include the coinbase transaction in the headers messag= +es?''' The primary reason is that after BIP 34<ref><a hre= +f=3D"https://github.com/bitcoin/bips/blob/master/bip-0034.mediawiki" target= +=3D"_blank">https://github.com/bitc<wbr>oin/bips/blob/master/bip-0034.<wbr>= +mediawiki</a></ref> activation at block height 227,835, coinbase tran= +sactions constitute cryptographic commitments to a block's height in th= +e chain, which mitigates certain attacks during header sync. Furthermore, t= +he <code>getheaders2</code> message can be used as a simple way= + of requesting a coinbase transaction for a single header, which may be ind= +ependently useful. + +* '''Why not omit nBits entirely?''' The compressio= +n is designed to permit full decompression of all headers in a <code>= +headers2</code> message ''without'' requiring any oth= +er chain context. This is desirable so that proofs of work may be validated= + for arbitrary header ranges. While nBits can be computed knowing previous = +headers, this requires block headers that may not be sent in the same messa= +ge. + +=3D=3D Compatibility =3D=3D + +This is backwards compatible, as it defines new P2P messages which are avai= +lable if a service bit is signaled. There are no changes to consensus rules= +. + +=3D=3D Acknowledgements =3D=3D + +Thanks to Gregory Maxwell for suggestions on the compressed header encoding= + and the DOS-resistant sync strategies. Thanks to Suhas Daftuar for helpful= + discussions. + +Credit for the VarInt encoding goes to Pieter Wuille.</pre></div></div> +<br></div></div><span class=3D"gmail-">______________________________<wbr>_= +________________<br> +bitcoin-dev mailing list<br> +<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">= +bitcoin-dev@lists.linuxfoundat<wbr>ion.org</a><br> +<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" = +rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org= +/mailman/listinfo/bitcoin-d<wbr>ev</a><br> +<br></span></blockquote></div><span class=3D"gmail-"><br><br clear=3D"all">= +<div><br></div>-- <br><div class=3D"gmail-m_4501160072387634139gmail_signat= +ure"><div dir=3D"ltr">Riccardo Casatta - <a href=3D"https://twitter.com/RCa= +satta" target=3D"_blank">@RCasatta</a></div></div> +</span></div></div> +</blockquote></div><br></div></div> + +--94eb2c07077a8edf03056896a015-- + |