Re: [bitcoin-dev] Optimized Header Sync

author: Jim Posen <jim.posen@gmail.com> 2018-03-29 17:50:30 -0700
committer: bitcoindev <bitcoindev@gnusha.org> 2018-03-30 00:50:35 +0000
commit: 756f9be4c5e5bf60a194f85e4012aa5fec090faf (patch)
tree: 0d4c2990ee0f76b52ecd2d989c03038f948601ba
parent: c951a57b4e4de9908fb845f63f4d7bcaa2822a2d (diff)
download: pi-bitcoindev-756f9be4c5e5bf60a194f85e4012aa5fec090faf.tar.gz
pi-bitcoindev-756f9be4c5e5bf60a194f85e4012aa5fec090faf.zip
1 files changed, 1388 insertions, 0 deletions
diff --git a/4c/2a8a23bd28231456cff0adfa8953c89470ee19 b/4c/2a8a23bd28231456cff0adfa8953c89470ee19
new file mode 100644
index 000000000..a92705c5b
--- /dev/null
+++ b/4c/2a8a23bd28231456cff0adfa8953c89470ee19
@@ -0,0 +1,1388 @@
+Return-Path: <jim.posen@gmail.com>
+Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
+	[172.17.192.35])
+	by mail.linuxfoundation.org (Postfix) with ESMTPS id B6B503EE
+	for <bitcoin-dev@lists.linuxfoundation.org>;
+	Fri, 30 Mar 2018 00:50:35 +0000 (UTC)
+X-Greylist: whitelisted by SQLgrey-1.7.6
+Received: from mail-io0-f170.google.com (mail-io0-f170.google.com
+	[209.85.223.170])
+	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id C715B5F5
+	for <bitcoin-dev@lists.linuxfoundation.org>;
+	Fri, 30 Mar 2018 00:50:32 +0000 (UTC)
+Received: by mail-io0-f170.google.com with SMTP id m83so9615955ioi.8
+	for <bitcoin-dev@lists.linuxfoundation.org>;
+	Thu, 29 Mar 2018 17:50:32 -0700 (PDT)
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
+	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
+	:cc; bh=Mk4+t64L7IAkJbALVJhVCJAE3rNiFdP2LrSXTVdzm9o=;
+	b=XZkVE0eUh3PSL/OV5K2RZ7OCn7/12md9lcAzwdwG1sEUQ0aRgPnxPqiyONoZNDz3Ep
+	i3+buqJgUYsXKzFpPKpgRgfBHGuDPhdg9oc+Am7VCs82tq0VDeCCEZHcezYyB9WymHj8
+	MjwtVyK8WI3G0hQ6pC5OKKTcL3YIFYeB1c8Yd4v+cFKZlrbgXwozijSCwFFaMs67xj3+
+	2lFErZvFUTF7hhOWlWPvlqxBUQZS7/7B1zbqaVyuwpJC+HGIDd3LLE51443sF4thljy1
+	mZgI3xOCyDlsjbXCTAhEhOpk8UvruoD6kCvSQ/FaMgl//RuuLKuWHBzD9anF3qfK8LpK
+	yE9Q==
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
+	d=1e100.net; s=20161025;
+	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
+	:message-id:subject:to:cc;
+	bh=Mk4+t64L7IAkJbALVJhVCJAE3rNiFdP2LrSXTVdzm9o=;
+	b=QsNcZ28RBh6IUQzOd4FD0w2VK2b5Gy65LeDcXYinIgt6Y2mf2YXSt32UeGXOUdlPxe
+	wUjm8fEws1GnshKYhB1ct1IOieALewTPEp8WoGzYhFwooKm+vtAFULNkX6EsVMcV7QMT
+	upKO5IBI1fo7dsT6eRq/cGDWtDOnesyXGgyihHUaYf63baagzJs6n2gl+U9gTf/x8ncb
+	mz8JkJ8nZKyPmZNpCzQsiTXSBEmd6No0HDd6By8vH10ZXrZdF9mfx6w2Revt8IIhxYcJ
+	0OisCyLe40tPBFv0kj3p4VNioTquJERnfLah3Yo39C2IrOqRYTff/lNyPjRIqqfo7D3U
+	NPPw==
+X-Gm-Message-State: AElRT7H0w4/gSGeb0L94oxhJ+5TR7QCjFZi1e0x/W8mSkjADJvSXfMCM
+	NMEQU2G8MLfY7P1lsUgHmyMAD5wcDrFQy34KmzQD7db+
+X-Google-Smtp-Source: AIpwx4+ToPnEtx6Rvc4sskxxADrJn4Mgd+Lod/ngX7mqaYug8fXWXradlOB5eGhwN1aSQOwqFdk412a1I8YHTFVXICI=
+X-Received: by 10.107.186.139 with SMTP id k133mr23074985iof.95.1522371031652; 
+	Thu, 29 Mar 2018 17:50:31 -0700 (PDT)
+MIME-Version: 1.0
+Received: by 10.107.52.80 with HTTP; Thu, 29 Mar 2018 17:50:30 -0700 (PDT)
+In-Reply-To: <CADabwBAjTRdVqsL+V=YdQ+kr8+LtSPOeSXUJOzKoPNdKEbAOWQ@mail.gmail.com>
+References: <CADZtCSg7+x-sg-ysgacXobRexOVwT+k9fr6a9S-6xU2w8f8m3A@mail.gmail.com>
+	<CADabwBAjTRdVqsL+V=YdQ+kr8+LtSPOeSXUJOzKoPNdKEbAOWQ@mail.gmail.com>
+From: Jim Posen <jim.posen@gmail.com>
+Date: Thu, 29 Mar 2018 17:50:30 -0700
+Message-ID: <CADZtCSjmQfBZoaO=MCyRoEn-AYe4A=1kDhxSVxVMw+O4k7YJfQ@mail.gmail.com>
+To: Riccardo Casatta <riccardo.casatta@gmail.com>
+Content-Type: multipart/alternative; boundary="94eb2c07077a8edf03056896a015"
+X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,
+	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE,
+	HTML_OBFUSCATE_05_10, 
+	RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
+X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
+	smtp1.linux-foundation.org
+X-Mailman-Approved-At: Fri, 30 Mar 2018 00:51:46 +0000
+Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
+Subject: Re: [bitcoin-dev] Optimized Header Sync
+X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
+X-Mailman-Version: 2.1.12
+Precedence: list
+List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
+List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
+	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
+List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
+List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
+List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
+List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
+	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
+X-List-Received-Date: Fri, 30 Mar 2018 00:50:35 -0000
+
+--94eb2c07077a8edf03056896a015
+Content-Type: text/plain; charset="UTF-8"
+Content-Transfer-Encoding: quoted-printable
+
+Thanks for giving it a read and for sparking the discussion with your
+observation about the 40% savings from dropping prev_hash!
+
+
+> Thought this wasn't effective in case overt asic boost get widely adopted=
+,
+> but then I understood that at the moment only two bits of version get
+> scrambled by that technique so this looks fine, maybe add a comment about
+> this so the reader doesn't get the same initial doubt I got.
+>
+
+I still need to compute for historical blocks how many could have an
+omitted version. Will post back with that when I get results. If overt ASIC
+Boost made this less effective, that would be unfortunate, but so be it.
+
+
+> My feeling is that encoding of the headers and checkpoints/parallel
+> download are separate subjects for two BIPS.
+> About the checkpoints I don't grasp why they are useful since an attacker
+> could lie about them but maybe I am missing something...
+>
+
+Yeah, I guess the background wasn't explained in the BIP itself. After your
+original post on the mailing list, there were suggestions that instead of
+modifying the format of existing messages, it would be better do create a
+new headers message. And as long as we're designing a new headers message,
+we should change the semantics to allow parallel download. But if you want
+to download from peers in parallel, you need to get a summary of the blocks
+that they have. Hence the checkpoints message. So that is why both of these
+messages are in the same BIP -- only together can they perform an efficient
+sync.
+
+Regarding the reliability of the checkpoints, I think it's strictly better
+than what we have now. Let's say a node is connected to 6 honest peers and
+2 malicious peers. Even if the node does not know which ones are good or
+bad until it validates the headers, it sees that 6 of the peers are on the
+same chain, and can download those headers in parallel from 6 different
+sources. So that's already a win.
+
+Taken a step further though, I'm really interested in treating the
+checkpoints as commitments to chain work and using random sampling to
+detect lying peers before downloading all of their headers. So imagine you
+are connected to two peers, one good one bad, where the good one claims a
+chain with X total work and the bad one claims a chain with Y total work.
+To determine quickly which is correct, you can randomly sample ranges of
+headers and check the proofs of work to see whether it matches what the
+peer claimed. So basically you pick a checkpoint at random (weighted by the
+work delta) which commits to a total amount of work from the last
+checkpoint, then request all headers in between. If the peer responds with
+headers with the correct start hash, end hash, and start height (from the
+coinbase tx of the first header), then you can be somewhat more confident
+their total PoW matches the claimed amount.
+
+How many times do you need to sample? I don't know yet, but I've heard
+Benedikt Bunz is exploring this question with his research on FlyClients
+[1], which was an inspiration for this.
+
+
+> Bitfield allows great savings, however the encoding depends on the header=
+s
+> height a client ask for, this cause a little computational burden on the
+> node and the undesirable side effect of difficult caching. Variable lengt=
+h
+> encoding cause caching difficulties too...
+> A simpler approach could be to encode the headers in groups of 2016
+> headers (the difficulty period) where the first header is complete and th=
+e
+> others 2015 are missing the previous hash and the difficulty, this achiev=
+e
+> comparable savings ~45%, allows better caching and has fixed length
+> encoding. This could be useful for the node by caching headers on a singl=
+e
+> file on disk and simply stream out the relative range when requested or t=
+o
+> serve the same encoded headers format in other context like http,
+> leveraging http caching infrastructure.
+>
+
+I don't see too much of a problem with caching. Most node implementations I
+know of keep all headers in memory anyway, often in contiguous segments of
+RAM for historical headers, so it should be fairly inexpensive to serve
+queries. Beyond that, the response for a particular query (start_height,
+end_hash, encoding version) can be cached, so if some service wants to
+precompute max size responses for all start_height multiples of 1,000, they
+could cache those.
+
+-Jim
+
+[1] https://www.youtube.com/watch?time_continue=3D8400&v=3DBPNs9EVxWrA
+
+
+> 2018-03-28 1:31 GMT+02:00 Jim Posen via bitcoin-dev <bitcoin-dev@lists.
+> linuxfoundation.org>:
+>
+>> Based on some ideas that were thrown around in this thread (
+>> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/
+>> 2017-December/015385.html), I have been working on a P2P extension that
+>> will allow faster header sync mechanisms. The one-sentence summary is th=
+at
+>> by encoding headers more efficiently (eg. omitting prev_hash) and
+>> downloading evenly spaced checkpoints throughout history (say every
+>> 1,000th) from all peers first, we could speed up header sync, which woul=
+d
+>> be a huge improvement for light clients. Here is a draft of the BIP:
+>> https://github.com/jimpo/bips/blob/headers-sync/headersv2.mediawiki. The
+>> full text is below as well.
+>>
+>> I'd love to hear any feedback people have.
+>>
+>> ----------------------------------------------------------
+>>
+>> =3D=3D Abstract =3D=3D
+>>
+>> This BIP describes a P2P network extension enabling faster, more reliabl=
+e methods for syncing the block header chain. New P2P messages are proposed=
+ as more efficient replacements for <code>getheaders</code> and <code>heade=
+rs</code> during initial block download. The proposed header download proto=
+col reduces bandwidth usage by ~40%-50% and supports downloading headers ra=
+nges from multiple peers in parallel, which is not possible with the curren=
+t mechanism. This also enables sync strategies with better resistance to de=
+nial-of-service attacks.
+>>
+>> =3D=3D Motivation =3D=3D
+>>
+>> Since 2015, optimized Bitcoin clients fetch all block headers before blo=
+cks themselves in order to avoid downloading ones that are not part of the =
+most work chain. The protocol currently in use for fetching headers leaves =
+room for further optimization, specifically by compressing header data and =
+downloading more headers simulaneously<ref>https://lists.linuxfoundation.or=
+g/pipermail/bitcoin-dev/2017-December/015385.html</ref>. Any savings here s=
+hould have a large impact given that both full nodes and light clients must=
+ sync the header chain as a first step, and that the time to validate and i=
+ndex the headers is negligible compared to the time spent downloading them =
+from the network. Furthermore, some current implementations of headers sync=
+ing rely on preconfigured checkpoints to discourage attackers attempting to=
+ fill up a victim's disk space with low-work headers. The proposed messages=
+ enable sync strategies that are resilient against these types of attacks. =
+The P2P messages are designed to be flexible, supporting multiple header sy=
+nc strategies and leaving room for future innovations, while also compact.
+>>
+>> =3D=3D Definitions =3D=3D
+>>
+>> ''double-SHA256'' is a hash algorithm defined by two invocations of SHA-=
+256: <code>double-SHA256(x) =3D SHA256(SHA256(x))</code>.
+>>
+>> =3D=3D Specification =3D=3D
+>>
+>> The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHO=
+ULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document a=
+re to be interpreted as described in RFC 2119.
+>>
+>> =3D=3D=3D New Structures =3D=3D=3D
+>>
+>> =3D=3D=3D=3D Compressed Headers =3D=3D=3D=3D
+>>
+>> Bitcoin headers are serialized by default in 80 bytes as follows:
+>>
+>> {| class=3D"wikitable"
+>> ! Field Name
+>> ! Data Type
+>> ! Byte Size
+>> ! Description
+>> |-
+>> | version
+>> | int32_t
+>> | 4
+>> | Block version information
+>> |-
+>> | prev_block
+>> | uint256
+>> | 32
+>> | The hash of the previous block
+>> |-
+>> | merkle_root
+>> | uint256
+>> | 32
+>> | The root hash of the transaction Merkle tree
+>> |-
+>> | timestamp
+>> | uint32_t
+>> | 4
+>> | A Unix timestamp of the block creation time, as reported by the miner
+>> |-
+>> | bits
+>> | uint32_t
+>> | 4
+>> | The calculated difficulty target for this block
+>> |-
+>> | nonce
+>> | uint32_t
+>> | 4
+>> | A nonce that is set such that the header's hash matches the difficulty=
+ target
+>> |}
+>>
+>> When deserializing a correctly-formed sequence of block headers encoded =
+in this way, it can be noted that:
+>>
+>> * The prev_block field should always match the double-SHA256 hash of the=
+ previous header, making it redundant
+>> * According to Bitcoin consensus rules, the bits field only changes ever=
+y 2016 blocks
+>> * The version often matches that of a recent ancestor block
+>> * The timestamp is often a small delta from the preceding header's times=
+tamp
+>>
+>> To take advantage of these possible savings, this document defines a var=
+iable-sized ''compressed encoding'' of block headers that occur in a range.=
+ Note that no savings are possible when serializing a single header; it sho=
+uld only be used for vectors of sequential headers. The full headers are re=
+constructed using data from previous headers in the range. The serializatio=
+n begins with an ''encoding indicator'', which is a bitfield specifying how=
+ each field is serialized. The bits of the indicator have the following sem=
+antics:
+>>
+>> {| class=3D"wikitable"
+>> ! Bit Index
+>> ! Reconstruction
+>> ! Description
+>> |-
+>> | 0
+>> | <code>prev_block[i] =3D DSHA256(header[i-1])</code>
+>> | The prev_block field is ommitted and assigned to the double-SHA256 has=
+h of the previous uncompressed header.
+>> |-
+>> | 1
+>> | <code>nbits[i] =3D nbits[i-1]</code>
+>> | The nbits field is omitted and matches that of the previous header.
+>> |-
+>> | 2
+>> | <code>timestamp[i] =3D timestamp[i-1] + value</code>
+>> | The timestamp is replaced by a 2-byte signed short int, representing a=
+n offset from the previous block's timestamp
+>> |-
+>> | 3
+>> |
+>> | Interpreted along with bits 4 & 5.
+>> |-
+>> | 4
+>> |
+>> | Interpreted along with bits 3 & 5.
+>> |-
+>> | 5
+>> | <code>version[i] =3D version[i - ((bit[3] << 2) + (bit[4] << 1) + bit[=
+5])]</code>
+>> | Bits 3, 4, and 5 are first interpreted as a 3-bit offset, with bit ind=
+ex 3 as the most significant and bit index 5 as the least significant. If t=
+he offset is non-zero, the version field is omitted and assigned to the ver=
+sion of the block at the offset number of blocks prior.
+>> |-
+>> | 6
+>> |
+>> | Reserved.
+>> |-
+>> | 7
+>> |
+>> | Reserved. May be used in a future encoding version to signal another i=
+ndicator byte.
+>> |}
+>>
+>> The compressed header format is versioned by a 256-bit unsigned integer.=
+ This document defines version 0.
+>>
+>> =3D=3D=3D=3D VarInt =3D=3D=3D=3D
+>>
+>> ''VarInt'' is a variable-length unsigned integer encoding that supports =
+a greater range of numbers than the standard ''CompactSize''. This encoding=
+ was introduced at the database layer in Bitcoin Core<ref>https://github.co=
+m/bitcoin/bitcoin/commit/4d6144f97faf9d2a6c89f41d7d2360f21f0b71e2</ref> in =
+2012, but is new to the Bitcoin P2P layer.
+>>
+>> This definition is per the code comments in Bitcoin Core written by Piet=
+er Wuille:
+>>
+>> <pre>
+>> Variable-length integers: bytes are a MSB base-128 encoding of the numbe=
+r.
+>> The high bit in each byte signifies whether another digit follows. To ma=
+ke
+>> the encoding is one-to-one, one is subtracted from all but the last digi=
+t.
+>> Thus, the byte sequence a[] with length len, where all but the last byte
+>> has bit 128 set, encodes the number:
+>>
+>>   (a[len-1] & 0x7F) + sum(i=3D1..len-1, 128^i*((a[len-i-1] & 0x7F)+1))
+>>
+>> Properties:
+>> * Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes)
+>> * Every integer has exactly one encoding
+>> * Encoding does not depend on size of original integer type
+>> * No redundancy: every (infinite) byte sequence corresponds to a list
+>>   of encoded integers.
+>>
+>> 0:         [0x00]  256:        [0x81 0x00]
+>> 1:         [0x01]  16383:      [0xFE 0x7F]
+>> 127:       [0x7F]  16384:      [0xFF 0x00]
+>> 128:  [0x80 0x00]  16511: [0x80 0xFF 0x7F]
+>> 255:  [0x80 0x7F]  65535: [0x82 0xFD 0x7F]
+>> 2^32:           [0x8E 0xFE 0xFE 0xFF 0x00]
+>> </pre>
+>>
+>> =3D=3D=3D=3D Checkpoints =3D=3D=3D=3D
+>>
+>> A ''checkpoint'' is defined for a block as a tuple of its hash and the c=
+hain work:
+>>
+>> {| class=3D"wikitable"
+>> ! Field Name
+>> ! Data Type
+>> ! Byte Size
+>> ! Description
+>> |-
+>> | block_hash
+>> | uint256
+>> | 32
+>> | The hash of the block
+>> |-
+>> | chain_work
+>> | VarInt
+>> | Variable(1-20)
+>> | A delta between the total work in the chain at the checkpoint block an=
+d a previous checkpoint, determined by context
+>> |}
+>>
+>> =3D=3D=3D Service Bit =3D=3D=3D
+>>
+>> This BIP allocates a new service bit:
+>>
+>> {| class=3D"wikitable"
+>> |-
+>> | NODE_HEADERS_V2
+>> | <code>1 << ?</code>
+>> | If enabled, the node MUST respond to <code>getcheckpts</code> and <cod=
+e>getheaders2</code> queries
+>> |}
+>>
+>> =3D=3D=3D New Messages =3D=3D=3D
+>>
+>> =3D=3D=3D=3D getcheckpts =3D=3D=3D=3D
+>> <code>getcheckpts</code> is used to request block headers at a specified=
+ distance from each other which serve as checkpoints during parallel header=
+ download. The message contains the following fields:
+>>
+>> {| class=3D"wikitable"
+>> ! Field Name
+>> ! Data Type
+>> ! Byte Size
+>> ! Description
+>> |-
+>> | block_locator
+>> | uint256[]
+>> | Variable
+>> | A vector of block hashes in descending order by height used to identif=
+y the header chain of the requesting node
+>> |-
+>> | interval
+>> | uint32_t
+>> | 4
+>> | The distance in block height between requested block hashes
+>> |}
+>>
+>> # Nodes SHOULD NOT send <code>getcheckpts</code> unless the peer has set=
+ the <code>NODE_HEADERS_V2</code> service bit
+>> # The hashes in <code>block_locator</code> MUST be in descending order b=
+y block height
+>> # The block locator SHOULD be generated as it is in <code>getheaders</co=
+de> requests
+>> # The receiving node MUST respond to valid requests with a <code>checkpt=
+s</code> response where the interval is the same as in the request and the =
+first checkpoint hash matches the first common block hash in the block loca=
+tor
+>>
+>> =3D=3D=3D=3D checkpts =3D=3D=3D=3D
+>> <code>checkpts</code> is sent in response to <code>getcheckpts</code>, l=
+isting block hashes at the specified interval. The message contains the fol=
+lowing fields:
+>>
+>> {| class=3D"wikitable"
+>> ! Field Name
+>> ! Data Type
+>> ! Byte Size
+>> ! Description
+>> |-
+>> | start_height
+>> | uint32_t
+>> | 4
+>> | The height of the first block in the active chain matching the request=
+'s block locator
+>> |-
+>> | end_height
+>> | uint32_t
+>> | 4
+>> | The height of the last block in the active chain
+>> |-
+>> | start_checkpoint
+>> | Checkpoint
+>> | 48
+>> | The checkpoint structure for the block in the active chain at height s=
+tart_height
+>> |-
+>> | end_checkpoint
+>> | Checkpoint
+>> | 48
+>> | The checkpoint structure for the block in the active chain at height e=
+nd_height
+>> |-
+>> | interval
+>> | uint32_t
+>> | 4
+>> | The distance in block height between checkpoints
+>> |-
+>> | checkpoints_length
+>> | CompactSize
+>> | Variable(1-5)
+>> | The number of checkpoints to follow
+>> |-
+>> | checkpoints
+>> | Checkpoint[]
+>> | checkoints_length * Variable(33-52)
+>> | The checkpoints as specified below
+>> |}
+>>
+>> # The interval SHOULD match the field in the <code>getcheckpts</code> re=
+quest
+>> # The start_checkpoint SHOULD correspond to the first block hash in the =
+locator from the <code>getcheckpts</code> request that is part of the activ=
+e chain
+>> # The end_checkpoint SHOULD correspond to the tip of the node's active c=
+hain
+>> # The start_height MOST be set to the block height of the start_checkpoi=
+nt
+>> # The end_height MOST be set to the block height of the end_checkpoint
+>> # If the interval is zero, the checkpoints vector MUST be empty
+>> # If the interval is non-zero, checkpoints MUST correspond to blocks on =
+the active chain between the start_checkpoint and the end_checkpoint (exclu=
+sive), where the difference in block height between each entry and the prev=
+ious one is equal to the interval
+>> # The checkpoints_length MUST be less than or equal to 2,000
+>> # The node SHOULD include as many checkpoints on its active chain as are=
+ available, up to the limit of 2,000
+>> # The chain_work field in the first checkpoint MUST be the total work in=
+ the chain ending at that block
+>> # The chain_work field in each subsequent checkpoint MUST be the differe=
+nce in chain work between that block and the previous checkpoint
+>> # The chain_work field in each checkpoint MUST be a properly-encoded Var=
+Int, not exceeding 20 bytes
+>>
+>> =3D=3D=3D=3D getheaders2 =3D=3D=3D=3D
+>> <code>getheaders2</code> is used to request compressed headers for a ran=
+ge of blocks. The message contains the following fields:
+>>
+>> {| class=3D"wikitable"
+>> ! Field Name
+>> ! Data Type
+>> ! Byte Size
+>> ! Description
+>> |-
+>> | max_version
+>> | uint8_t
+>> | 1
+>> | The maximum supported encoding version of the headers
+>> |-
+>> | flags
+>> | uint8_t
+>> | 1
+>> | A bitfield of message encoding flags
+>> |-
+>> | start_height
+>> | uint32_t
+>> | 4
+>> | The height of the first block header in the requested range
+>> |-
+>> | end_hash
+>> | uint256
+>> | 32
+>> | The hash of the last block header in the requested range
+>> |}
+>>
+>> # Nodes SHOULD NOT send <code>getheaders2</code> unless the peer has set=
+ the <code>NODE_HEADERS_V2</code> service bit
+>> # The height of the block with hash end_hash MUST be greater than or equ=
+al to start_height, and the difference MUST be strictly less than 3,000
+>> # The end_hash SHOULD match one in a previously received <code>checkpts<=
+/code> message, otherwise the receiving node MAY disconnect
+>> # The 0th bit (least significant order) of the flags field MAY be set to=
+ request the coinbase transaction and merkle branch for the block at height=
+ start_height
+>>
+>> =3D=3D=3D=3D headers2 =3D=3D=3D=3D
+>> <code>headers2</code> is sent in response to <code>getheaders2</code>, l=
+isting the compressed headers in the requested range. The message contains =
+the following fields:
+>>
+>> {| class=3D"wikitable"
+>> ! Field Name
+>> ! Data Type
+>> ! Byte Size
+>> ! Description
+>> |-
+>> | version
+>> | uint8_t
+>> | 1
+>> | The encoding version of the headers
+>> |-
+>> | flags
+>> | uint8_t
+>> | 1
+>> | A bitfield of message encoding flags
+>> |-
+>> | start_height
+>> | uint32_t
+>> | 4
+>> | The height of the first block header returned
+>> |-
+>> | headers_length
+>> | CompactSize
+>> | 1-3
+>> | The number of block headers to follow
+>> |-
+>> | headers
+>> | CompressedHeader[]
+>> | Variable
+>> | The compressed block headers
+>> |-
+>> | start_block_coinbase_tx
+>> | CTransaction
+>> | Variable
+>> | The coinbase transaction in the block at start_height
+>> |-
+>> | start_block_coinbase_branch
+>> | uint256[]
+>> | Variable
+>> | A merkle branch linking the coinbase transaction in the block at start=
+_height to its header
+>> |}
+>>
+>> # The version MUST be less than or equal to the max_version field of the=
+ <code>getheaders2</code> request
+>> # Any bits set in the flags field of the <code>getheaders2</code> reques=
+t MAY be set in the response field
+>> # Any bits not set in the flags field of the <code>getheaders2</code> re=
+quest MUST NOT be set in the response field
+>> # The first header MUST be encoded with a 0-byte indicator (ie. the head=
+er is uncompressed)
+>> # start_height MUST be set to the block height of the first header
+>> # The hash of the last block SHOULD equal the end_hash of the <code>geth=
+eaders2</code> request, ''even if the block is no longer part of the active=
+ chain''
+>> # The length of the headers vector MUST be less than or equal to 3,000
+>> # The headers MUST be sequential in order of height, with each header a =
+successor of the previous one
+>> # Each header SHOULD be optimally compressed
+>> # The start_block_coinbase_tx should be the serialized coinbase transact=
+ion in the block corresponding to the first header
+>> # The start_block_coinbase_branch should be a vector of right-hand-side =
+hashes in the merkle branch linking the coinbase transaction to the first h=
+eader, in order from bottom of the tree to top
+>> # If the 0th bit (least significant order) of the flags field is unset, =
+the start_block_coinbase_tx and start_block_coinbase_branch fields MUST be =
+omitted
+>>
+>> =3D=3D=3D Sync Strategies =3D=3D=3D
+>>
+>> The general header sync protocol for clients now is to first request che=
+ckpoints from all peers with <code>getcheckpts</code>, then decide which pe=
+ers to fetch ranges of headers from and download them with <code>getheaders=
+2</code>.
+>>
+>> =3D=3D=3D=3D Forward Sequential Syncing =3D=3D=3D=3D
+>>
+>> Similar to the current sync protocol, a client may choose one peer to do=
+wnload headers from, then fetch them in forward sequential order. Once this=
+ peer is out of headers, the client performs the same routine with any peer=
+s offering more headers.
+>>
+>> With this strategy, the client is able to fully validate the block heade=
+rs in order and abort if the peer serves an invalid one. On the other hand,=
+ the peer may be able to serve a longer, lower-work chain than the global a=
+ctive chain, wasting the client's time, memory, and storage space.
+>>
+>> =3D=3D=3D=3D Parallel Header Download =3D=3D=3D=3D
+>>
+>> In order to increase the throughput of header downloads, a node may down=
+load multiple header ranges in parallel from all peers serving the same che=
+ckpoints, then validate them in sequential order.
+>>
+>> =3D=3D=3D=3D Random Sampling Proof-of-Work  =3D=3D=3D=3D
+>>
+>> Similar the FlyClient<ref>https://www.youtube.com/watch?time_continue=3D=
+8400&v=3DBPNs9EVxWrA</ref> header download protocol, clients can select the=
+ peer claiming the greatest total work chain and use random sampling to eff=
+iciently determine if the peer is likely to be reporting its chain work hon=
+estly.
+>>
+>> The client treats the checkpoint message as a commitment to chain work o=
+f intermediate ranges of headers, the client then randomly samples ranges o=
+f headers weighted by total work to determine whether the total chain work =
+is valid before downloading all headers. To defend against malicious peers =
+attempting to reuse earlier headers later in the chain to fake greater tota=
+l work, the client should check the block height in the coinbase transactio=
+n for all headers after the BIP 34 activation height. If the peer is found =
+to be dishonest, they can be banned before the client downloads too many he=
+aders, otherwise the client chooses this as the primary sync peer for forwa=
+rd sequential sync or parallel download.
+>>
+>> =3D=3D Rationale =3D=3D
+>>
+>> * '''Why include the coinbase transaction in the headers messages?''' Th=
+e primary reason is that after BIP 34<ref>https://github.com/bitcoin/bips/b=
+lob/master/bip-0034.mediawiki</ref> activation at block height 227,835, coi=
+nbase transactions constitute cryptographic commitments to a block's height=
+ in the chain, which mitigates certain attacks during header sync. Furtherm=
+ore, the <code>getheaders2</code> message can be used as a simple way of re=
+questing a coinbase transaction for a single header, which may be independe=
+ntly useful.
+>>
+>> * '''Why not omit nBits entirely?''' The compression is designed to perm=
+it full decompression of all headers in a <code>headers2</code> message ''w=
+ithout'' requiring any other chain context. This is desirable so that proof=
+s of work may be validated for arbitrary header ranges. While nBits can be =
+computed knowing previous headers, this requires block headers that may not=
+ be sent in the same message.
+>>
+>> =3D=3D Compatibility =3D=3D
+>>
+>> This is backwards compatible, as it defines new P2P messages which are a=
+vailable if a service bit is signaled. There are no changes to consensus ru=
+les.
+>>
+>> =3D=3D Acknowledgements =3D=3D
+>>
+>> Thanks to Gregory Maxwell for suggestions on the compressed header encod=
+ing and the DOS-resistant sync strategies. Thanks to Suhas Daftuar for help=
+ful discussions.
+>>
+>> Credit for the VarInt encoding goes to Pieter Wuille.
+>>
+>>
+>> _______________________________________________
+>> bitcoin-dev mailing list
+>> bitcoin-dev@lists.linuxfoundation.org
+>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
+>>
+>>
+>
+>
+> --
+> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
+>
+
+--94eb2c07077a8edf03056896a015
+Content-Type: text/html; charset="UTF-8"
+Content-Transfer-Encoding: quoted-printable
+
+<div dir=3D"ltr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><div=
+>Thanks for giving it a read and for sparking the discussion with your obse=
+rvation about the 40% savings from dropping prev_hash!</div><div>=C2=A0</di=
+v><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;borde=
+r-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div>T=
+hought this wasn&#39;t effective in case overt asic boost get widely adopte=
+d, but then I understood that at the moment only two bits of version get sc=
+rambled by that technique so this looks fine, maybe add a comment about thi=
+s so the reader doesn&#39;t get the same initial doubt I got.<br></div></di=
+v></blockquote><div><br></div><div><span style=3D"color:rgb(34,34,34);font-=
+family:arial,sans-serif;font-size:small;font-style:normal;font-variant-liga=
+tures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal=
+;text-align:start;text-indent:0px;text-transform:none;white-space:normal;wo=
+rd-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:init=
+ial;text-decoration-color:initial;float:none;display:inline">I still need t=
+o compute for historical blocks how many could have an omitted version. Wil=
+l post back with that when I get results. If overt ASIC Boost made this les=
+s effective, that would be unfortunate, but so be it.</span><br></div><div>=
+=C2=A0</div><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0=
+.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"l=
+tr"><div>My feeling is that encoding of the headers and checkpoints/paralle=
+l download are separate subjects for two BIPS.</div><div>About the checkpoi=
+nts I don&#39;t grasp why they are useful since an attacker could lie about=
+ them but maybe I am missing something...</div></div></blockquote><div><br>=
+</div><div>Yeah, I guess the background wasn&#39;t explained in the BIP its=
+elf. After your original post on the mailing list, there were suggestions t=
+hat instead of modifying the format of existing messages, it would be bette=
+r do create a new headers message. And as long as we&#39;re designing a new=
+ headers message, we should change the semantics to allow parallel download=
+. But if you want to download from peers in parallel, you need to get a sum=
+mary of the blocks that they have. Hence the checkpoints message. So that i=
+s why both of these messages are in the same BIP -- only together can they =
+perform an efficient sync.</div><div><br></div><div>Regarding the reliabili=
+ty of the checkpoints, I think it&#39;s strictly better than what we have n=
+ow. Let&#39;s say a node is connected to 6 honest peers and 2 malicious pee=
+rs. Even if the node does not know which ones are good or bad until it vali=
+dates the headers, it sees that 6 of the peers are on the same chain, and c=
+an download those headers in parallel from 6 different sources. So that&#39=
+;s already a win.</div><div><br></div><div>Taken a step further though, I&#=
+39;m really interested in treating the checkpoints as commitments to chain =
+work and using random sampling to detect lying peers before downloading all=
+ of their headers. So imagine you are connected to two peers, one good one =
+bad, where the good one claims a chain with X total work and the bad one cl=
+aims a chain with Y total work. To determine quickly which is correct, you =
+can randomly sample ranges of headers and check the proofs of work to see w=
+hether it matches what the peer claimed. So basically you pick a checkpoint=
+ at random (weighted by the work delta) which commits to a total amount of =
+work from the last checkpoint, then request all headers in between. If the =
+peer responds with headers with the correct start hash, end hash, and start=
+ height (from the coinbase tx of the first header), then you can be somewha=
+t more confident their total PoW matches the claimed amount.</div><div><br>=
+</div><div>How many times do you need to sample? I don&#39;t know yet, but =
+I&#39;ve heard Benedikt Bunz is exploring this question with his research o=
+n FlyClients [1], which was an inspiration for this.</div><div>=C2=A0</div>=
+<blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-=
+left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><span cl=
+ass=3D"gmail-"><div>Bitfield allows great savings, however the encoding dep=
+ends on the headers height a client ask for, this cause a little computatio=
+nal burden on the node and the undesirable side effect of difficult caching=
+. Variable length encoding cause caching difficulties too...<br></div></spa=
+n><div>A simpler approach could be to encode the headers in groups of 2016 =
+headers (the difficulty period) where the first header is complete and the =
+others 2015 are missing the previous hash and the difficulty, this achieve =
+comparable savings ~45%, allows better caching and has fixed length encodin=
+g. This could be useful for the node by caching headers on a single file on=
+ disk and simply stream out the relative range when requested or to serve t=
+he same encoded headers=C2=A0<span style=3D"color:rgb(34,34,34);font-family=
+:arial,sans-serif;font-size:small;font-style:normal;font-variant-ligatures:=
+normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-=
+align:start;text-indent:0px;text-transform:none;white-space:normal;word-spa=
+cing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;te=
+xt-decoration-color:initial;float:none;display:inline">format</span> in oth=
+er context like http, leveraging http caching infrastructure.</div></div></=
+blockquote><div><br></div><div>I don&#39;t see too much of a problem with c=
+aching. Most node implementations I know of keep all headers in memory anyw=
+ay, often in contiguous segments of RAM for historical headers, so it shoul=
+d be fairly inexpensive to serve queries. Beyond that, the response for a p=
+articular query (start_height, end_hash, encoding version) can be cached, s=
+o if some service wants to precompute max size responses for all start_heig=
+ht multiples of 1,000, they could cache those.</div><div><br></div><div>-Ji=
+m</div><div><br></div><div>[1] <a href=3D"https://www.youtube.com/watch?tim=
+e_continue=3D8400&amp;v=3DBPNs9EVxWrA">https://www.youtube.com/watch?time_c=
+ontinue=3D8400&amp;v=3DBPNs9EVxWrA</a></div><div><br></div><blockquote clas=
+s=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid r=
+gb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div class=3D"gmail_extr=
+a"><br><div class=3D"gmail_quote"><span class=3D"gmail-">2018-03-28 1:31 GM=
+T+02:00 Jim Posen via bitcoin-dev <span dir=3D"ltr">&lt;<a href=3D"mailto:b=
+itcoin-dev@lists.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.<=
+wbr>linuxfoundation.org</a>&gt;</span>:<br></span><blockquote class=3D"gmai=
+l_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,20=
+4,204);padding-left:1ex"><div><div class=3D"gmail-h5"><div dir=3D"ltr">Base=
+d on some ideas that were thrown around in this thread (<a href=3D"https://=
+lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-December/015385.html" =
+target=3D"_blank">https://lists.linuxfoundation<wbr>.org/pipermail/bitcoin-=
+dev/<wbr>2017-December/015385.html</a>), I have been working on a P2P exten=
+sion that will allow faster header sync mechanisms. The one-sentence summar=
+y is that by encoding headers more efficiently (eg. omitting prev_hash) and=
+ downloading evenly spaced checkpoints throughout history (say every 1,000t=
+h) from all peers first, we could speed up header sync, which would be a hu=
+ge improvement for light clients.=C2=A0Here is a draft of the BIP:=C2=A0<a =
+href=3D"https://github.com/jimpo/bips/blob/headers-sync/headersv2.mediawiki=
+" target=3D"_blank">https://github.com/jimpo/<wbr>bips/blob/headers-sync/he=
+aders<wbr>v2.mediawiki</a>. The full text is below as well.<div><div><br></=
+div><div>I&#39;d love to hear any feedback people have.</div></div><div><br=
+></div><div>------------------------------<wbr>----------------------------=
+</div><div><br></div><div><pre style=3D"color:rgb(0,0,0);font-style:normal;=
+font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;lett=
+er-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word=
+-spacing:0px;text-decoration-style:initial;text-decoration-color:initial;wo=
+rd-wrap:break-word;white-space:pre-wrap">=3D=3D Abstract =3D=3D
+
+This BIP describes a P2P network extension enabling faster, more reliable m=
+ethods for syncing the block header chain. New P2P messages are proposed as=
+ more efficient replacements for &lt;code&gt;getheaders&lt;/code&gt; and &l=
+t;code&gt;headers&lt;/code&gt; during initial block download. The proposed =
+header download protocol reduces bandwidth usage by ~40%-50% and supports d=
+ownloading headers ranges from multiple peers in parallel, which is not pos=
+sible with the current mechanism. This also enables sync strategies with be=
+tter resistance to denial-of-service attacks.
+
+=3D=3D Motivation =3D=3D
+
+Since 2015, optimized Bitcoin clients fetch all block headers before blocks=
+ themselves in order to avoid downloading ones that are not part of the mos=
+t work chain. The protocol currently in use for fetching headers leaves roo=
+m for further optimization, specifically by compressing header data and dow=
+nloading more headers simulaneously&lt;ref&gt;<a href=3D"https://lists.linu=
+xfoundation.org/pipermail/bitcoin-dev/2017-December/015385.html" target=3D"=
+_blank">https://list<wbr>s.linuxfoundation.org/pipermai<wbr>l/bitcoin-dev/2=
+017-December/<wbr>015385.html</a>&lt;/ref&gt;. Any savings here should have=
+ a large impact given that both full nodes and light clients must sync the =
+header chain as a first step, and that the time to validate and index the h=
+eaders is negligible compared to the time spent downloading them from the n=
+etwork. Furthermore, some current implementations of headers syncing rely o=
+n preconfigured checkpoints to discourage attackers attempting to fill up a=
+ victim&#39;s disk space with low-work headers. The proposed messages enabl=
+e sync strategies that are resilient against these types of attacks. The P2=
+P messages are designed to be flexible, supporting multiple header sync str=
+ategies and leaving room for future innovations, while also compact.
+
+=3D=3D Definitions =3D=3D
+
+&#39;&#39;double-SHA256&#39;&#39; is a hash algorithm defined by two invoca=
+tions of SHA-256: &lt;code&gt;double-SHA256(x) =3D SHA256(SHA256(x))&lt;/co=
+de&gt;.
+
+=3D=3D Specification =3D=3D
+
+The key words &quot;MUST&quot;, &quot;MUST NOT&quot;, &quot;REQUIRED&quot;,=
+ &quot;SHALL&quot;, &quot;SHALL NOT&quot;, &quot;SHOULD&quot;, &quot;SHOULD=
+ NOT&quot;, &quot;RECOMMENDED&quot;, &quot;MAY&quot;, and &quot;OPTIONAL&qu=
+ot; in this document are to be interpreted as described in RFC 2119.
+
+=3D=3D=3D New Structures =3D=3D=3D
+
+=3D=3D=3D=3D Compressed Headers =3D=3D=3D=3D
+
+Bitcoin headers are serialized by default in 80 bytes as follows:
+
+{| class=3D&quot;wikitable&quot;
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| version
+| int32_t
+| 4
+| Block version information
+|-
+| prev_block
+| uint256
+| 32
+| The hash of the previous block
+|-
+| merkle_root
+| uint256
+| 32
+| The root hash of the transaction Merkle tree
+|-
+| timestamp
+| uint32_t
+| 4
+| A Unix timestamp of the block creation time, as reported by the miner
+|-
+| bits
+| uint32_t
+| 4
+| The calculated difficulty target for this block
+|-
+| nonce
+| uint32_t
+| 4
+| A nonce that is set such that the header&#39;s hash matches the difficult=
+y target
+|}
+
+When deserializing a correctly-formed sequence of block headers encoded in =
+this way, it can be noted that:
+
+* The prev_block field should always match the double-SHA256 hash of the pr=
+evious header, making it redundant
+* According to Bitcoin consensus rules, the bits field only changes every 2=
+016 blocks
+* The version often matches that of a recent ancestor block
+* The timestamp is often a small delta from the preceding header&#39;s time=
+stamp
+
+To take advantage of these possible savings, this document defines a variab=
+le-sized &#39;&#39;compressed encoding&#39;&#39; of block headers that occu=
+r in a range. Note that no savings are possible when serializing a single h=
+eader; it should only be used for vectors of sequential headers. The full h=
+eaders are reconstructed using data from previous headers in the range. The=
+ serialization begins with an &#39;&#39;encoding indicator&#39;&#39;, which=
+ is a bitfield specifying how each field is serialized. The bits of the ind=
+icator have the following semantics:
+
+{| class=3D&quot;wikitable&quot;
+! Bit Index
+! Reconstruction
+! Description
+|-
+| 0
+| &lt;code&gt;prev_block[i] =3D DSHA256(header[i-1])&lt;/code&gt;
+| The prev_block field is ommitted and assigned to the double-SHA256 hash o=
+f the previous uncompressed header.
+|-
+| 1
+| &lt;code&gt;nbits[i] =3D nbits[i-1]&lt;/code&gt;
+| The nbits field is omitted and matches that of the previous header.
+|-
+| 2
+| &lt;code&gt;timestamp[i] =3D timestamp[i-1] + value&lt;/code&gt;
+| The timestamp is replaced by a 2-byte signed short int, representing an o=
+ffset from the previous block&#39;s timestamp
+|-
+| 3
+|
+| Interpreted along with bits 4 &amp; 5.
+|-
+| 4
+|
+| Interpreted along with bits 3 &amp; 5.
+|-
+| 5
+| &lt;code&gt;version[i] =3D version[i - ((bit[3] &lt;&lt; 2) + (bit[4] &lt=
+;&lt; 1) + bit[5])]&lt;/code&gt;
+| Bits 3, 4, and 5 are first interpreted as a 3-bit offset, with bit index =
+3 as the most significant and bit index 5 as the least significant. If the =
+offset is non-zero, the version field is omitted and assigned to the versio=
+n of the block at the offset number of blocks prior.
+|-
+| 6
+|
+| Reserved.
+|-
+| 7
+|
+| Reserved. May be used in a future encoding version to signal another indi=
+cator byte.
+|}
+
+The compressed header format is versioned by a 256-bit unsigned integer. Th=
+is document defines version 0.
+
+=3D=3D=3D=3D VarInt =3D=3D=3D=3D
+
+&#39;&#39;VarInt&#39;&#39; is a variable-length unsigned integer encoding t=
+hat supports a greater range of numbers than the standard &#39;&#39;Compact=
+Size&#39;&#39;. This encoding was introduced at the database layer in Bitco=
+in Core&lt;ref&gt;<a href=3D"https://github.com/bitcoin/bitcoin/commit/4d61=
+44f97faf9d2a6c89f41d7d2360f21f0b71e2" target=3D"_blank">https://github.com/=
+bi<wbr>tcoin/bitcoin/commit/4d6144f97<wbr>faf9d2a6c89f41d7d2360f21f0b71e<wb=
+r>2</a>&lt;/ref&gt; in 2012, but is new to the Bitcoin P2P layer.
+
+This definition is per the code comments in Bitcoin Core written by Pieter =
+Wuille:
+
+&lt;pre&gt;
+Variable-length integers: bytes are a MSB base-128 encoding of the number.
+The high bit in each byte signifies whether another digit follows. To make
+the encoding is one-to-one, one is subtracted from all but the last digit.
+Thus, the byte sequence a[] with length len, where all but the last byte
+has bit 128 set, encodes the number:
+
+  (a[len-1] &amp; 0x7F) + sum(i=3D1..len-1, 128^i*((a[len-i-1] &amp; 0x7F)+=
+1))
+
+Properties:
+* Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes)
+* Every integer has exactly one encoding
+* Encoding does not depend on size of original integer type
+* No redundancy: every (infinite) byte sequence corresponds to a list
+  of encoded integers.
+
+0:         [0x00]  256:        [0x81 0x00]
+1:         [0x01]  16383:      [0xFE 0x7F]
+127:       [0x7F]  16384:      [0xFF 0x00]
+128:  [0x80 0x00]  16511: [0x80 0xFF 0x7F]
+255:  [0x80 0x7F]  65535: [0x82 0xFD 0x7F]
+2^32:           [0x8E 0xFE 0xFE 0xFF 0x00]
+&lt;/pre&gt;
+
+=3D=3D=3D=3D Checkpoints =3D=3D=3D=3D
+
+A &#39;&#39;checkpoint&#39;&#39; is defined for a block as a tuple of its h=
+ash and the chain work:
+
+{| class=3D&quot;wikitable&quot;
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| block_hash
+| uint256
+| 32
+| The hash of the block
+|-
+| chain_work
+| VarInt
+| Variable(1-20)
+| A delta between the total work in the chain at the checkpoint block and a=
+ previous checkpoint, determined by context
+|}
+
+=3D=3D=3D Service Bit =3D=3D=3D
+
+This BIP allocates a new service bit:
+
+{| class=3D&quot;wikitable&quot;
+|-
+| NODE_HEADERS_V2
+| &lt;code&gt;1 &lt;&lt; ?&lt;/code&gt;
+| If enabled, the node MUST respond to &lt;code&gt;getcheckpts&lt;/code&gt;=
+ and &lt;code&gt;getheaders2&lt;/code&gt; queries
+|}
+
+=3D=3D=3D New Messages =3D=3D=3D
+
+=3D=3D=3D=3D getcheckpts =3D=3D=3D=3D
+&lt;code&gt;getcheckpts&lt;/code&gt; is used to request block headers at a =
+specified distance from each other which serve as checkpoints during parall=
+el header download. The message contains the following fields:
+
+{| class=3D&quot;wikitable&quot;
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| block_locator
+| uint256[]
+| Variable
+| A vector of block hashes in descending order by height used to identify t=
+he header chain of the requesting node
+|-
+| interval
+| uint32_t
+| 4
+| The distance in block height between requested block hashes
+|}
+
+# Nodes SHOULD NOT send &lt;code&gt;getcheckpts&lt;/code&gt; unless the pee=
+r has set the &lt;code&gt;NODE_HEADERS_V2&lt;/code&gt; service bit
+# The hashes in &lt;code&gt;block_locator&lt;/code&gt; MUST be in descendin=
+g order by block height
+# The block locator SHOULD be generated as it is in &lt;code&gt;getheaders&=
+lt;/code&gt; requests
+# The receiving node MUST respond to valid requests with a &lt;code&gt;chec=
+kpts&lt;/code&gt; response where the interval is the same as in the request=
+ and the first checkpoint hash matches the first common block hash in the b=
+lock locator
+
+=3D=3D=3D=3D checkpts =3D=3D=3D=3D
+&lt;code&gt;checkpts&lt;/code&gt; is sent in response to &lt;code&gt;getche=
+ckpts&lt;/code&gt;, listing block hashes at the specified interval. The mes=
+sage contains the following fields:
+
+{| class=3D&quot;wikitable&quot;
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| start_height
+| uint32_t
+| 4
+| The height of the first block in the active chain matching the request&#3=
+9;s block locator
+|-
+| end_height
+| uint32_t
+| 4
+| The height of the last block in the active chain
+|-
+| start_checkpoint
+| Checkpoint
+| 48
+| The checkpoint structure for the block in the active chain at height star=
+t_height
+|-
+| end_checkpoint
+| Checkpoint
+| 48
+| The checkpoint structure for the block in the active chain at height end_=
+height
+|-
+| interval
+| uint32_t
+| 4
+| The distance in block height between checkpoints
+|-
+| checkpoints_length
+| CompactSize
+| Variable(1-5)
+| The number of checkpoints to follow
+|-
+| checkpoints
+| Checkpoint[]
+| checkoints_length * Variable(33-52)
+| The checkpoints as specified below
+|}
+
+# The interval SHOULD match the field in the &lt;code&gt;getcheckpts&lt;/co=
+de&gt; request
+# The start_checkpoint SHOULD correspond to the first block hash in the loc=
+ator from the &lt;code&gt;getcheckpts&lt;/code&gt; request that is part of =
+the active chain
+# The end_checkpoint SHOULD correspond to the tip of the node&#39;s active =
+chain
+# The start_height MOST be set to the block height of the start_checkpoint
+# The end_height MOST be set to the block height of the end_checkpoint
+# If the interval is zero, the checkpoints vector MUST be empty
+# If the interval is non-zero, checkpoints MUST correspond to blocks on the=
+ active chain between the start_checkpoint and the end_checkpoint (exclusiv=
+e), where the difference in block height between each entry and the previou=
+s one is equal to the interval
+# The checkpoints_length MUST be less than or equal to 2,000
+# The node SHOULD include as many checkpoints on its active chain as are av=
+ailable, up to the limit of 2,000
+# The chain_work field in the first checkpoint MUST be the total work in th=
+e chain ending at that block
+# The chain_work field in each subsequent checkpoint MUST be the difference=
+ in chain work between that block and the previous checkpoint
+# The chain_work field in each checkpoint MUST be a properly-encoded VarInt=
+, not exceeding 20 bytes
+
+=3D=3D=3D=3D getheaders2 =3D=3D=3D=3D
+&lt;code&gt;getheaders2&lt;/code&gt; is used to request compressed headers =
+for a range of blocks. The message contains the following fields:
+
+{| class=3D&quot;wikitable&quot;
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| max_version
+| uint8_t
+| 1
+| The maximum supported encoding version of the headers
+|-
+| flags
+| uint8_t
+| 1
+| A bitfield of message encoding flags
+|-
+| start_height
+| uint32_t
+| 4
+| The height of the first block header in the requested range
+|-
+| end_hash
+| uint256
+| 32
+| The hash of the last block header in the requested range
+|}
+
+# Nodes SHOULD NOT send &lt;code&gt;getheaders2&lt;/code&gt; unless the pee=
+r has set the &lt;code&gt;NODE_HEADERS_V2&lt;/code&gt; service bit
+# The height of the block with hash end_hash MUST be greater than or equal =
+to start_height, and the difference MUST be strictly less than 3,000
+# The end_hash SHOULD match one in a previously received &lt;code&gt;checkp=
+ts&lt;/code&gt; message, otherwise the receiving node MAY disconnect
+# The 0th bit (least significant order) of the flags field MAY be set to re=
+quest the coinbase transaction and merkle branch for the block at height st=
+art_height
+
+=3D=3D=3D=3D headers2 =3D=3D=3D=3D
+&lt;code&gt;headers2&lt;/code&gt; is sent in response to &lt;code&gt;gethea=
+ders2&lt;/code&gt;, listing the compressed headers in the requested range. =
+The message contains the following fields:
+
+{| class=3D&quot;wikitable&quot;
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| version
+| uint8_t
+| 1
+| The encoding version of the headers
+|-
+| flags
+| uint8_t
+| 1
+| A bitfield of message encoding flags
+|-
+| start_height
+| uint32_t
+| 4
+| The height of the first block header returned
+|-
+| headers_length
+| CompactSize
+| 1-3
+| The number of block headers to follow
+|-
+| headers
+| CompressedHeader[]
+| Variable
+| The compressed block headers
+|-
+| start_block_coinbase_tx
+| CTransaction
+| Variable
+| The coinbase transaction in the block at start_height
+|-
+| start_block_coinbase_branch
+| uint256[]
+| Variable
+| A merkle branch linking the coinbase transaction in the block at start_he=
+ight to its header
+|}
+
+# The version MUST be less than or equal to the max_version field of the &l=
+t;code&gt;getheaders2&lt;/code&gt; request
+# Any bits set in the flags field of the &lt;code&gt;getheaders2&lt;/code&g=
+t; request MAY be set in the response field
+# Any bits not set in the flags field of the &lt;code&gt;getheaders2&lt;/co=
+de&gt; request MUST NOT be set in the response field
+# The first header MUST be encoded with a 0-byte indicator (ie. the header =
+is uncompressed)
+# start_height MUST be set to the block height of the first header
+# The hash of the last block SHOULD equal the end_hash of the &lt;code&gt;g=
+etheaders2&lt;/code&gt; request, &#39;&#39;even if the block is no longer p=
+art of the active chain&#39;&#39;
+# The length of the headers vector MUST be less than or equal to 3,000
+# The headers MUST be sequential in order of height, with each header a suc=
+cessor of the previous one
+# Each header SHOULD be optimally compressed
+# The start_block_coinbase_tx should be the serialized coinbase transaction=
+ in the block corresponding to the first header
+# The start_block_coinbase_branch should be a vector of right-hand-side has=
+hes in the merkle branch linking the coinbase transaction to the first head=
+er, in order from bottom of the tree to top
+# If the 0th bit (least significant order) of the flags field is unset, the=
+ start_block_coinbase_tx and start_block_coinbase_branch fields MUST be omi=
+tted
+
+=3D=3D=3D Sync Strategies =3D=3D=3D
+
+The general header sync protocol for clients now is to first request checkp=
+oints from all peers with &lt;code&gt;getcheckpts&lt;/code&gt;, then decide=
+ which peers to fetch ranges of headers from and download them with &lt;cod=
+e&gt;getheaders2&lt;/code&gt;.
+
+=3D=3D=3D=3D Forward Sequential Syncing =3D=3D=3D=3D
+
+Similar to the current sync protocol, a client may choose one peer to downl=
+oad headers from, then fetch them in forward sequential order. Once this pe=
+er is out of headers, the client performs the same routine with any peers o=
+ffering more headers.
+
+With this strategy, the client is able to fully validate the block headers =
+in order and abort if the peer serves an invalid one. On the other hand, th=
+e peer may be able to serve a longer, lower-work chain than the global acti=
+ve chain, wasting the client&#39;s time, memory, and storage space.
+
+=3D=3D=3D=3D Parallel Header Download =3D=3D=3D=3D
+
+In order to increase the throughput of header downloads, a node may downloa=
+d multiple header ranges in parallel from all peers serving the same checkp=
+oints, then validate them in sequential order.
+
+=3D=3D=3D=3D Random Sampling Proof-of-Work  =3D=3D=3D=3D
+
+Similar the FlyClient&lt;ref&gt;<a href=3D"https://www.youtube.com/watch?ti=
+me_continue=3D8400&amp;v=3DBPNs9EVxWrA" target=3D"_blank">https://www.yout<=
+wbr>ube.com/watch?time_continue=3D<wbr>8400&amp;v=3DBPNs9EVxWrA</a>&lt;/ref=
+&gt; header download protocol, clients can select the peer claiming the gre=
+atest total work chain and use random sampling to efficiently determine if =
+the peer is likely to be reporting its chain work honestly.
+
+The client treats the checkpoint message as a commitment to chain work of i=
+ntermediate ranges of headers, the client then randomly samples ranges of h=
+eaders weighted by total work to determine whether the total chain work is =
+valid before downloading all headers. To defend against malicious peers att=
+empting to reuse earlier headers later in the chain to fake greater total w=
+ork, the client should check the block height in the coinbase transaction f=
+or all headers after the BIP 34 activation height. If the peer is found to =
+be dishonest, they can be banned before the client downloads too many heade=
+rs, otherwise the client chooses this as the primary sync peer for forward =
+sequential sync or parallel download.
+
+=3D=3D Rationale =3D=3D
+
+* &#39;&#39;&#39;Why include the coinbase transaction in the headers messag=
+es?&#39;&#39;&#39; The primary reason is that after BIP 34&lt;ref&gt;<a hre=
+f=3D"https://github.com/bitcoin/bips/blob/master/bip-0034.mediawiki" target=
+=3D"_blank">https://github.com/bitc<wbr>oin/bips/blob/master/bip-0034.<wbr>=
+mediawiki</a>&lt;/ref&gt; activation at block height 227,835, coinbase tran=
+sactions constitute cryptographic commitments to a block&#39;s height in th=
+e chain, which mitigates certain attacks during header sync. Furthermore, t=
+he &lt;code&gt;getheaders2&lt;/code&gt; message can be used as a simple way=
+ of requesting a coinbase transaction for a single header, which may be ind=
+ependently useful.
+
+* &#39;&#39;&#39;Why not omit nBits entirely?&#39;&#39;&#39; The compressio=
+n is designed to permit full decompression of all headers in a &lt;code&gt;=
+headers2&lt;/code&gt; message &#39;&#39;without&#39;&#39; requiring any oth=
+er chain context. This is desirable so that proofs of work may be validated=
+ for arbitrary header ranges. While nBits can be computed knowing previous =
+headers, this requires block headers that may not be sent in the same messa=
+ge.
+
+=3D=3D Compatibility =3D=3D
+
+This is backwards compatible, as it defines new P2P messages which are avai=
+lable if a service bit is signaled. There are no changes to consensus rules=
+.
+
+=3D=3D Acknowledgements =3D=3D
+
+Thanks to Gregory Maxwell for suggestions on the compressed header encoding=
+ and the DOS-resistant sync strategies. Thanks to Suhas Daftuar for helpful=
+ discussions.
+
+Credit for the VarInt encoding goes to Pieter Wuille.</pre></div></div>
+<br></div></div><span class=3D"gmail-">______________________________<wbr>_=
+________________<br>
+bitcoin-dev mailing list<br>
+<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">=
+bitcoin-dev@lists.linuxfoundat<wbr>ion.org</a><br>
+<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" =
+rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org=
+/mailman/listinfo/bitcoin-d<wbr>ev</a><br>
+<br></span></blockquote></div><span class=3D"gmail-"><br><br clear=3D"all">=
+<div><br></div>-- <br><div class=3D"gmail-m_4501160072387634139gmail_signat=
+ure"><div dir=3D"ltr">Riccardo Casatta - <a href=3D"https://twitter.com/RCa=
+satta" target=3D"_blank">@RCasatta</a></div></div>
+</span></div></div>
+</blockquote></div><br></div></div>
+
+--94eb2c07077a8edf03056896a015--
+
author	Jim Posen <jim.posen@gmail.com>	2018-03-29 17:50:30 -0700
committer	bitcoindev <bitcoindev@gnusha.org>	2018-03-30 00:50:35 +0000
commit	756f9be4c5e5bf60a194f85e4012aa5fec090faf (patch)
tree	0d4c2990ee0f76b52ecd2d989c03038f948601ba
parent	c951a57b4e4de9908fb845f63f4d7bcaa2822a2d (diff)
download	pi-bitcoindev-756f9be4c5e5bf60a194f85e4012aa5fec090faf.tar.gz pi-bitcoindev-756f9be4c5e5bf60a194f85e4012aa5fec090faf.zip