summaryrefslogtreecommitdiff
path: root/ea/a256dc011103e4470e564a75cc19590bb36dea
blob: b41c5af98d2582554b038ee5c67fc534a169419f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
Return-Path: <lf-lists@mattcorallo.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 0EB4F323
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Wed, 18 May 2016 01:49:15 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from mail.bluematt.me (mail.bluematt.me [192.241.179.72])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id B70D7199
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Wed, 18 May 2016 01:49:13 +0000 (UTC)
Received: from [172.17.0.2] (gw.vpn.bluematt.me [162.243.132.6])
	by mail.bluematt.me (Postfix) with ESMTPSA id 3562C6602C;
	Wed, 18 May 2016 01:49:12 +0000 (UTC)
To: Pieter Wuille <pieter.wuille@gmail.com>,
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
References: <5727D102.1020807@mattcorallo.com> <5730C37E.2000004@gmail.com>
From: Matt Corallo <lf-lists@mattcorallo.com>
Message-ID: <573BCA16.2050704@mattcorallo.com>
Date: Wed, 18 May 2016 01:49:10 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
	Thunderbird/38.7.0
MIME-Version: 1.0
In-Reply-To: <5730C37E.2000004@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Subject: Re: [bitcoin-dev] Compact Block Relay BIP
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Wed, 18 May 2016 01:49:15 -0000

Implemented a few of your suggestions.

Also opened a formal pull request for the BIP at
https://github.com/bitcoin/bips/pull/389 and the code at
https://github.com/bitcoin/bitcoin/pull/8068.

On 05/09/16 17:06, Pieter Wuille via bitcoin-dev wrote:
> On 05/03/2016 12:13 AM, lf-lists at mattcorallo.com (Matt Corallo) wrote:
>> Hi all,
>>
>> The following is a BIP-formatted design spec for compact block relay
>> designed to limit on wire bytes during block relay. You can find the
>> latest version of this document at
>> https://github.com/TheBlueMatt/bips/blob/master/bip-TODO.mediawiki.
> 
> Hi Matt,
> 
> thank you for working on this!
> 
>> ===New data structures===
>> Several new data structures are added to the P2P network to relay
>> compact blocks: PrefilledTransaction, HeaderAndShortIDs,
>> BlockTransactionsRequest, and BlockTransactions. Additionally, we
>> introduce a new variable-length integer encoding for use in these data
>> structures.
>>
>> For the purposes of this section, CompactSize refers to the
>> variable-length integer encoding used across the existing P2P protocol
>> to encode array lengths, among other things, in 1, 3, 5 or 9 bytes.
> 
> This is a not, but I think it's a bit strange to have two separate
> variable length integers in the same specification. I understand is one
> is already the default for variable-length integers currently, and there
> are reasons to use the other one for efficiency reasons in some places,
> but perhaps we should aim to get everything using the latter?

Fixed, the whole thing now uses New Varints.

>> ====New VarInt====
>> Variable-length integers: bytes are a MSB base-128 encoding of the number.
>> The high bit in each byte signifies whether another digit follows. To make
>> sure the encoding is one-to-one, one is subtracted from all but the last
>> digit.
> 
> Maybe it's worth mentioning that it is based on ASN.1 BER's compressed
> integer format (see
> https://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf
> section 8.1.3.5), though with a small modification to make every integer
> have a single unique encoding.
> 
>> ====HeaderAndShortIDs====
>> A HeaderAndShortIDs structure is used to relay a block header, the short
>> transactions IDs used for matching already-available transactions, and a
>> select few transactions which we expect a peer may be missing.
>>
>> |shortids||List of uint64_ts||8*shortids_length bytes||Little
>> Endian||The short transaction IDs calculated from the transactions which
>> were not provided explicitly in prefilledtxn
> 
> I tried to derive what length of short ids is actually necessary (some
> write-up is on
> https://gist.github.com/sipa/b2eb2e486156b5509ac711edd16153ed but it's
> incomplete).
> 
> For any reasonable numbers I can come up with (in a very wide range),
> the number of bits needed is very well approximated by:
> 
>   log2(#receiver_mempool_txn * #block_txn_not_in_receiver_mempool /
> acceptable_per_block_failure_rate)
> 
> For example, with 20000 mempool transactions, 2500 transactions in a
> block, 95% hitrate, and a chance of 1 in 10000 blocks to fail to
> reconstruct, needed_bits = log2(20000 * 2500 * (1 - 0.95) / 0.0001) =
> 34.54, or 5 byte txids would suffice.
> 
> Note that 1 in 10000 failures may sound like a lot, but this is for each
> individual connection, and since every transmission uses separately
> salted identifiers, occasional failures should not affect global
> propagation. Given that transmission failures due to timeouts, network
> connectivity, ... already occur much more frequently than once every few
> gigabytes (what 10000 blocks corresponds to), that's probably already
> more than enough.
> 
> In short: I believe 5 or 6 byte txids should be enough, but perhaps it
> makes sense to allow the sender to choose (so he can weigh trying
> multiple nonces against increasing the short txid length).

I switched to 6-byte short txids.

>> ====Short transaction IDs====
>> Short transaction IDs are used to represent a transaction without
>> sending a full 256-bit hash. They are calculated by:
>> # single-SHA256 hashing the block header with the nonce appended (in
>> little-endian)
>> # XORing each 8-byte chunk of the double-SHA256 transaction hash with
>> each corresponding 8-byte chunk of the hash from the previous step
>> # Adding each of the XORed 8-byte chunks together (in little-endian)
>> iteratively to find the short transaction ID
> 
> An alternative would be using SipHash-1-3 (a form of SipHash with
> reduced iteration counts; the default is SipHash-2-4). SipHash was
> designed as a Message Authentication Code, where the security
> requirements are much stronger than in our case (in particular, we don't
> care about observers being able to finding the key, as the key is just
> public knowledge here). One of the designers of SipHash has commented
> that SipHash-1-3 for collision resistance in hash tables may be enough:
> https://github.com/rust-lang/rust/issues/29754#issuecomment-156073946
> 
> Using SipHash-1-3 on modern hardware would take ~32 CPU cycles per txid.

Switched to SipHash2-4.

>> ===Implementation Notes===
> 
> There are a few more heuristics that MAY be used to improve performance:
> 
> * Receivers should treat short txids in blocks that match multiple
> mempool transactions as non-matches, and request the transactions. This
> significantly reduces the failure to reconstruct.

Done.

> * When constructing a compact block to send, the sender can verify it
> against its own mempool to check for collisions, and if so, choose to
> either try another nonce, or increase the short txid length.

Additionally we should compare to the orphan pool (which apparently
helps a lot).