summaryrefslogtreecommitdiff
path: root/a7/41e873d9c36e40f588f4fa008dec771bc5821a
blob: 2fd7d1c1bed61b3f4490f4c4f7794ca80a74eba7 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
Return-Path: <pieter.wuille@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 7DC91411
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  9 May 2016 17:06:10 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-pf0-f175.google.com (mail-pf0-f175.google.com
	[209.85.192.175])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id A5BBC145
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  9 May 2016 17:06:09 +0000 (UTC)
Received: by mail-pf0-f175.google.com with SMTP id 206so78578995pfu.0
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon, 09 May 2016 10:06:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=subject:references:to:from:message-id:date:user-agent:mime-version
	:in-reply-to:content-transfer-encoding;
	bh=6wSQVfR2Rd/Z0LERx9QbLpR1j9Mz4Vn8ZUBynUdLa04=;
	b=Zd/+dBjMIxaWE7tjvmN0OvnvDxa54XH1R/zAO+UApwLerMJgBSXl4yN9hWbWuu32G0
	cfOWUsp+mDdiHuSneT/vaTOmah/kgs7ouds6UGV3DaZLmbeY2UklE5taddQJzbewmqtw
	77szYpPgPG51qgGiyAJrCzRZEsI3e9OoGWVMj6GMlHkhdkEA6A3j39J8cgxXLn/QpE+t
	ZY1ip29ynDqDqBTdbfpuw0HLYiCUuMMXQqKyT6r3HBctr2V8zuvDKRoMLIDOf5oRqqtq
	z45UIBOFnbnKrE9OWCsxW4fj0XYCZ5s5Tp/y+0UQM92U7AecfQNeHxY7DrlYFSlsch0X
	/SPw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:subject:references:to:from:message-id:date
	:user-agent:mime-version:in-reply-to:content-transfer-encoding;
	bh=6wSQVfR2Rd/Z0LERx9QbLpR1j9Mz4Vn8ZUBynUdLa04=;
	b=jVdpQdzLssU3Jmkadg3KTiW5+hljDXdftx3UG/IobM/mALvukQcpQYd76Br4wF2tw8
	aObMqMgtdVfljhq8cxyJWgdrqGWAH3rMNUPuBrnf6KbRf+5Bbiy3CfUIwmSNolkRFrJd
	Ntb+HReiRXRWjUvmhas/E1XWMjE9xYVrhLNH57wh8b2fjwNQgYqpBADdxaj76XNGvRQX
	334+aRrUy5Y2x1uPM7ySsLuc1E9Qh8VD1jxioe0rWMXHWs75fb0I1pbwH/cmt/Tx6lS6
	7VRmy0xnmZD+fi/O5S9syQVkR529joH5u++MT+p/5G1Wns7O30R/cy4H6j+U+RA1ySDj
	Gtvg==
X-Gm-Message-State: AOPr4FXDc7mJeeuyqcabWow0ezC9/+mqmGGvp0j3a1vqqbqK7wYeDjP+IOLAIPS9ydj4Jg==
X-Received: by 10.98.95.4 with SMTP id t4mr51994988pfb.162.1462813569285;
	Mon, 09 May 2016 10:06:09 -0700 (PDT)
Received: from [10.21.103.192] (c-98-234-64-218.hsd1.ca.comcast.net.
	[98.234.64.218]) by smtp.googlemail.com with ESMTPSA id
	p187sm17224187pfb.3.2016.05.09.10.06.07
	for <bitcoin-dev@lists.linuxfoundation.org>
	(version=TLSv1/SSLv3 cipher=OTHER);
	Mon, 09 May 2016 10:06:07 -0700 (PDT)
References: <5727D102.1020807@mattcorallo.com>
To: bitcoin-dev@lists.linuxfoundation.org
From: Pieter Wuille <pieter.wuille@gmail.com>
Message-ID: <5730C37E.2000004@gmail.com>
Date: Mon, 9 May 2016 19:06:06 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
	Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <5727D102.1020807@mattcorallo.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM,
	RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Subject: Re: [bitcoin-dev] Compact Block Relay BIP
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Mon, 09 May 2016 17:06:10 -0000

On 05/03/2016 12:13 AM, lf-lists at mattcorallo.com (Matt Corallo) wrote:
> Hi all,
> 
> The following is a BIP-formatted design spec for compact block relay
> designed to limit on wire bytes during block relay. You can find the
> latest version of this document at
> https://github.com/TheBlueMatt/bips/blob/master/bip-TODO.mediawiki.

Hi Matt,

thank you for working on this!

> ===New data structures===
> Several new data structures are added to the P2P network to relay
> compact blocks: PrefilledTransaction, HeaderAndShortIDs,
> BlockTransactionsRequest, and BlockTransactions. Additionally, we
> introduce a new variable-length integer encoding for use in these data
> structures.
> 
> For the purposes of this section, CompactSize refers to the
> variable-length integer encoding used across the existing P2P protocol
> to encode array lengths, among other things, in 1, 3, 5 or 9 bytes.

This is a not, but I think it's a bit strange to have two separate
variable length integers in the same specification. I understand is one
is already the default for variable-length integers currently, and there
are reasons to use the other one for efficiency reasons in some places,
but perhaps we should aim to get everything using the latter?

> ====New VarInt====
> Variable-length integers: bytes are a MSB base-128 encoding of the number.
> The high bit in each byte signifies whether another digit follows. To make
> sure the encoding is one-to-one, one is subtracted from all but the last
> digit.

Maybe it's worth mentioning that it is based on ASN.1 BER's compressed
integer format (see
https://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf
section 8.1.3.5), though with a small modification to make every integer
have a single unique encoding.

> ====HeaderAndShortIDs====
> A HeaderAndShortIDs structure is used to relay a block header, the short
> transactions IDs used for matching already-available transactions, and a
> select few transactions which we expect a peer may be missing.
> 
> |shortids||List of uint64_ts||8*shortids_length bytes||Little
> Endian||The short transaction IDs calculated from the transactions which
> were not provided explicitly in prefilledtxn

I tried to derive what length of short ids is actually necessary (some
write-up is on
https://gist.github.com/sipa/b2eb2e486156b5509ac711edd16153ed but it's
incomplete).

For any reasonable numbers I can come up with (in a very wide range),
the number of bits needed is very well approximated by:

  log2(#receiver_mempool_txn * #block_txn_not_in_receiver_mempool /
acceptable_per_block_failure_rate)

For example, with 20000 mempool transactions, 2500 transactions in a
block, 95% hitrate, and a chance of 1 in 10000 blocks to fail to
reconstruct, needed_bits = log2(20000 * 2500 * (1 - 0.95) / 0.0001) =
34.54, or 5 byte txids would suffice.

Note that 1 in 10000 failures may sound like a lot, but this is for each
individual connection, and since every transmission uses separately
salted identifiers, occasional failures should not affect global
propagation. Given that transmission failures due to timeouts, network
connectivity, ... already occur much more frequently than once every few
gigabytes (what 10000 blocks corresponds to), that's probably already
more than enough.

In short: I believe 5 or 6 byte txids should be enough, but perhaps it
makes sense to allow the sender to choose (so he can weigh trying
multiple nonces against increasing the short txid length).

> ====Short transaction IDs====
> Short transaction IDs are used to represent a transaction without
> sending a full 256-bit hash. They are calculated by:
> # single-SHA256 hashing the block header with the nonce appended (in
> little-endian)
> # XORing each 8-byte chunk of the double-SHA256 transaction hash with
> each corresponding 8-byte chunk of the hash from the previous step
> # Adding each of the XORed 8-byte chunks together (in little-endian)
> iteratively to find the short transaction ID

An alternative would be using SipHash-1-3 (a form of SipHash with
reduced iteration counts; the default is SipHash-2-4). SipHash was
designed as a Message Authentication Code, where the security
requirements are much stronger than in our case (in particular, we don't
care about observers being able to finding the key, as the key is just
public knowledge here). One of the designers of SipHash has commented
that SipHash-1-3 for collision resistance in hash tables may be enough:
https://github.com/rust-lang/rust/issues/29754#issuecomment-156073946

Using SipHash-1-3 on modern hardware would take ~32 CPU cycles per txid.

> ===Implementation Notes===

There are a few more heuristics that MAY be used to improve performance:

* Receivers should treat short txids in blocks that match multiple
mempool transactions as non-matches, and request the transactions. This
significantly reduces the failure to reconstruct.

* When constructing a compact block to send, the sender can verify it
against its own mempool to check for collisions, and if so, choose to
either try another nonce, or increase the short txid length.

Cheers,

-- 
Pieter