Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 0EB4F323 for ; Wed, 18 May 2016 01:49:15 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from mail.bluematt.me (mail.bluematt.me [192.241.179.72]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id B70D7199 for ; Wed, 18 May 2016 01:49:13 +0000 (UTC) Received: from [172.17.0.2] (gw.vpn.bluematt.me [162.243.132.6]) by mail.bluematt.me (Postfix) with ESMTPSA id 3562C6602C; Wed, 18 May 2016 01:49:12 +0000 (UTC) To: Pieter Wuille , Bitcoin Protocol Discussion References: <5727D102.1020807@mattcorallo.com> <5730C37E.2000004@gmail.com> From: Matt Corallo Message-ID: <573BCA16.2050704@mattcorallo.com> Date: Wed, 18 May 2016 01:49:10 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0 MIME-Version: 1.0 In-Reply-To: <5730C37E.2000004@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: Re: [bitcoin-dev] Compact Block Relay BIP X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2016 01:49:15 -0000 Implemented a few of your suggestions. Also opened a formal pull request for the BIP at https://github.com/bitcoin/bips/pull/389 and the code at https://github.com/bitcoin/bitcoin/pull/8068. On 05/09/16 17:06, Pieter Wuille via bitcoin-dev wrote: > On 05/03/2016 12:13 AM, lf-lists at mattcorallo.com (Matt Corallo) wrote: >> Hi all, >> >> The following is a BIP-formatted design spec for compact block relay >> designed to limit on wire bytes during block relay. You can find the >> latest version of this document at >> https://github.com/TheBlueMatt/bips/blob/master/bip-TODO.mediawiki. > > Hi Matt, > > thank you for working on this! > >> ===New data structures=== >> Several new data structures are added to the P2P network to relay >> compact blocks: PrefilledTransaction, HeaderAndShortIDs, >> BlockTransactionsRequest, and BlockTransactions. Additionally, we >> introduce a new variable-length integer encoding for use in these data >> structures. >> >> For the purposes of this section, CompactSize refers to the >> variable-length integer encoding used across the existing P2P protocol >> to encode array lengths, among other things, in 1, 3, 5 or 9 bytes. > > This is a not, but I think it's a bit strange to have two separate > variable length integers in the same specification. I understand is one > is already the default for variable-length integers currently, and there > are reasons to use the other one for efficiency reasons in some places, > but perhaps we should aim to get everything using the latter? Fixed, the whole thing now uses New Varints. >> ====New VarInt==== >> Variable-length integers: bytes are a MSB base-128 encoding of the number. >> The high bit in each byte signifies whether another digit follows. To make >> sure the encoding is one-to-one, one is subtracted from all but the last >> digit. > > Maybe it's worth mentioning that it is based on ASN.1 BER's compressed > integer format (see > https://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf > section 8.1.3.5), though with a small modification to make every integer > have a single unique encoding. > >> ====HeaderAndShortIDs==== >> A HeaderAndShortIDs structure is used to relay a block header, the short >> transactions IDs used for matching already-available transactions, and a >> select few transactions which we expect a peer may be missing. >> >> |shortids||List of uint64_ts||8*shortids_length bytes||Little >> Endian||The short transaction IDs calculated from the transactions which >> were not provided explicitly in prefilledtxn > > I tried to derive what length of short ids is actually necessary (some > write-up is on > https://gist.github.com/sipa/b2eb2e486156b5509ac711edd16153ed but it's > incomplete). > > For any reasonable numbers I can come up with (in a very wide range), > the number of bits needed is very well approximated by: > > log2(#receiver_mempool_txn * #block_txn_not_in_receiver_mempool / > acceptable_per_block_failure_rate) > > For example, with 20000 mempool transactions, 2500 transactions in a > block, 95% hitrate, and a chance of 1 in 10000 blocks to fail to > reconstruct, needed_bits = log2(20000 * 2500 * (1 - 0.95) / 0.0001) = > 34.54, or 5 byte txids would suffice. > > Note that 1 in 10000 failures may sound like a lot, but this is for each > individual connection, and since every transmission uses separately > salted identifiers, occasional failures should not affect global > propagation. Given that transmission failures due to timeouts, network > connectivity, ... already occur much more frequently than once every few > gigabytes (what 10000 blocks corresponds to), that's probably already > more than enough. > > In short: I believe 5 or 6 byte txids should be enough, but perhaps it > makes sense to allow the sender to choose (so he can weigh trying > multiple nonces against increasing the short txid length). I switched to 6-byte short txids. >> ====Short transaction IDs==== >> Short transaction IDs are used to represent a transaction without >> sending a full 256-bit hash. They are calculated by: >> # single-SHA256 hashing the block header with the nonce appended (in >> little-endian) >> # XORing each 8-byte chunk of the double-SHA256 transaction hash with >> each corresponding 8-byte chunk of the hash from the previous step >> # Adding each of the XORed 8-byte chunks together (in little-endian) >> iteratively to find the short transaction ID > > An alternative would be using SipHash-1-3 (a form of SipHash with > reduced iteration counts; the default is SipHash-2-4). SipHash was > designed as a Message Authentication Code, where the security > requirements are much stronger than in our case (in particular, we don't > care about observers being able to finding the key, as the key is just > public knowledge here). One of the designers of SipHash has commented > that SipHash-1-3 for collision resistance in hash tables may be enough: > https://github.com/rust-lang/rust/issues/29754#issuecomment-156073946 > > Using SipHash-1-3 on modern hardware would take ~32 CPU cycles per txid. Switched to SipHash2-4. >> ===Implementation Notes=== > > There are a few more heuristics that MAY be used to improve performance: > > * Receivers should treat short txids in blocks that match multiple > mempool transactions as non-matches, and request the transactions. This > significantly reduces the failure to reconstruct. Done. > * When constructing a compact block to send, the sender can verify it > against its own mempool to check for collisions, and if so, choose to > either try another nonce, or increase the short txid length. Additionally we should compare to the orphan pool (which apparently helps a lot).