Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 1E71E409 for ; Tue, 10 May 2016 10:07:30 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-lf0-f54.google.com (mail-lf0-f54.google.com [209.85.215.54]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 59CA71DF for ; Tue, 10 May 2016 10:07:29 +0000 (UTC) Received: by mail-lf0-f54.google.com with SMTP id m64so9013264lfd.1 for ; Tue, 10 May 2016 03:07:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc; bh=4hrDfPyyHzAiHNgEmIA0aKR6sOEsCGXv5ZH8m9QP92k=; b=gE32J+bo0gwsC/iWu8E2bda01tfXcaA95PNdz4ClQ6vkQFNk+8apK5kXsT9+yo2T/y 581J6ffcgN+yHrkaelu43FuLygdKZOVu24qkFkpYynEK/Lcqc2GAnXzrCGGtFZ/jGmt9 6I/P4Y1t+2PgKFM/1hwiLnT48ONQRTVYTEQJdVjA9wtqbSBnTV9l4tABiSnU5NRxCnt0 Ggq6EtszgwJ7fuRaZDVB4kmG8SsWTH1S/bRn/sGjYWnYXBJfZsHEPnC2tH8TwdBRsK9C l28ILm2eL6+qQgZnT7DfhRbSjAF7k1eZYTisaghGP8Ki3ZYoHi70u5HprPSQagSStn9B W6cw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc; bh=4hrDfPyyHzAiHNgEmIA0aKR6sOEsCGXv5ZH8m9QP92k=; b=C8/Yi+VcsOAQeqDnyd9JE2gfR3EoiAXIE00CasjWhSaW3ZJKbu9rAHGYxFpeOSJ2xk 8goboGogGWThO2EXRjYd6O2sRJQaC8CXhxXcKtNebHIUbuxmFkMBeFOcgw8ZoU8lAN5E sxscxU+/ei+sTRq44aViqk7wHKjiRwBiLfmVLZWirdX95DS/W47FKONgecIOyNWf5E/e QtwMzaOkL9gzTjhqtvaU9+PZav7qBxnilPGFx9rIUMOMC6EJU8Y7J4SfGm7vUmdXI3Z4 TL/Y5RF1Ml/clp3r511zToZJOoBsSsMuqqbU99BEDjq4wZ8W7FCmVRyqUN90MUN8InU6 H7mA== X-Gm-Message-State: AOPr4FUq3UA3khod5057RC6z0KlOsFbgqPYiWqU6N+cKijie+/tZ7g7yOIzg+fPyvB5izP64J4MnCwroMFdYhA== MIME-Version: 1.0 X-Received: by 10.25.170.140 with SMTP id t134mr16914333lfe.17.1462874847724; Tue, 10 May 2016 03:07:27 -0700 (PDT) Sender: gmaxwell@gmail.com Received: by 10.112.65.108 with HTTP; Tue, 10 May 2016 03:07:27 -0700 (PDT) In-Reply-To: <87twi6zdl8.fsf@rustcorp.com.au> References: <5727D102.1020807@mattcorallo.com> <5730C37E.2000004@gmail.com> <87twi6zdl8.fsf@rustcorp.com.au> Date: Tue, 10 May 2016 10:07:27 +0000 X-Google-Sender-Auth: ixmzKf4IQBwjFUhIEvkGAYgnKyI Message-ID: From: Gregory Maxwell To: Rusty Russell , Bitcoin Protocol Discussion Content-Type: text/plain; charset=UTF-8 X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: Re: [bitcoin-dev] Compact Block Relay BIP X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 10 May 2016 10:07:30 -0000 On Tue, May 10, 2016 at 5:28 AM, Rusty Russell via bitcoin-dev wrote: > I used variable-length bit encodings, and used the shortest encoding > which is unique to you (including mempool). It's a little more work, > but for an average node transmitting a block with 1300 txs and another > ~3000 in the mempool, you expect about 12 bits per transaction. IOW, > about 1/5 of your current size. Critically, we might be able to fit in > two or three TCP packets. Hm. 12 bits sounds very small even giving those figures. Why failure rate were you targeting? I've mostly been thing in terms of 3000 txn, and 20k mempools, and blocks which are 90% consistent with the remote mempool, targeting 1/100000 failure rates (which is roughly where it should be to put it well below link failure levels). If going down the path of more complexity, set reconciliation is enormously more efficient (e.g. 90% reduction), which no amount of packing/twiddling can achieve. But the savings of going from 20kb to 3kb is not interesting enough to justify it*. My expectation is that later we'll deploy set reconciliation to fix relay efficiency, where the savings is _much_ larger, and then with the infrastructure in place we could define another compactblock mode that used it. (*Not interesting because it mostly reduces exposure to loss and the gods of TCP, but since those are the long poles in the latency tent, it's best to escape them entirely, see Matt's udp_wip branch.) > I would also avoid the nonce to save recalculating for each node, and > instead define an id as: Doing this would greatly increase the cost of a collision though, as it would happen in many places in the network at once over the on the network at once, rather than just happening on a single link, thus hardly impacting overall propagation. (The downside of the nonce is that you get an exponential increase in the rate that a collision happens "somewhere", but links fail "somewhere" all the time-- propagation overall doesn't care about that.) Using the same nonce means you also would not get a recovery gain from jointly decoding using compact blocks sent from multiple peers (which you'll have anyways in high bandwidth mode). With a nonce a sender does have the option of reusing what they got-- but the actual encoding cost is negligible, for a 2500 transaction block its 27 microseconds (once per block, shared across all peers) using Pieter's suggestion of siphash 1-3 instead of the cheaper construct in the current draft. Of course, if you're going to check your whole mempool to reroll the nonce, thats another matter-- but that seems wasteful compared to just using a table driven size with a known negligible failure rate. 64-bits as a maximum length is high enough that the collision rate would be negligible even under fairly unrealistic assumptions-- so long as it's salted. :) > As Peter R points out, we could later enhance receiver to brute force > collisions (you could speed that by sending a XOR of all the txids, but > really if there are more than a few collisions, give up). The band between "no collisions" and "infeasible many" is fairly narrow. You can add a small amount more space to the ids and immediately be in the no collision zone. Some earlier work we had would send small amount of erasure coding data of the next couple bytes of the IDs. E.g. the receiver in all the IDs you know, mark totally unknown IDs as erased and the let the error correction fix the rest. This let you algebraically resolve collisions _far_ beyond what could be feasibly bruteforced. Pieter went and implemented... but the added cost of encoding and software complexity seem not worth it.