Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 5DE9B88A for ; Thu, 1 Jun 2017 21:33:49 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from mail.bluematt.me (mail.bluematt.me [192.241.179.72]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 38B0B1E1 for ; Thu, 1 Jun 2017 21:33:48 +0000 (UTC) Received: from [30.226.235.44] (unknown [172.56.34.216]) by mail.bluematt.me (Postfix) with ESMTPSA id 8D5D41397A4; Sun, 18 Jan 1970 08:00:16 +0000 (UTC) Date: Thu, 01 Jun 2017 21:33:43 +0000 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable To: Olaoluwa Osuntokun , Olaoluwa Osuntokun via bitcoin-dev , Arnoud Kouwenhoven - Pukaki Corp via bitcoin-dev From: Matt Corallo Message-ID: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Jun 2017 21:33:49 -0000 Quick comment before I finish reading it completely, looks like you have no= way to match the input prevouts being spent, which is rather nice from a "= watch for this output being spent" pov=2E On June 1, 2017 3:01:14 PM EDT, Olaoluwa Osuntokun via bitcoin-dev wrote: >Hi y'all, > >Alex Akselrod and I would like to propose a new light client BIP for >consideration: >* >https://github=2Ecom/Roasbeef/bips/blob/master/gcs_light_client=2Emediawi= ki > >This BIP proposal describes a concrete specification (along with a >reference implementations[1][2][3]) for the much discussed client-side >filtering reversal of BIP-37=2E The precise details are described in the >BIP, but as a summary: we've implemented a new light-client mode that >uses >client-side filtering based off of Golomb-Rice coded sets=2E Full-nodes >maintain an additional index of the chain, and serve this compact >filter >(the index) to light clients which request them=2E Light clients then >fetch >these filters, query the locally and _maybe_ fetch the block if a >relevant >item matches=2E The cool part is that blocks can be fetched from _any_ >source, once the light client deems it necessary=2E Our primary >motivation >for this work was enabling a light client mode for lnd[4] in order to >support a more light-weight back end paving the way for the usage of >Lightning on mobile phones and other devices=2E We've integrated neutrino >as a back end for lnd, and will be making the updated code public very >soon=2E > >One specific area we'd like feedback on is the parameter selection=2E >Unlike >BIP-37 which allows clients to dynamically tune their false positive >rate, >our proposal uses a _fixed_ false-positive=2E Within the document, it's >currently specified as P =3D 1/2^20=2E We've done a bit of analysis and >optimization attempting to optimize the following sum: >filter_download_bandwidth + expected_block_false_positive_bandwidth=2E >Alex >has made a JS calculator that allows y'all to explore the affect of >tweaking the false positive rate in addition to the following >variables: >the number of items the wallet is scanning for, the size of the blocks, >number of blocks fetched, and the size of the filters themselves=2E The >calculator calculates the expected bandwidth utilization using the CDF >of >the Geometric Distribution=2E The calculator can be found here: >https://aakselrod=2Egithub=2Eio/gcs_calc=2Ehtml=2E Alex also has an empir= ical >script he's been running on actual data, and the results seem to match >up >rather nicely=2E > >We we're excited to see that Karl Johan Alm (kallewoof) has done some >(rather extensive!) analysis of his own, focusing on a distinct >encoding >type [5]=2E I haven't had the time yet to dig into his report yet, but I >think I've read enough to extract the key difference in our encodings: >his >filters use a binomial encoding _directly_ on the filter contents, will >we >instead create a Golomb-Coded set with the contents being _hashes_ (we >use >siphash) of the filter items=2E > >Using a fixed fp=3D20, I have some stats detailing the total index size, >as >well as averages for both mainnet and testnet=2E For mainnet, using the >filter contents as currently described in the BIP (basic + extended), >the >total size of the index comes out to 6=2E9GB=2E The break down is as >follows: > > * total size: 6976047156 > * total avg: 14997=2E220622758816 > * total median: 3801 > * total max: 79155 > * regular size: 3117183743 > * regular avg: 6701=2E372750217131 > * regular median: 1734 > * regular max: 67533 > * extended size: 3858863413 > * extended avg: 8295=2E847872541684 > * extended median: 2041 > * extended max: 52508 > >In order to consider the average+median filter sizes in a world worth >larger blocks, I also ran the index for testnet: > > * total size: 2753238530 > * total avg: 5918=2E95736054141 > * total median: 60202 > * total max: 74983 > * regular size: 1165148878 > * regular avg: 2504=2E856172982827 > * regular median: 24812 > * regular max: 64554 > * extended size: 1588089652 > * extended avg: 3414=2E1011875585823 > * extended median: 35260 > * extended max: 41731 > >Finally, here are the testnet stats which take into account the >increase >in the maximum filter size due to segwit's block-size increase=2E The max >filter sizes are a bit larger due to some of the habitual blocks I >created last year when testing segwit (transactions with 30k inputs, >30k >outputs, etc)=2E > > * total size: 585087597 > * total avg: 520=2E8839608674402 > * total median: 20 > * total max: 164598 > * regular size: 299325029 > * regular avg: 266=2E4790836307566 > * regular median: 13 > * regular max: 164583 > * extended size: 285762568 > * extended avg: 254=2E4048772366836 > * extended median: 7 > * extended max: 127631 > >For those that are interested in the raw data, I've uploaded a CSV file >of raw data for each block (mainnet + testnet), which can be found >here: > * mainnet: (14MB): >https://www=2Edropbox=2Ecom/s/4yk2u8dj06njbuv/mainnet-gcs-stats=2Ecsv?dl= =3D0 > * testnet: (25MB): >https://www=2Edropbox=2Ecom/s/w7dmmcbocnmjfbo/gcs-stats-testnet=2Ecsv?dl= =3D0 > > >We look forward to getting feedback from all of y'all! > >-- Laolu > > >[1]: https://github=2Ecom/lightninglabs/neutrino >[2]: https://github=2Ecom/Roasbeef/btcd/tree/segwit-cbf >[3]: https://github=2Ecom/Roasbeef/btcutil/tree/gcs/gcs >[4]: https://github=2Ecom/lightningnetwork/lnd/ > >-- Laolu