Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 38FD1901 for ; Mon, 5 Jun 2017 02:07:11 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from mo.garage.hdemail.jp (mo.garage.hdemail.jp [46.51.242.127]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 48BFC196 for ; Mon, 5 Jun 2017 02:07:10 +0000 (UTC) Received: from ip-10-217-1-36.ap-northeast-1.compute.internal (localhost.localdomain [127.0.0.1]) by mo.garage.hdemail.jp (hde-mf-postfix) with SMTP id 6F39D14C0C5 for ; Mon, 5 Jun 2017 11:07:06 +0900 (JST) (envelope-from karljohan-alm@garage.co.jp) X-Received: from unknown (HELO mo.garage.hdemail.jp) (127.0.0.1) by 0 with SMTP; 5 Jun 2017 11:07:02 +0900 X-Received: from mo.garage.hdemail.jp (localhost.localdomain [127.0.0.1]) by mo.garage.hdemail.jp (hde-ma-postfix) with ESMTP id D4A7E4C085 for ; Mon, 5 Jun 2017 11:07:01 +0900 (JST) (envelope-from karljohan-alm@garage.co.jp) Received: from gw27.oz.hdemail.jp (ip-10-127-141-142.ap-northeast-1.compute.internal [10.127.141.142]) by mo.garage.hdemail.jp (hde-mf-postfix) with ESMTP id 4738014C0C1 for ; Mon, 5 Jun 2017 11:07:00 +0900 (JST) (envelope-from karljohan-alm@garage.co.jp) X-Durian-MailFrom: karljohan-alm@garage.co.jp X-Durian-RcptTo: bitcoin-dev@lists.linuxfoundation.org Received: from gw27.oz.hdemail.jp (gw27.oz.hdemail.jp [127.0.0.1]) by gw27.oz.hdemail.jp (gw27.oz.hdemail.jp [127.0.0.1]); Mon, 5 Jun 2017 11:06:55 +0900 X-Received: from mail-qt0-f198.google.com (lb1.oz.lo.hdemail.jp [54.248.222.53]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by gw27.oz.hdemail.jp (Postfix) with ESMTP id F41D4148C117 for ; Mon, 5 Jun 2017 11:06:54 +0900 (JST) X-Received: by mail-qt0-f198.google.com with SMTP id y31so55990259qty.7 for ; Sun, 04 Jun 2017 19:06:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=AE7yjGjMeJqgLJiTA4DPZJ1lOuLnHBLRHO68oi5mjII=; b=MCVbP+2sxVSXtGxUbnDCGmSm2130TOL1Jwg/F1FIUCRMM88wgn7Y4basF3650UPbct XZ73kkBrsHH3fBtEQ6rrCQrLOeCiyvZJGbnr1WrYmE/vf+gUia8nvp/spNSP+ANQOnU3 Y/ucr1vqVZo45/jAk5E+fOriJeLSKT+aS1+RbZq+GILZo9XFy7Ophc2J0pkAzyM0noDG KIm0feTWFFOt7WkSBk87wNsYqa2xzf86DlWGKlA7zVLIj2UtRdpMlDlNzee+Yy0eb+79 iX9r6k56SMOpjGARX4ZTSxqxa87hh/fX/NjpnXKxzj8EGq8MaQAMeokKdi3uc7zdKiXf TC7Q== X-Gm-Message-State: AKS2vOzzwEhw3nqGZ7fQK0joI0MwlBxx5sznlssbqhZpMcKuPSYNf0EV IX2lUAl1ApuDqcoJN2YxouMRWNtfmqs1tBuBipmmDbmVDHWm0i8Jqmn1EnzcwzTqXTrVO+GUp3l AoS9b/7fUP7n4hXu50AMaSXRH0xEu7oDPD9dPnahnhmUSRg== X-Received: by 10.200.4.162 with SMTP id s34mr22716993qtg.35.1496628412855; Sun, 04 Jun 2017 19:06:52 -0700 (PDT) X-Received: by 10.200.4.162 with SMTP id s34mr22716973qtg.35.1496628412633; Sun, 04 Jun 2017 19:06:52 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.12.146.7 with HTTP; Sun, 4 Jun 2017 19:06:32 -0700 (PDT) In-Reply-To: References: From: Karl Johan Alm Date: Mon, 5 Jun 2017 11:06:32 +0900 Message-ID: To: Alex Akselrod Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org X-Mailman-Approved-At: Mon, 05 Jun 2017 02:17:45 +0000 Cc: Bitcoin Dev Subject: Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jun 2017 02:07:11 -0000 On Sat, Jun 3, 2017 at 2:55 AM, Alex Akselrod via bitcoin-dev wrote: > Without a soft fork, this is the only way for light clients to verify that > peers aren't lying to them. Clients can request headers (just hashes of the > filters and the previous headers, creating a chain) and look for conflicts > between peers. If a conflict is found at a certain block, the client can > download the block, generate a filter, calculate the header by hashing > together the previous header and the generated filter, and banning any peers > that don't match. A full node could prune old filters if you wanted and > recalculate them as necessary if you just keep the filter header chain info > as really old filters are unlikely to be requested by correctly written > software but you can't guarantee every client will follow best practices > either. Ahh, so you actually make a separate digest chain with prev hashes and everything. Once/if committed digests are soft forked in, it seems a bit overkill but maybe it's worth it. (I was always assuming committed digests in coinbase would come after people started using this, and that people could just ask a couple of random peers for the digest hash and ensure everyone gave the same answer as the hash of the downloaded digest..). > The simulations are based on completely random data within given parameters. I noticed an increase in FP hits when using real data sampled from real scriptPubKeys and such. Address reuse and other weird stuff. See "lies.h" in github repo for experiments and chainsim.c initial part of main where wallets get random stuff from the chain. > I will definitely try to reproduce my experiments with Golomb-Coded > sets and see what I come up with. It seems like you've got a little > less than half the size of my digests for 1-block digests but I > haven't tried making digests for all blocks (and lots of early blocks > are empty). > > > Filters for empty blocks only take a few bytes and sometimes zero when the > coinbase output is a burn that doesn't push any data (example will be in the > test vectors that I'll have ready shortly). I created digests for all blocks up until block #469805 and actually ended up with 5.8 GB, which is 1.1 GB lower than what you have, but may be worse perf-wise on false positive rates and such. > How fast are these to create? Would it make sense to provide digests > on demand in some cases, rather than keeping them around indefinitely? > > > They're pretty fast and can be pruned if desired, as mentioned above, as > long as the header chain is kept. For comparison, creating the digests above (469805 of them) took roughly 30 mins on my end, but using the kstats format so probably higher on an actual node (should get around to profiling that...).