summaryrefslogtreecommitdiff
path: root/9a/fb06c19e6a578d2a3aeaa8611907bdeae5728e
blob: 0a058212aa43baf10be753e386d510cc4514f1e7 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
Return-Path: <karljohan-alm@garage.co.jp>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id EA60F949
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  2 Jun 2017 06:00:58 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from mo.garage.hdemail.jp (mo.garage.hdemail.jp [46.51.242.127])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id E61BE193
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  2 Jun 2017 06:00:57 +0000 (UTC)
Received: from ip-10-217-1-36.ap-northeast-1.compute.internal
	(localhost.localdomain [127.0.0.1])
	by mo.garage.hdemail.jp (hde-mf-postfix) with SMTP id 49F9114C0CA
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  2 Jun 2017 15:00:56 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
X-Received: from unknown (HELO mo.garage.hdemail.jp) (127.0.0.1)
	by 0 with SMTP; 2 Jun 2017 15:00:56 +0900
X-Received: from mo.garage.hdemail.jp (localhost.localdomain [127.0.0.1])
	by mo.garage.hdemail.jp (hde-ma-postfix) with ESMTP id 3921D4C085
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  2 Jun 2017 15:00:56 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
Received: from gw21.oz.hdemail.jp
	(ip-10-172-186-126.ap-northeast-1.compute.internal [10.172.186.126])
	by mo.garage.hdemail.jp (hde-mf-postfix) with ESMTP id 300D714C0CA
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  2 Jun 2017 15:00:56 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
X-Durian-MailFrom: karljohan-alm@garage.co.jp
X-Durian-RcptTo: bitcoin-dev@lists.linuxfoundation.org
Received: from gw21.oz.hdemail.jp (gw21.oz.hdemail.jp [127.0.0.1])
	by gw21.oz.hdemail.jp (gw21.oz.hdemail.jp [127.0.0.1]);
	Fri, 2 Jun 2017 15:00:53 +0900
X-Received: from mail-qt0-f200.google.com (lb1.oz.lo.hdemail.jp
	[54.248.222.53])
	(using TLSv1 with cipher AES128-SHA (128/128 bits))
	(No client certificate requested)
	by gw21.oz.hdemail.jp (Postfix) with ESMTP id 33107148C13C
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri,  2 Jun 2017 15:00:53 +0900 (JST)
X-Received: by mail-qt0-f200.google.com with SMTP id g53so16743079qta.3
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu, 01 Jun 2017 23:00:53 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to:cc;
	bh=FXjjkfpcgOyyAnEXTphUXQpUC2BSgVhGjOp00nM8QrA=;
	b=Fs0CzULcpLHsw6A30waGAMC4Qx7xjbA5lNu5oBJdN7egztQR4kekgdBch//Dvfd2/6
	fDKcTlGaW75j2FvmeXrKUZ2rMohN8568o0OYxkd9eJODcz2ViiBYlDviAzam2OUVMfEC
	giqcj8t97YQJLmMUQyF4nnZ3ZeQbaBuz2EQKeqt3li1oWU2TKD6n1UmmvvdUhhwkMA30
	IKZXtOudM5bPzJfFozwHFJDIeffZH3RYrhrKgdTXJXoJzJGFVrL4Hy4JwmCL9rn6HPHs
	JVwCU10yP2cSqr1UexncrII8nngKvUno5+I6GIZw1r2c+CNOzLA9sbAPgUKZL5uyH5ue
	SnMw==
X-Gm-Message-State: AODbwcCp2jPAe+HzWoGGzpxxd6OWLD4fnyVLsV7+k4Eoif2wACZR7dlN
	A+5s1QQDW8ojykDZqBb03L9XSGpKGhShRTKiYBdyFs3nV+0VokeYeNqrZiu3vCZrfxc4RZ59LvT
	AXQ8YTAwDG/fkHlobeFEcRFPYG1kHVaWJ2LL05yBK95iJEQ==
X-Received: by 10.237.35.211 with SMTP id k19mr6720483qtc.125.1496383251118;
	Thu, 01 Jun 2017 23:00:51 -0700 (PDT)
X-Received: by 10.237.35.211 with SMTP id k19mr6720453qtc.125.1496383250870;
	Thu, 01 Jun 2017 23:00:50 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.12.146.10 with HTTP; Thu, 1 Jun 2017 23:00:30 -0700 (PDT)
In-Reply-To: <CAO3Pvs8ccTkgrecJG6KFbBW+9moHF-FTU+4qNfayeE3hM9uRrg@mail.gmail.com>
References: <CAO3Pvs8ccTkgrecJG6KFbBW+9moHF-FTU+4qNfayeE3hM9uRrg@mail.gmail.com>
From: Karl Johan Alm <karljohan-alm@garage.co.jp>
Date: Fri, 2 Jun 2017 15:00:30 +0900
Message-ID: <CALJw2w5gUgbdX7XnxPsK2FZ6PZ5cSTgmCEqiPu7-S4gwXBM-_Q@mail.gmail.com>
To: Olaoluwa Osuntokun <laolu32@gmail.com>
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Fri, 02 Jun 2017 12:09:27 +0000
Cc: Arnoud Kouwenhoven - Pukaki Corp via bitcoin-dev
	<bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for
 Light Clients
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 02 Jun 2017 06:00:59 -0000

Hello,

Really wish I'd known you were working on this a few weeks ago, but
such is life. Hopefully I can provide some useful feedback.

On Fri, Jun 2, 2017 at 4:01 AM, Olaoluwa Osuntokun via bitcoin-dev
<bitcoin-dev@lists.linuxfoundation.org> wrote:
> Full-nodes
> maintain an additional index of the chain, and serve this compact filter
> (the index) to light clients which request them. Light clients then fetch
> these filters, query the locally and _maybe_ fetch the block if a relevant
> item matches.

Is it necessary to maintain the index all the way to the beginning of
the chain? When would clients request "really old digests" and why?

> One specific area we'd like feedback on is the parameter selection. Unlike
> BIP-37 which allows clients to dynamically tune their false positive rate,
> our proposal uses a _fixed_ false-positive. Within the document, it's
> currently specified as P = 1/2^20. We've done a bit of analysis and
> optimization attempting to optimize the following sum:
> filter_download_bandwidth + expected_block_false_positive_bandwidth. Alex
> has made a JS calculator that allows y'all to explore the affect of
> tweaking the false positive rate in addition to the following variables:
> the number of items the wallet is scanning for, the size of the blocks,
> number of blocks fetched, and the size of the filters themselves. The
> calculator calculates the expected bandwidth utilization using the CDF of
> the Geometric Distribution. The calculator can be found here:
> https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
> script he's been running on actual data, and the results seem to match up
> rather nicely.

I haven't tried the tool yet, and maybe it will answer some of my questions.

On what data were the simulated wallets on actual data based? How did
false positive rates for wallets with lots of items (pubkeys etc) play
out? Is there a maximum number of items for a wallet before it becomes
too bandwidth costly to use digests?

> We we're excited to see that Karl Johan Alm (kallewoof) has done some
> (rather extensive!) analysis of his own, focusing on a distinct encoding
> type [5]. I haven't had the time yet to dig into his report yet, but I
> think I've read enough to extract the key difference in our encodings: his
> filters use a binomial encoding _directly_ on the filter contents, will we
> instead create a Golomb-Coded set with the contents being _hashes_ (we use
> siphash) of the filter items.

I will definitely try to reproduce my experiments with Golomb-Coded
sets and see what I come up with. It seems like you've got a little
less than half the size of my digests for 1-block digests but I
haven't tried making digests for all blocks (and lots of early blocks
are empty).

On the BIP proposal itself:

In Compact Filter Header Chain, you mention that clients should
download filters from nodes if filter_headers is not identical, and
ban offending nodes. What about temporary forks in the chain? What
about longer forks? In general, I am curious how you will deal with
reorgs and temporary non-consensus related chain splits.

I am also curious if you have considered digests containing multiple
blocks. Retaining a permanent binsearchable record of the entire chain
is obviously too space costly, but keeping the last X blocks as
binsearchable could speed up syncing for clients tremendously, I feel.

It may also be space efficient to ONLY store older digests in chunks
of e.g. 8 blocks. A client syncing up finding a match in an 8-block
chunk would have to grab those 8 blocks, but if it's not recent, that
may be acceptable. It may even be possible to make 4-, 2-, 1-block
digests on demand.

How fast are these to create? Would it make sense to provide digests
on demand in some cases, rather than keeping them around indefinitely?