summaryrefslogtreecommitdiff
path: root/4f/ab0348f4f02817d7059de610b12843518bfaa0
blob: bd4f02df3065c848adbb6fafe0617cd301543f64 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
Return-Path: <karljohan-alm@garage.co.jp>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id BF5434A3
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu,  1 Jun 2017 02:12:18 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from mo.garage.hdemail.jp (mo.garage.hdemail.jp [46.51.242.127])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id DD3B915F
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu,  1 Jun 2017 02:12:17 +0000 (UTC)
Received: from ip-10-217-1-36.ap-northeast-1.compute.internal
	(localhost.localdomain [127.0.0.1])
	by mo.garage.hdemail.jp (hde-mf-postfix) with SMTP id 62F6D14C0C5
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu,  1 Jun 2017 11:12:16 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
X-Received: from unknown (HELO mo.garage.hdemail.jp) (127.0.0.1)
	by 0 with SMTP; 1 Jun 2017 11:12:16 +0900
X-Received: from mo.garage.hdemail.jp (localhost.localdomain [127.0.0.1])
	by mo.garage.hdemail.jp (hde-ma-postfix) with ESMTP id 57E0A4C085
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu,  1 Jun 2017 11:12:16 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
Received: from gw19.oz.hdemail.jp
	(ip-10-125-134-49.ap-northeast-1.compute.internal [10.125.134.49])
	by mo.garage.hdemail.jp (hde-mf-postfix) with ESMTP id 5493A14C0C5
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu,  1 Jun 2017 11:12:16 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
X-Durian-MailFrom: karljohan-alm@garage.co.jp
X-Durian-RcptTo: bitcoin-dev@lists.linuxfoundation.org
Received: from gw19.oz.hdemail.jp (localhost [127.0.0.1])
	by gw19.oz.hdemail.jp (localhost [127.0.0.1]);
	Thu, 1 Jun 2017 11:12:11 +0900
X-Received: from mail-qk0-f197.google.com (lb1.oz.lo.hdemail.jp
	[54.248.222.53])
	(using TLSv1 with cipher AES128-SHA (128/128 bits))
	(No client certificate requested)
	by gw19.oz.hdemail.jp (Postfix) with ESMTP id 185B8148C0D0
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Thu,  1 Jun 2017 11:12:11 +0900 (JST)
X-Received: by mail-qk0-f197.google.com with SMTP id k9so10125908qkh.10
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Wed, 31 May 2017 19:12:10 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:from:date:message-id:subject:to;
	bh=3lcRQO3QQ01aVFC5zmcu47x+gjandIReTwlVljWVSdU=;
	b=XXhAEnRux3U6vJsNz+MkX9e1rG4T8W2EzRmJKzizvepIkehen3wgsMizaGgYzpvLpk
	ea1PdyiZhEikQkxI8whvJFUdvgpEdGp18eNVPJQeukFbo0J4sQqipbkoGqIVFiKcKOuQ
	ERjIlQnc3coOPlikju527BYryIt/8G02Q296DKouzNsG37Iy2vJq0Y9DoFC7iYtP3GJp
	NbzlioW39YYp2SB+ZnAgMF9gfurzfJg/bW7R2L4dMqFC4ia84vc86L29qlHSFFMprO1c
	Voqq1bT58JT+lIh5kQhgV3mcTNPNCYWL0VeiRGUGD4ZvqHN7gBSXnEYzulztTsfHSNbR
	Kixw==
X-Gm-Message-State: AODbwcB/VlFzLv8zzIX/xFNhwSTSR2bDPcEctbl5NsuUJjXgjhfJZG21
	Q3XBFGAT/B1ILFh2kitY24Kh/HvdCke4SBT3HTJrjf04BxrSjQ7ZkJums7EfnYovuqHMGxmY/f7
	tWcNvehUdGp2uykb4c+b3bULgiwFvb+5Fgd3q/NVVCBYwkw==
X-Received: by 10.237.53.3 with SMTP id a3mr32749483qte.207.1496283129052;
	Wed, 31 May 2017 19:12:09 -0700 (PDT)
X-Received: by 10.237.53.3 with SMTP id a3mr32749471qte.207.1496283128867;
	Wed, 31 May 2017 19:12:08 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.12.137.38 with HTTP; Wed, 31 May 2017 19:11:48 -0700 (PDT)
From: Karl Johan Alm <karljohan-alm@garage.co.jp>
Date: Thu, 1 Jun 2017 11:11:48 +0900
Message-ID: <CALJw2w6zrdbn-xvDk9CpayqoDedbi5pFuwDyipeNO4hd9WjOog@mail.gmail.com>
To: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Thu, 01 Jun 2017 02:15:55 +0000
Subject: [bitcoin-dev] Block Filter Digest profiling
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Thu, 01 Jun 2017 02:12:18 -0000

Hello,

I have spent a fair bit of time trying to nail how exactly block
filter digests[1] should be done to optimize bandwidth, space,
resource usage.

The report can be found here: http://bc-2.jp/bfd-profile.pdf

This graph shows bandwidth use of 200 wallets simulated over 5000
blocks: http://bc-2.jp/bandwidth_bfd.png (black line is "sync once per
block" wallet, yellow is "sync once per 144 blocks" wallet, red is
average across all wallets).

An interesting insight made during the experiments: when allowing
digests to contain multiple blocks, the false positive rate of high
block count digests can be higher than normal, because the probability
of a false positive hit for a given entry in multiple digests,
assuming their sizes differ, is almost completely independent.

The results look rather promising to me, but I would like to hear
comments, in particular on the approach taken, if I made any faulty
assumptions, bad math mistakes, etc.

I am also curious what people consider to be acceptable costs in terms
of bandwidth use and memory (I couldn't find any stats on bandwidth
use of bloom filters). In the profiling, I restricted the field sizes
to 2^27 = 128 MB. I assumed this was appropriate as these fields are
very short lived, and in worst case, a client *could* do the scan and
decode simultaneously, without allocating up the space for the field
at all. For high block count digests (e.g. 1024 blocks), this is
sometimes overfilled. I wonder if 2^28 (256 MB) fields would be at all
acceptable or if an over-filled (high false positive rate) field is
better.

For that matter, I am not entirely sure 1024-block digests are
necessary, but they do come with an average 15 kb/block which is
pretty good.

I also wonder if the serialization approach taken is overkill or not.
It does save some space instead of simply storing "BBBAAAAA" but adds
some complexity that may not be warranted.

[1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012636.html