Return-Path: <david.vorick@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id BF871720
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 31 Mar 2017 18:23:03 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-wr0-f170.google.com (mail-wr0-f170.google.com
	[209.85.128.170])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 07C75192
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 31 Mar 2017 18:23:02 +0000 (UTC)
Received: by mail-wr0-f170.google.com with SMTP id l43so114217397wre.1
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 31 Mar 2017 11:23:02 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
	:cc; bh=5DtG5G8XQEW/CpiDvOQnczxa+3bzUWBeks1JSCo0o1s=;
	b=iz5imxCnqDcCPy9j7PonZDLLjktuUrXWGSkm4b6G/YQUz5JiIUHqx59qbMkpS5T5Ez
	IQWdaOobNGgTDJmR/IvQvwV63mm7+fCbCqZ0txbbWKfxdxVBFmJd5tX5VtCq/3c95VQs
	Wod4afrKcgYMhgWpLtEXtwRT03NGcF62Yt+DJ04UTAm1AC3qJN1+qXqqvUpynL1qKJAU
	H6oa1uD7ic3bycG09Xsa2uvSykirmelOCidTi6JLU+vFG/yaj/eFl49FsJzt9lvAY18t
	LLfm5HPnOtK3ghogRJ5khWwyipuQdMmnvU2wjaMiLbD/PFHYabvVfQe/3r2oheTen7YO
	zmpg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to:cc;
	bh=5DtG5G8XQEW/CpiDvOQnczxa+3bzUWBeks1JSCo0o1s=;
	b=XO4boyp/a3w8635NHEw3rZsHi4P4oPmT4WxhLSvfG2XnJnY7A1D9WEQBxIg2QpFSXw
	A0Qb2B156qx5eC0+oVf1b35TcaZhQT775DVrB3VAtOcyg1Alc/g2xcXKvd3rG7ROsgfs
	NBu3//RdgT6N5c8ZD6fCNwa+Kfp+Mfx0kFxVsMt8oSx9Hnp5tgFF6B9PWDwip7bFni7K
	yGnq5+OvQY32RTm3A6mrLJOMR0kD6yGkg4lsVeghbm2nlrvKf8P2iz3eo7asIU9Zie/2
	uaehWle5Te+pV/PKayRXTt2hfQjTKVIAD230vxePL0YktAmJopDkRLWN6h65NzAInd+n
	z8hg==
X-Gm-Message-State: AFeK/H3menUzJJspw5VKs/OZnUGWAhreBSFpCPyLRTV+zQRpCV+yg/kRn13A6ruRmkrip88xRPBcVw6LEK+wng==
X-Received: by 10.223.148.102 with SMTP id 93mr3844416wrq.144.1490984581537;
	Fri, 31 Mar 2017 11:23:01 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.55.9 with HTTP; Fri, 31 Mar 2017 11:23:00 -0700 (PDT)
In-Reply-To: <CAD1TkXvmo8ygbFdPJxdiL5-QiN6+ujeSgOJ7_4eit43aZ2bzyg@mail.gmail.com>
References: <CAEgR2PEG1UMqY0hzUH4YE_an=qOvQUgfXreSRsoMWfFWxG3Vqg@mail.gmail.com>
	<CAFVRnyq9Qgw88RZqenjQTDZHEWeuNCdh12Dq7wCGZdu9ZuEN9w@mail.gmail.com>
	<CAD1TkXvd4yLHZDAdMi78WwJ_siO1Vt7=DgYiBmP45ffVveuHBg@mail.gmail.com>
	<SINPR04MB1949AB581C6870184445E0C4C2340@SINPR04MB1949.apcprd04.prod.outlook.com>
	<CAD1TkXsj53JRYhqot2aHSQR+HEDKm7+6S5kGtaLYBCoc24PuWg@mail.gmail.com>
	<SINPR04MB1949A0AF3AD33B4664417068C2370@SINPR04MB1949.apcprd04.prod.outlook.com>
	<CAD1TkXtPZ7w+qYqr_hvyeq95aJ2ge1YYkoC1taDkzv1vEMKpog@mail.gmail.com>
	<SINPR04MB1949BE883C69CFF1477AFAEFC2370@SINPR04MB1949.apcprd04.prod.outlook.com>
	<CAD1TkXvXYX0f+jMMc41vhANuKfw-rNg9tUOG0bCS=T-YGYYjPw@mail.gmail.com>
	<CAFVRnyqSMVj2Ttc4_5vuk73Z5yRJdxeSodvkdjqsrHbgghcmUQ@mail.gmail.com>
	<CAD1TkXvmo8ygbFdPJxdiL5-QiN6+ujeSgOJ7_4eit43aZ2bzyg@mail.gmail.com>
From: David Vorick <david.vorick@gmail.com>
Date: Fri, 31 Mar 2017 14:23:00 -0400
Message-ID: <CAFVRnyr-Z9YWtT3r+7-fGejzgxKhH3-kQuo8JQFqKDpyZNBBdg@mail.gmail.com>
To: Jared Lee Richardson <jaredr26@gmail.com>
Content-Type: multipart/alternative; boundary=94eb2c0d2574592900054c0ae660
X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,LOTS_OF_MONEY,
	RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM autolearn=no version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Cc: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Hard fork proposal from last week's meeting
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 31 Mar 2017 18:23:03 -0000

--94eb2c0d2574592900054c0ae660
Content-Type: text/plain; charset=UTF-8

Sure, your math is pretty much entirely irrelevant because scaling systems
to massive sizes doesn't work that way.

At 400B transactions per year we're looking at block sizes of 4.5 GB, and a
database size of petabytes. How much RAM do you need to process blocks like
that? Can you fit that much RAM into a single machine? Okay, you can't fit
that much RAM into a single machine. So you have to rework the code to
operate on a computer cluster.
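
A rough sketch of the arithmetic behind those figures, in case it helps. The
average transaction size is an assumption (it is not stated in this thread);
590 bytes is picked only because it roughly reproduces the 4.5 GB number, and
the 10 minute interval is Bitcoin's usual block target.

```python
# Back-of-the-envelope block size at 400B transactions per year.
# AVG_TX_BYTES is an assumption, not a figure taken from this thread.

SECONDS_PER_YEAR = 365 * 24 * 3600
BLOCK_INTERVAL_S = 600  # Bitcoin's ~10 minute block target


def block_size_bytes(tx_per_year, avg_tx_bytes, block_interval_s=BLOCK_INTERVAL_S):
    """Average block size if the yearly volume is spread evenly over all blocks."""
    blocks_per_year = SECONDS_PER_YEAR / block_interval_s
    return tx_per_year / blocks_per_year * avg_tx_bytes


def chain_growth_bytes_per_year(tx_per_year, avg_tx_bytes):
    """Raw transaction data added per year, ignoring indexes and other overhead."""
    return tx_per_year * avg_tx_bytes


TX_PER_YEAR = 400e9
AVG_TX_BYTES = 590  # assumed average transaction size in bytes

block_gb = block_size_bytes(TX_PER_YEAR, AVG_TX_BYTES) / 1e9
growth_tb = chain_growth_bytes_per_year(TX_PER_YEAR, AVG_TX_BYTES) / 1e12
print(f"average block size: {block_gb:.2f} GB")
print(f"raw chain growth:   {growth_tb:.0f} TB/year")
```

Under these assumptions the raw data alone grows by a couple hundred TB per
year, so it crosses into petabytes within roughly five years, before counting
any indexes or replicas.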

Already we've hit a significant problem. You aren't going to rewrite
Bitcoin to do block validation on a computer cluster overnight. Further,
are storage costs consistent when we're talking about setting up clusters?
Are bandwidth costs consistent when we're talking about setting up
clusters? Are RAM and CPU costs consistent when we're talking about setting
up clusters? No, they aren't. Clusters are a lot more expensive to set up
per-resource because they need to talk to each other and synchronize with
each other, and you have a LOT more parts, so you have to build in
redundancies that aren't necessary in non-clusters.

Also worth pointing out that peak transaction volumes are typically 20-50x
the size of typical transaction volumes. So your cluster can't just plan
for 15k transactions per second; you're really looking at more like 200k
or even 500k transactions per second to handle peak volumes. And if it
can't, you're still going to see full blocks.
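
For anyone who wants to redo the peak estimate with their own inputs, here is
a minimal sketch. The 400B per year volume and the 20-50x multipliers come
from this thread; the function names are just illustrative. The average comes
out a little under 15k per second, and the peak range lands in the same
few-hundred-thousand ballpark.

```python
# Average and peak transactions per second for a given yearly volume.

SECONDS_PER_YEAR = 365 * 24 * 3600


def average_tps(tx_per_year):
    """Transactions per second if the yearly volume were perfectly even."""
    return tx_per_year / SECONDS_PER_YEAR


def peak_tps_range(avg_tps, low_mult=20, high_mult=50):
    """Peak load range, assuming peaks are low_mult..high_mult times the average."""
    return avg_tps * low_mult, avg_tps * high_mult


avg = average_tps(400e9)
low, high = peak_tps_range(avg)
print(f"average: {avg:,.0f} tx/s")
print(f"peak:    {low:,.0f} to {high:,.0f} tx/s")
```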

You'd need a handful of experts just to maintain such a thing. Disks are
going to be failing every day when you are storing multiple PB, so you
can't just count a flat cost of $20/TB and expect that to work. You're
going to need redundancy and tolerance so that you don't lose the system
when a few of your hard drives all fail within minutes of each other. And
you need a way to rebuild everything without taking the system offline.
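
To make the "flat $20/TB doesn't work" point concrete, here is a small sketch
of how redundancy inflates the price of usable storage. The replication
factor and the spare-capacity fraction are made-up illustrative values, not
numbers from any real deployment, and the model ignores labor, power, and
networking entirely.

```python
# Effective cost per usable TB once redundancy and rebuild headroom are included.
# replication_factor and spare_fraction are hypothetical illustrative inputs.


def usable_cost_per_tb(raw_cost_per_tb, replication_factor, spare_fraction):
    """Cost of one usable TB when every byte is stored replication_factor times
    and spare_fraction of the raw capacity is held back for rebuilds."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 0 <= spare_fraction < 1:
        raise ValueError("spare_fraction must be in [0, 1)")
    return raw_cost_per_tb * replication_factor / (1 - spare_fraction)


flat = usable_cost_per_tb(20, replication_factor=1, spare_fraction=0.0)
redundant = usable_cost_per_tb(20, replication_factor=3, spare_fraction=0.2)
print(f"flat:      ${flat:.2f} per usable TB")
print(f"redundant: ${redundant:.2f} per usable TB")
```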

This isn't even my area of expertise. I'm sure there are a dozen other
significant issues that one of the Visa architects could tell you about
when dealing with mission-critical data at this scale.

--------

Massive systems operate very differently and are much more costly per-unit
than tiny systems. Once we grow the blocksize large enough that a single
computer can't do all the processing all by itself we get into a world of
much harder, much more expensive scaling problems. Especially because we're
talking about a distributed system where the nodes don't even trust each
other. And transaction processing is largely non-parallel. You have to
check each transaction against every other transaction to make sure that
they aren't double spending each other. This takes synchronization and
prevents 500 CPUs from all crunching the data concurrently. You have to be
a lot more clever than that to get things working and consistent.
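
A simplified sketch of where the synchronization comes from. This is not
Bitcoin Core's actual validation code; the OutPoint and Tx types and the
find_double_spends helper are hypothetical names for illustration. In
practice the check is usually done against one shared set of already-spent
outputs rather than by literal pairwise comparison, and that shared set is
exactly the thing every worker would have to coordinate on.

```python
# Detect transactions in one block that spend the same output twice.
# Types and function names here are illustrative, not from any real codebase.
from typing import List, NamedTuple, Tuple


class OutPoint(NamedTuple):
    txid: str   # transaction that created the output
    index: int  # position of the output within that transaction


class Tx(NamedTuple):
    txid: str
    inputs: Tuple[OutPoint, ...]


def find_double_spends(txs: List[Tx]) -> List[Tuple[str, str, OutPoint]]:
    """Return (first_spender, second_spender, outpoint) for every conflict."""
    spent = {}      # shared state: outpoint -> txid that spent it first
    conflicts = []
    for tx in txs:
        for outpoint in tx.inputs:
            if outpoint in spent:
                conflicts.append((spent[outpoint], tx.txid, outpoint))
            else:
                spent[outpoint] = tx.txid
    return conflicts


coin = OutPoint("aa", 0)
block = [
    Tx("t1", (coin,)),
    Tx("t2", (OutPoint("bb", 1),)),
    Tx("t3", (coin,)),  # spends the same output as t1
]
print(find_double_spends(block))
```

Splitting the block across many CPUs means every one of them has to read and
update that same spent map, so the work cannot be handed out as independent
chunks without extra coordination.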

When talking about scalability problems, you should ask yourself what other
systems in the world operate at the scales you are talking about. None of
them have cost structures as low as the 6 digit range, and I'd bet
(without actually knowing) that none of them have cost structures as low
as the 7 digit range either. In fact I know from working in a related
industry that the cost structures for the datacenters (plus the support
engineers, plus the software management, etc.) that do airline ticket
processing are above $5 million per year for the larger airlines. Visa is
probably even more expensive than that (though I can only speculate).

--94eb2c0d2574592900054c0ae660--