Received: from sog-mx-3.v43.ch3.sourceforge.com ([172.29.43.193]
	helo=mx.sourceforge.net)
	by sfs-ml-2.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <mh.in.england@gmail.com>) id 1UPW0J-0001vb-87
	for bitcoin-development@lists.sourceforge.net;
	Tue, 09 Apr 2013 10:42:19 +0000
Received-SPF: pass (sog-mx-3.v43.ch3.sourceforge.com: domain of gmail.com
	designates 209.85.214.180 as permitted sender)
	client-ip=209.85.214.180; envelope-from=mh.in.england@gmail.com;
	helo=mail-ob0-f180.google.com; 
Received: from mail-ob0-f180.google.com ([209.85.214.180])
	by sog-mx-3.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128)
	(Exim 4.76) id 1UPW0I-0000JJ-5b
	for bitcoin-development@lists.sourceforge.net;
	Tue, 09 Apr 2013 10:42:19 +0000
Received: by mail-ob0-f180.google.com with SMTP id un3so2507516obb.25
	for <bitcoin-development@lists.sourceforge.net>;
	Tue, 09 Apr 2013 03:42:12 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.182.105.2 with SMTP id gi2mr8789538obb.15.1365504132699;
	Tue, 09 Apr 2013 03:42:12 -0700 (PDT)
Sender: mh.in.england@gmail.com
Received: by 10.76.162.198 with HTTP; Tue, 9 Apr 2013 03:42:12 -0700 (PDT)
In-Reply-To: <CA+8xBpc5iV=prakWKkNFa0O+tgyhoHxJ9Xwz6ubhPRUBf_95KA@mail.gmail.com>
References: <CA+8xBpc5iV=prakWKkNFa0O+tgyhoHxJ9Xwz6ubhPRUBf_95KA@mail.gmail.com>
Date: Tue, 9 Apr 2013 12:42:12 +0200
X-Google-Sender-Auth: aQxMZbXYc4RqQErJjTmX6Q4g88c
Message-ID: <CANEZrP1EKaHbpdC6X=9mvyJHC_cvW7u5p9nqM7EwkEypAg4Xmg@mail.gmail.com>
From: Mike Hearn <mike@plan99.net>
To: Jeff Garzik <jgarzik@exmulti.com>
Content-Type: multipart/alternative; boundary=e89a8ff1cdf0c52ed604d9eb3452
X-Spam-Score: -0.5 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
	sender-domain
	0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	(mh.in.england[at]gmail.com)
	-0.0 SPF_PASS               SPF: sender matches SPF record
	1.0 HTML_MESSAGE           BODY: HTML included in message
	0.1 DKIM_SIGNED            Message has a DKIM or DK signature,
	not necessarily valid
	-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
X-Headers-End: 1UPW0I-0000JJ-5b
Cc: Bitcoin Development <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] On-going data spam
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Tue, 09 Apr 2013 10:42:19 -0000

--e89a8ff1cdf0c52ed604d9eb3452
Content-Type: text/plain; charset=UTF-8

OK, as the start of that conversation is now on the list, I might as well
post the other thoughts we had. Or at least that I had :)

It's tempting to see this kind of abuse through the lens of fees, because
we only have a few hammers and so everything looks like a kind of nail. The
problem is the moment you try to define "abuse" economically you end up
excluding legitimate and beneficial uses as well. Maybe Peter's patch for
uneconomical outputs is different because of how it works. But mostly it's
true. In this case, fees would never work - Peter said the guy who uploaded
Wikileaks paid something like $500 to do it. I guess by now it's more like
$600-$700. It's hard for regular end users to compete with that kind of
wild-eyed dedication to "the cause".

The root problem here is people believe the block chain is a data structure
that will live forever and be served by everyone for free, in perpetuity,
and is thus the perfect place for "uncensorable" stuff. That's a reasonable
assumption given how Bitcoin works today. But there's no reason it will be
true in the long run (I know this can be an unpopular viewpoint).

Firstly, legal issues - I think it's very unlikely any sane court would
care about illegal stuff in the block chain given you need special tools to
extract it (mens rea). Besides, I guess most end users will end up on SPV
clients as they mature. So these users already don't have a copy of the
entire block chain. I don't worry too much about this.

Secondly, the need to host blocks forever. In future, many (most?) full
nodes will be pruning, and won't actually store old blocks at all. They'll
just have the UTXO database, some undo blocks and some number of old blocks
for serving, probably whatever fits in the amount of disk space the user is
willing to allocate. But very old blocks will have been deleted.
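
A minimal sketch of that pruning policy, assuming blocks are held as a
height-to-bytes mapping and the user supplies a disk budget. The name
prune_blocks and the data layout are hypothetical, not taken from any
real client:

```python
def prune_blocks(blocks, budget_bytes):
    """Keep the newest blocks that fit in budget_bytes.

    blocks maps height -> raw block bytes (a hypothetical layout).
    Returns (kept, pruned_heights).
    """
    kept = {}
    used = 0
    # Walk from the chain tip backwards so the newest blocks win.
    for height in sorted(blocks, reverse=True):
        size = len(blocks[height])
        if used + size > budget_bytes:
            break  # everything older than this gets deleted
        kept[height] = blocks[height]
        used += size
    pruned = sorted(h for h in blocks if h not in kept)
    return kept, pruned
```

A real node would also keep the UTXO database and undo data regardless
of the budget; this only models which old blocks stay available for
serving.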

This leads to the question of what incentives people have to not prune. The
obvious incentive is money - charge for access to older parts of the chain.
The fewer people that host it, the more you can charge. In the worst case
scenario where, you know, only 10 different organizations store a copy of
the chain, it might mean that bootstrapping a new node in a trust-less
manner is expensive. But I really doubt it'd ever get so few. Serving large
static datasets just isn't that expensive. Also, you don't actually need to
replay from the genesis block to bring up a new node; you can copy the UTXO
database from somewhere else. By comparing the databases of lots of
different nodes together, the chances of you being in a matrix-like sybil
world can be reduced to "beyond reasonable doubt". Maybe nodes would charge
for copies of their database too, but ideally there are lots of nodes and
so the charge for that should be so close to zero as makes no odds - you
can trivially undercut someone by buying access to the dataset and then
reselling it for a bit less, so the price should converge on the actual
cost of providing the service. Which will be very cheap.
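
The database comparison could look roughly like the sketch below. It
assumes each peer can hand over a digest of its UTXO set and that the
set can be serialized in a canonical order; no such message exists in
the protocol today, and snapshot_digest, agreed_digest and the
thresholds are all hypothetical:

```python
import hashlib
from collections import Counter

def snapshot_digest(utxos):
    """Hash a UTXO set given as {(txid_hex, vout): (value, script_hex)}.

    Entries are sorted first so every honest node gets the same digest.
    """
    h = hashlib.sha256()
    for (txid, vout), (value, script) in sorted(utxos.items()):
        h.update(f"{txid}:{vout}:{value}:{script}\n".encode())
    return h.hexdigest()

def agreed_digest(peer_digests, min_peers=8, min_fraction=0.9):
    """Return the digest most peers report, or None if agreement is weak."""
    if len(peer_digests) < min_peers:
        return None  # too few samples to say anything
    digest, count = Counter(peer_digests).most_common(1)[0]
    if count / len(peer_digests) >= min_fraction:
        return digest
    return None
```

Agreement only means something if the sampled peers are independent of
each other, which is exactly what a sybil attacker tries to prevent, so
where the peers come from matters more than the thresholds.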

There was one last thought I had, which is that if there's a shorter-term
need to discourage this kind of thing we can use a network/bandwidth related
hack by changing the protocol. Nodes can serve up blocks encrypted under a
random key. You only get the key when you finish the download. A blacklist
can apply to Bloom filtering such that transactions which are known to be
"abusive" require you to fully download the block rather than select the
transactions with a filter. This means that people can still access the
data in the chain, but the older it gets the slower and more bandwidth
intensive it becomes. Stuffing Wikileaks into the chain sounds good when a
20 line Python script can extract it "instantly". If someone who wants the
files has to download gigabytes of padding around it first, suddenly
hosting it on a Tor hidden service becomes more attractive.
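
A toy sketch of the encrypted-download idea, assuming the server picks
a fresh random key per transfer and only reveals it once every byte of
ciphertext has been sent. BlockServer and its methods are hypothetical,
and the XOR keystream built from SHA-256 is for illustration only, not
a vetted cipher:

```python
import hashlib
import os

def keystream(key, n):
    """Derive n pseudo-random bytes from key (illustrative, not vetted)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def xor_with_key(data, key):
    """Encrypt or decrypt data by XOR with the keystream."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

class BlockServer:
    """Serves one block encrypted under a random per-transfer key."""

    def __init__(self, block):
        self.key = os.urandom(32)
        self.ciphertext = xor_with_key(block, self.key)
        self.sent = 0

    def read(self, n):
        chunk = self.ciphertext[self.sent:self.sent + n]
        self.sent += len(chunk)
        return chunk

    def release_key(self):
        # The key is withheld until the whole block has been transferred.
        if self.sent < len(self.ciphertext):
            raise PermissionError("download not finished")
        return self.key
```

The client cannot read any transaction until it has pulled the full
block, so picking one transaction out of an old block costs the whole
block's bandwidth.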

--e89a8ff1cdf0c52ed604d9eb3452--