Received-SPF: pass (sog-mx-4.v43.ch3.sourceforge.com: domain of gmail.com
	designates 209.85.223.174 as permitted sender)
	client-ip=209.85.223.174; envelope-from=pieter.wuille@gmail.com;
	helo=mail-ie0-f174.google.com; 
MIME-Version: 1.0
In-Reply-To: <CANEZrP2X9A0kBvN8=+G+dn_uqbSYfNhw7dm4od_yfJqDUoxHWg@mail.gmail.com>
References: <CAPg+sBjSe23eADMxu-1mx0Kg2LGkN+BSNByq0PtZcMxAMh0uTg@mail.gmail.com>
	<CANEZrP3FA-5z3gAC1aYbG2EOKM2eDyv7zX3S9+ia2ZJ0LPkKiA@mail.gmail.com>
	<CAPg+sBjz8SbqU=2YXrXzwzmvz+NUbokD6KbPwZ5QAXSqCdi++g@mail.gmail.com>
	<CANEZrP2X9A0kBvN8=+G+dn_uqbSYfNhw7dm4od_yfJqDUoxHWg@mail.gmail.com>
Date: Fri, 3 May 2013 14:30:19 +0200
Message-ID: <CAPg+sBgz2pLOkc3WL1sG3pJpdVqUZRwEfO9YaC-62vQyWLLW2Q@mail.gmail.com>
From: Pieter Wuille <pieter.wuille@gmail.com>
To: Mike Hearn <mike@plan99.net>
Content-Type: multipart/alternative; boundary=089e0149c57ea48ba304dbcf831f
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] Service bits for pruned nodes
Precedence: list

--089e0149c57ea48ba304dbcf831f
Content-Type: text/plain; charset=ISO-8859-1

(generic comment on the side discussion that spawned off: ideas about how
to allow additional protocols for block exchange are certainly interesting,
and in the long term we should consider them. For now I'd like to keep this
thread about the more immediate way forward: making sure the P2P protocol
does not break in the presence of pruning nodes)

On Sun, Apr 28, 2013 at 6:57 PM, Mike Hearn <mike@plan99.net> wrote:

> That's true. It can perhaps be represented as "I keep the last N
> blocks" and then most likely for any given node the policy doesn't change
> all that fast, so if you know the best chain height you can calculate which
> nodes have what.
>

Yes, I like that better than broadcasting the exact height starting at
which you serve (though I would put that information immediately in the
version announcement). I don't think we can rely on the addr broadcasting
mechanism for fast information exchange anyway. One more problem with this:
DNS seeds cannot convey this information (neither do they currently convey
service bits, but at least those can be indexed separately, and served
explicitly through asking for a specific subdomain or so).
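
To make the "last N blocks" idea concrete, here is a rough sketch in Python
of how a peer could work out what a node serves from its advertised N and
the best chain height. This is only an illustration, not part of any
proposal: the function names and the convention that heights start at 0 and
ranges are inclusive are my own assumptions.

```python
def served_range(best_height: int, keep_last_n: int) -> range:
    """Heights a node serves if it keeps the last `keep_last_n` blocks.

    Assumption: heights start at 0 (genesis) and the node's tip is at
    `best_height`. An advertised N of 0 means "serves no blocks".
    """
    if best_height < 0 or keep_last_n < 0:
        raise ValueError("best_height and keep_last_n must be non-negative")
    if keep_last_n == 0:
        return range(0)
    # A node cannot keep more blocks than exist, so clamp at genesis.
    first = max(0, best_height - keep_last_n + 1)
    return range(first, best_height + 1)


def can_serve(best_height: int, keep_last_n: int, wanted_height: int) -> bool:
    """True if a block at `wanted_height` falls inside the served range."""
    return wanted_height in served_range(best_height, keep_last_n)
```

For example, with a best height of 1000 and N = 100 the node would serve
heights 901 through 1000, so can_serve(1000, 100, 900) is False.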

So to summarize:
* Add a field to addr messages (after a protocol version increase) that
conveys the number of top blocks served?
* Add a field to version message to announce the actual first block served?
* Add service bits to separately enable "relaying/verifying node" and
"serves (part of) the historic chain"? My original reason for suggesting
this was different, but I think better compatibility with DNS seeds may be
a good reason for it. You could first ask the seed for a subset that at
least serves some part of the historic chain, until you hit a node that has
enough, and once caught up, ask for nodes that relay.
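
A minimal sketch of that selection logic, assuming two new service bits.
The names NODE_RELAY and NODE_HISTORIC and their bit positions are
hypothetical placeholders for illustration only; nothing here has been
assigned or agreed on.

```python
from dataclasses import dataclass

# Hypothetical service bits, chosen only for this example.
NODE_RELAY = 1 << 1     # relays/verifies fresh blocks and transactions
NODE_HISTORIC = 1 << 2  # serves (part of) the historic chain


@dataclass
class Peer:
    addr: str
    services: int  # bitfield of advertised service bits


def has_service(peer: Peer, bit: int) -> bool:
    """True if the peer advertises the given service bit."""
    return peer.services & bit != 0


def choose_peers(peers: list[Peer], caught_up: bool) -> list[Peer]:
    """While syncing, prefer peers serving history; afterwards, relayers."""
    wanted = NODE_RELAY if caught_up else NODE_HISTORIC
    return [p for p in peers if has_service(p, wanted)]
```

A DNS seed could apply the same filter on its side and expose each subset
under its own subdomain.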

>> Disconnecting in case something is requested that isn't served seems like
>> an acceptable behaviour, yes. A specific message indicating data is pruned
>> may be more flexible, but more complex to handle too.
>>
>
> Well, old nodes would ignore it and new nodes wouldn't need it?
>

I'm sure there will be cases where a new node connects based on outdated
information. I'm just stating that I agree with the generic policy of "if a
node requests something it should have known the peer doesn't serve, it is
fair to be disconnected."
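
As a sketch of that policy on the serving side (an illustration only; the
Action names are made up, and real block requests are made by hash rather
than by height, so the height-based check is a simplification):

```python
from enum import Enum


class Action(Enum):
    SERVE = "serve"
    IGNORE = "ignore"          # request for a block we never had
    DISCONNECT = "disconnect"  # request for data we announced as pruned


def handle_block_request(requested_height: int,
                         first_served: int,
                         best_height: int) -> Action:
    """Decide what to do with a request, given the range we announced."""
    if requested_height < first_served:
        # The peer should have known from our announcement that this
        # block is pruned, so dropping the connection is fair.
        return Action.DISCONNECT
    if requested_height > best_height:
        # Assumption: a block beyond our tip is simply unknown to us.
        return Action.IGNORE
    return Action.SERVE
```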


>> The reason for splitting them is that I think over time these may be
>> handled by different implementations. You could have stupid
>> storage/bandwidth nodes that just keep the blockchain around, and others
>> that validate it. Even if that doesn't happen implementation-wise, I think
>> these are sufficiently independent functions to start thinking about them
>> as such.
>>
>
> Maybe so, with a "last N blocks" in addr messages though such nodes could
> just set their advertised history to zero and not have to deal with serving
> blocks to nodes.
>
> If you have a node that serves the chain but doesn't validate it, how does
> it know what the best chain is? Just whatever the hardest is?
>

Maybe it validates, maybe it doesn't. What matters is that it doesn't
guarantee relaying fresh blocks and transactions. Maybe it does validate,
maybe it just stores any blocks, and uses a validating node to know what to
announce as best chain, or it uses an SPV mechanism to determine that. Or
it only validates and relays blocks, but not transactions. My point is that
"serving historic data" and "relaying fresh data" are separate
responsibilities, and there's no need to require them to be combined.

-- 
Pieter

--089e0149c57ea48ba304dbcf831f--