From: Pieter Wuille
To: Mike Hearn
Date: Fri, 3 May 2013 14:30:19 +0200
Cc: Bitcoin Dev
Subject: Re: [Bitcoin-development] Service bits for pruned nodes

(generic comment on the discussion that spawned off: ideas about how to allow
additional protocols for block exchange are certainly interesting, and in the
long term we should consider that. For now I'd like to keep this about the
more immediate way forward with making the P2P protocol not break in the
presence of pruning nodes)

On Sun, Apr 28, 2013 at 6:57 PM, Mike Hearn wrote:

> That's true. It can perhaps be represented as "I keep the last N
> blocks" and then most likely for any given node the policy doesn't change
> all that fast, so if you know the best chain height you can calculate which
> nodes have what.

Yes, I like that better than broadcasting the exact height starting at which
you serve (though I would put that information immediately in the version
announcement). I don't think we can rely on the addr broadcasting mechanism
for fast information exchange anyway.
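
The calculation Mike describes can be sketched roughly as below. This is only an illustration of the arithmetic: the function and parameter names are made up here, heights are assumed to start at 0 for the genesis block, and nothing in it is a proposed wire format.

```python
def can_serve(best_height: int, keep_last_n: int, wanted_height: int) -> bool:
    """Return True if a peer that keeps the last `keep_last_n` blocks of a
    chain whose tip is at `best_height` should still have the block at
    `wanted_height`.

    Hypothetical helper: `keep_last_n` stands for the "I keep the last N
    blocks" value a peer would advertise. A value of 0 means the peer
    serves no historic blocks at all.
    """
    if keep_last_n <= 0 or wanted_height < 0:
        return False
    # The peer holds heights (best_height - keep_last_n + 1) .. best_height.
    return best_height - keep_last_n < wanted_height <= best_height


if __name__ == "__main__":
    # A peer keeping the last 288 blocks of a chain with its tip at 1000
    # holds heights 713 through 1000.
    for height in (712, 713, 1000, 1001):
        print(height, can_serve(1000, 288, height))
```

Because N changes slowly while the tip moves, a client only needs a reasonably fresh best chain height to evaluate this for any peer it has heard about.
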
One more problem with this: DNS seeds cannot convey this information (neither
do they currently convey service bits, but at least those can be indexed
separately, and served explicitly through asking for a specific subdomain or
so).

So to summarize:
* Add a field to addr messages (after a protocol number increase) that gives
  the number of top blocks served?
* Add a field to the version message to announce the actual first block
  served?
* Add service bits to separately enable "relaying/verifying node" and
  "serves (part of) the historic chain"? My original reason for suggesting
  this was different, but I think better compatibility with DNS seeds may be
  a good reason for it. You could ask the seed first for a subset that at
  least serves some part of the historic chain, until you hit a node that has
  enough, and once caught up, ask for nodes that relay.

>> Disconnecting in case something is requested that isn't served seems like
>> an acceptable behaviour, yes. A specific message indicating data is pruned
>> may be more flexible, but more complex to handle too.
>
> Well, old nodes would ignore it and new nodes wouldn't need it?

I'm sure there will be cases where a new node connects based on outdated
information. I'm just stating that I agree with the generic policy of "if a
node requests something it should have known the peer doesn't serve, it is
fair to be disconnected."

>> The reason for splitting them is that I think over time these may be
>> handled by different implementations. You could have stupid
>> storage/bandwidth nodes that just keep the blockchain around, and others
>> that validate it. Even if that doesn't happen implementation-wise, I think
>> these are sufficiently independent functions to start thinking about them
>> as such.
>
> Maybe so, with a "last N blocks" in addr messages though such nodes could
> just set their advertised history to zero and not have to deal with serving
> blocks to nodes.
>
> If you have a node that serves the chain but doesn't validate it, how does
> it know what the best chain is? Just whatever the hardest is?

Maybe it validates, maybe it doesn't. What matters is that it doesn't
guarantee relaying fresh blocks and transactions. It may validate fully, or
it may just store any blocks and either use a validating node to know what to
announce as best chain, or use an SPV mechanism to determine that. Or it only
validates and relays blocks, but not transactions. My point is that "serving
historic data" and "relaying fresh data" are separate responsibilities, and
there's no need to require them to be combined.

-- 
Pieter
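
The two separate service bits from the summary, and the two-phase peer selection they would allow, could look roughly like the sketch below. Everything named here is an assumption for illustration: the bit positions, the `Service` and `Peer` types and the `pick_peers` helper are hypothetical and are not part of any existing protocol definition.

```python
from dataclasses import dataclass
from enum import IntFlag
from typing import List


class Service(IntFlag):
    # Hypothetical bit assignments, chosen only for this example.
    RELAY = 1 << 0     # "relaying/verifying node"
    HISTORIC = 1 << 1  # "serves (part of) the historic chain"


@dataclass(frozen=True)
class Peer:
    name: str
    services: Service
    keep_last_n: int = 0  # hypothetical "number of top blocks served" field


def pick_peers(peers: List[Peer], caught_up: bool) -> List[Peer]:
    """While syncing, prefer peers that serve historic blocks; once caught
    up, prefer peers that relay fresh blocks and transactions."""
    wanted = Service.RELAY if caught_up else Service.HISTORIC
    return [p for p in peers if wanted in p.services]


if __name__ == "__main__":
    peers = [
        Peer("relay-only", Service.RELAY),
        Peer("archive-only", Service.HISTORIC, keep_last_n=288),
        Peer("both", Service.RELAY | Service.HISTORIC, keep_last_n=100000),
    ]
    print([p.name for p in pick_peers(peers, caught_up=False)])
    print([p.name for p in pick_peers(peers, caught_up=True)])
```

Keeping the roles as independent flags means a storage-only node and a relay-only node can each advertise exactly what they do, and a DNS seed could index peers by either flag.
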