Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of gmail.com
	designates 209.85.220.41 as permitted sender)
	client-ip=209.85.220.41; envelope-from=gappleto97@gmail.com;
	helo=mail-pa0-f41.google.com; 
MIME-Version: 1.0
In-Reply-To: <CAE-z3OWR72Og78RLuXEPjzRR8gCEjAuFk2nq-JzDtt_2pKSmHQ@mail.gmail.com>
References: <CANJO25J1WRHtfQLVXUB2s_sjj39pTPWmixAcXNJ3t-5os8RPmQ@mail.gmail.com>
	<CANJO25JTtfmfsOQYOzJeksJn3CoKE3W8iLGsRko-_xd4XhB3ZA@mail.gmail.com>
	<CAJHLa0O5OxaX5g3u=dnCY6Lz_gK3QZgQEPNcWNVRD4JziwAmvg@mail.gmail.com>
	<20150512171640.GA32606@savin.petertodd.org>
	<CAE-z3OV3VdSoiTSfASwYHr1CjZSqio303sqGq_1Y9yaYgov2sw@mail.gmail.com>
	<CAAS2fgRzGkcJbWbJmFN2-NSJGUcLdPKp0q7FjM0x7WDvHoRq=g@mail.gmail.com>
	<CAE-z3OWR72Og78RLuXEPjzRR8gCEjAuFk2nq-JzDtt_2pKSmHQ@mail.gmail.com>
Date: Tue, 12 May 2015 18:09:44 -0400
Message-ID: <CANJO25Ls2Hbv=GYPBq85M-3=Jna5x4=BYE67km7SbaJg3dgFdA@mail.gmail.com>
From: gabe appleton <gappleto97@gmail.com>
To: Tier Nolan <tier.nolan@gmail.com>
Content-Type: multipart/alternative; boundary=047d7b6da9f07bed800515e9c085
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] Proposed additional options for pruned
	nodes
Precedence: list

--047d7b6da9f07bed800515e9c085
Content-Type: text/plain; charset=UTF-8

This is exactly the sort of solution I was hoping for. It seems this is the
minimal modification to make it work, and, if someone was willing to work
with me, I would love to help implement this.

My only concern is that if the --max-size flag is not included, then this
delivers significantly less benefit to the end user. Still a good chunk,
but possibly not enough.
On May 12, 2015 6:03 PM, "Tier Nolan" <tier.nolan@gmail.com> wrote:

>
>
> On Tue, May 12, 2015 at 8:03 PM, Gregory Maxwell <gmaxwell@gmail.com>
> wrote:
>
>>
>> (0) Block coverage should have locality; historical blocks are
>> (almost) always needed in contiguous ranges.   Having random peers
>> with totally random blocks would be horrific for performance; as you'd
>> have to hunt down a working peer and make a connection for each block
>> with high probability.
>>
>> (1) Block storage on nodes with a fraction of the history should not
>> depend on believing random peers; because listening to peers can
>> easily create attacks (e.g. someone could break the network; by
>> convincing nodes to become unbalanced) and not useful-- it's not like
>> the blockchain is substantially different for anyone; if you're to the
>> point of needing to know coverage to fill then something is wrong.
>> Gaps would be handled by archive nodes, so there is no reason to
>> increase vulnerability by doing anything but behaving uniformly.
>>
>> (2) The decision to contact a node should need O(1) communications,
>> not just because of the delay of chasing around just to find who has
>> someone; but because that chasing process usually makes the process
>> _highly_ sybil vulnerable.
>>
>> (3) The expression of what blocks a node has should be compact (e.g.
>> not a dense list of blocks) so it can be rumored efficiently.
>>
>> (4) Figuring out what block (ranges) a peer has given should be
>> computationally efficient.
>>
>> (5) The communication about what blocks a node has should be compact.
>>
>> (6) The coverage created by the network should be uniform, and should
>> remain uniform as the blockchain grows; ideally you shouldn't need
>> to update your state to know what blocks a peer will store in the
>> future, assuming that it doesn't change the amount of data it's
>> planning to use. (What Tier Nolan proposes sounds like it fails this
>> point)
>>
>> (7) Growth of the blockchain shouldn't cause much (or any) need to
>> refetch old blocks.
>>
>
> M = 1,000,000
> N = number of "starts"
>
> S(0) = hash(seed) mod M
> ...
> S(n) = hash(S(n-1)) mod M
>
> This generates a sequence of start points.  If the start point is less
> than the block height, then it counts as a hit.
>
> The node stores the 50MB of data starting at the block at height S(n).
>
> As the blockchain increases in size, new starts will be less than the
> block height.  This means some other runs would be deleted.
>
> A weakness is that it is random with regard to block heights.  Tiny
> blocks have the same priority as larger blocks.
>
> 0) Blocks are local, in 50MB runs
> 1) Agreed, nodes should download headers-first (or some other compact way
> of finding the highest POW chain)
> 2) M could be fixed, N and the seed are all that is required.  The seed
> doesn't have to be that large.  If 1% of the blockchain is stored, then 16
> bits should be sufficient so that every block is covered by seeds.
> 3) N is likely to be less than 2 bytes and the seed can be 2 bytes
> 4) A 1% cover of 50GB of blockchain would have 10 starts @ 50MB per run.
> That is 10 hashes.  They don't even necessarily need to be cryptographic hashes
> 5) Isn't this the same as 3?
> 6) Every block has the same odds of being included.  There inherently
> needs to be an update when a node deletes some info due to exceeding its
> cap.  N can be dropped one run at a time.
> 7) When new starts drop below the tip height, N can be decremented and
> that one run is deleted.
>
> There would need to be a special rule to ensure the low height blocks are
> covered.  Nodes should keep the first 50MB of blocks with some probability
> (10%?)
>
>
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>
>
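
For what it's worth, here is a minimal Python sketch of the start-point
scheme quoted above. It is only an illustration under several assumptions
that are not in the proposal: SHA-256 as the hash (the proposal leaves the
hash open and notes it need not be cryptographic), an 8-byte big-endian
encoding of the value being hashed, N counting sequence entries rather than
hits, and all function names, which are made up here.

```python
import hashlib

M = 1_000_000            # fixed modulus from the proposal
RUN_BYTES = 50 * 10**6   # 50MB stored per run


def _hash_int(value):
    # Assumption: SHA-256 over an 8-byte big-endian encoding of the value.
    digest = hashlib.sha256(value.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def start_points(seed, n, m=M):
    """S(0) = hash(seed) mod m, S(k) = hash(S(k-1)) mod m, for k < n."""
    starts = []
    value = seed
    for _ in range(n):
        value = _hash_int(value) % m
        starts.append(value)
    return starts


def hits(seed, n, tip_height, m=M):
    """Start points below the current tip height count as hits."""
    return [s for s in start_points(seed, n, m) if s < tip_height]


def runs_for_cover(chain_bytes, percent, run_bytes=RUN_BYTES):
    """Number of fixed-size runs needed to store `percent` of the chain."""
    return chain_bytes * percent // 100 // run_bytes


if __name__ == "__main__":
    seed, n = 0x1234, 30  # illustrative values only
    for tip in (350_000, 400_000):
        print(tip, sorted(hits(seed, n, tip)))
    print(runs_for_cover(50 * 10**9, 1))  # 1% of 50GB in 50MB runs
```

Any peer that knows the seed and N can recompute the same start heights
locally. Every hit at a lower tip height is still a hit at a higher one, so
growth of the chain can only add runs, which is why N has to shrink for a
node to stay under its cap.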

--047d7b6da9f07bed800515e9c085--