From: Igor Cota
Date: Sat, 27 Feb 2021 08:10:39 +0100
To: Keagan McClelland, Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>, chike@sfc.wide.ad.jp, Artur Brugeman
Subject: Re: [bitcoin-dev] A design for Probabilistic Partial Pruning

Hi Keagan,

I had a very similar idea. The only difference is that the node would decide on a range of blocks to keep beforehand, rather than making the decision block-by-block as you suggest.
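For illustration, the range-based variant could look something like the sketch below. Everything here is hypothetical — the function name, the use of SHA-256, and the rule for deriving the range's start from the seed are assumptions for the example, not part of any proposal:

```python
import hashlib

def keep_range(node_seed: bytes, chain_height: int, threshold_percent: int) -> range:
    """Pick a contiguous range of block heights to keep, derived from the node's seed.

    The range is a deterministic function of the seed, so a peer that knows
    the seed and threshold can recompute exactly which blocks this node stores.
    """
    keep_count = chain_height * threshold_percent // 100
    if keep_count >= chain_height:
        return range(0, chain_height)
    # Hash the seed and map it to a valid starting height for the range.
    draw = int.from_bytes(hashlib.sha256(node_seed).digest()[:8], "big")
    start = draw % (chain_height - keep_count + 1)
    return range(start, start + keep_count)
```

A node keeping 10% of a 700,000-block chain would thus serve one contiguous 70,000-block window, which suits the sequential reads of IBD.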
I felt the other nodes would be better served by ranges due to the sequential nature of IBD. Perhaps this would be computationally lighter as well.

I also encourage you to read Ryosuke Abe's paper [1], which proposes a DHT scheme to solve this same problem.

Cheers,
Igor

[1] https://arxiv.org/abs/1902.02174

On Fri, 26 Feb 2021 at 21:57, Keagan McClelland via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi all,
>
> I've been thinking for quite some time about the problem of pruned nodes
> and ongoing storage costs for full nodes. One of the things that strikes
> me as odd is that we only really have two settings:
>
> A. Prune everything except the most recent blocks, down to the cache size.
> B. Keep everything since genesis.
>
> From my observations and conversations with various folks in the
> community, they would like to be able to run a "partially" pruned node to
> help bear the load of bootstrapping other nodes and to help with data
> redundancy in the network, but would prefer not to dedicate hundreds of
> gigabytes of storage space to the cause.
>
> This led me to the idea that a node could randomly prune some of the
> blocks from history if they passed some predicate. A rough sketch would
> look as follows:
>
> 1. At startup, the node would generate a random seed. This would be
> unique to the node, but it need not be cryptographically secure.
> 2. The node configuration would also carry a "threshold", expressed as
> the percentage of blocks it wants to keep.
> 3. As IBD occurs, based on the threshold, the block hash, and the node's
> unique seed, the node would decide either to prune the data or to keep it.
> The uniqueness of the node's seed should ensure that no block is
> systematically overrepresented in the set of nodes choosing this storage
> scheme.
> 4. Once the node's IBD is complete, it would advertise this as a peer
> service, advertising its seed and threshold, so that nodes could
> deterministically deduce which of its peers had which blocks.
>
> The goals are to increase data redundancy in a way that shares the load
> more uniformly across nodes, alleviating some of the pressure that the IBD
> problem puts on full archive nodes. I am working on a draft BIP for this
> proposal but figured I would submit it as a high-level idea in case anyone
> had feedback on the initial design before I go into specification-level
> detail.
>
> If you have thoughts on
>
> A. the protocol design itself, or
> B. the barriers to putting this kind of functionality into Core,
>
> I would love to hear from you.
>
> Cheers,
> Keagan
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

--
Igor Cota
Codex Apertus d.o.o.
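The block-by-block predicate in steps 1–3, and the peer-side deduction in step 4, could be sketched as follows. This is only an illustration under assumed details — the function names and the particular hash construction (SHA-256 over seed plus block hash, mapped to a percentage) are not specified by the proposal:

```python
import hashlib

def keep_block(node_seed: bytes, block_hash: bytes, threshold_percent: int) -> bool:
    """Deterministically decide whether this node keeps a given block.

    Hashing the node's seed together with the block hash gives each node an
    independent pseudo-random selection, so no block is systematically over-
    or under-represented across the population of nodes.
    """
    digest = hashlib.sha256(node_seed + block_hash).digest()
    # Interpret the first 8 bytes as an integer and map it to [0, 100).
    draw = int.from_bytes(digest[:8], "big") % 100
    return draw < threshold_percent

def peers_with_block(block_hash: bytes, peers: list[tuple[bytes, int]]) -> list[int]:
    """Given each peer's advertised (seed, threshold), deduce which peers
    should hold a particular block by rerunning the same predicate."""
    return [i for i, (seed, threshold) in enumerate(peers)
            if keep_block(seed, block_hash, threshold)]
```

Because the predicate depends only on public inputs, any node that has seen a peer's advertisement can compute the peer's block set without further round trips.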