Return-Path: Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by lists.linuxfoundation.org (Postfix) with ESMTP id E4DDBC0001 for ; Sat, 27 Feb 2021 22:40:08 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id C60984F01D for ; Sat, 27 Feb 2021 22:40:08 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org X-Spam-Flag: NO X-Spam-Score: -0.201 X-Spam-Level: X-Spam-Status: No, score=-0.201 tagged_above=-999 required=5 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001] autolearn=ham autolearn_force=no Authentication-Results: smtp4.osuosl.org (amavisd-new); dkim=pass (1024-bit key) header.d=woobling.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cbBPww_G6dC6 for ; Sat, 27 Feb 2021 22:40:07 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.8.0 Received: from mail-oi1-f181.google.com (mail-oi1-f181.google.com [209.85.167.181]) by smtp4.osuosl.org (Postfix) with ESMTPS id B2B004F003 for ; Sat, 27 Feb 2021 22:40:07 +0000 (UTC) Received: by mail-oi1-f181.google.com with SMTP id z126so13971578oiz.6 for ; Sat, 27 Feb 2021 14:40:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=woobling.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :content-transfer-encoding; bh=alpCL44/B/BBCvreZR+/uED8ZI0B297IO5ufYttyk4Y=; b=Hi6SwrdS5ZjOTLdeQZrK/z4HguRyjx7IA+kClXPALpipt0XoOyH3QJfhclF1MNBPGj Dyy0eFtb8BgEcysCA3DjWKzA2X5sUlX3uIZjPSQPAohseuBS8NbQXUdByuvblLSGxFkD hGBOynxfNSvolg2fklBRdo31Mr97OGI5v9Z4M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:content-transfer-encoding; bh=alpCL44/B/BBCvreZR+/uED8ZI0B297IO5ufYttyk4Y=; b=KpVpCPhAu5w/dc4ME3m+/8xmKfovI0mVqQ7APUpZaWIoIoY4im3bq/+iR5b10yXXB3 Ghd8Pd3NUWEkqZBocwmxjrTgC8i9vKv49kvcJISHyKfZ5cC9gPY/WarqoihoG0sGgX5x 48pku4tFySZm3fKAol7BMEGL40+Kha6yOqa35z1joXJ8D26CGbru8+6agcKKvwS0bsgA kqsGQzzImHgr4/mQrMcQVHoB62zr23CHzowLbS47NbX5l5JbuZ+ZR7ovjnw2sovmXVzw ksVI4eHJ6UopAfI0EgIAgTTYozKRhHAU/omWaJztiYtJrPNSVcLRUvmrIP6OeOpf3crJ bPWA== X-Gm-Message-State: AOAM531SznVjdDqrRI1dDoFsyaWgPIi313p7C0OEPR+9tzZqQR9NGz9S 1UOaYanoouiKe46nM6KuRHmbRrIkeRj6D/EVtYiGSPeQ3G27kO70 X-Google-Smtp-Source: ABdhPJzIPwHUnf0tUOpg3zZQob2gXYQ/O7QsbvamhHDx1Pn7voZ+SaEkFU36R60WKikv0cXlKnmub5jF+tzC1x+GO4o= X-Received: by 2002:a17:90a:517:: with SMTP id h23mr9961085pjh.108.1614463800841; Sat, 27 Feb 2021 14:10:00 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Yuval Kogman Date: Sat, 27 Feb 2021 22:09:48 +0000 Message-ID: To: Keagan McClelland , Bitcoin Protocol Discussion Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Mailman-Approved-At: Sat, 27 Feb 2021 22:49:03 +0000 Subject: Re: [bitcoin-dev] A design for Probabilistic Partial Pruning X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Feb 2021 22:40:09 -0000 On Fri, 26 Feb 2021 at 20:57, Keagan McClelland via bitcoin-dev wrote: > The goals are to increase data redundancy in a way that more uniformly sh= ares the load across nodes, alleviating some of the pressure of full archiv= e nodes on the IBD problem. I am working on a draft BIP for this proposal b= ut figured I would submit it as a high level idea in case anyone had any fe= edback on the initial design before I go into specification levels of detai= l. You might be interested in an approach (henceforth "SeF") by Swanand Kadhe, Jichan Chung and Kannan Ramchandran which employs fountain codes, presented at Scaling Bitcoin 2019: https://arxiv.org/abs/1906.12140 From a cursory search it appears there is quite a bit of related/followup work as well. The simplest fountain code, the Luby Transform (applied in this work) the encoder divides a message into smaller chunks, and then constructs an infinite stream of codewords which are XORs of d randomly selected chunks where d is sampled from the robust soliton distribution. The simplest decoder simply XORs new k=3D1 codewords from the relevant k>1 codewords, and the full message can be recovered with overwhelming probability and in linear time with a sufficiently large random sample of codewords from the encoded stream. Note that the decoder must know which chunks went into a codeword, but this is usually addressed using pseudorandomness, which has other motivations in an adversarial setting. In SeF, the general idea is that "droplet nodes" are pruning nodes which retain some number (equivalent to your threshold parameter) of codewords from the encoding concatenated blocks (to obtain a fixed message size), and serve these to compatible clients. This is more robust than retaining a random sample of blocks, and also performs well according to their simulations. Even if this paper is not directly applicable to your efforts, the theory of fountain codes should be of interest (many variants exist), and there is work on fountain codes. There is also some work on fountain codes in an adversarial setting (Falcon codes) but I'm only vaguely familiar with it, and if i'm not mistaken most of the considerations are either already implicitly addressed by the Bitcoin protocol or explicitly addressed in the SeF paper, whose results also take into account malicious droplet nodes.