From: Gregory Maxwell
To: Tier Nolan
Date: Tue, 12 May 2015 19:03:55 +0000
Cc: Bitcoin Dev
Subject: Re: [Bitcoin-development] Proposed additional options for pruned nodes

It's a little frustrating to see this just repeated without even paying attention to the desirable characteristics from the prior discussions.

Summarizing from memory:

(0) Block coverage should have locality; historical blocks are (almost) always needed in contiguous ranges. Having random peers with totally random blocks would be horrific for performance, as you'd have to hunt down a working peer and make a connection for each block with high probability.

(1) Block storage on nodes with a fraction of the history should not depend on believing random peers, because listening to peers can easily create attacks (e.g. someone could break the network by convincing nodes to become unbalanced) and is not useful: it's not like the blockchain is substantially different for anyone; if you're to the point of needing to know coverage to fill then something is wrong. Gaps would be handled by archive nodes, so there is no reason to increase vulnerability by doing anything but behaving uniformly.

(2) The decision to contact a node should need O(1) communications, not just because of the delay of chasing around just to find who has what, but because that chasing process usually makes the process _highly_ sybil vulnerable.

(3) The expression of what blocks a node has should be compact (e.g. not a dense list of blocks) so it can be rumored efficiently.

(4) Figuring out what block (ranges) a peer has, given what it communicated, should be computationally efficient.

(5) The communication about what blocks a node has should be compact.

(6) The coverage created by the network should be uniform, and should remain uniform as the blockchain grows; ideally you shouldn't need to update your state to know what blocks a peer will store in the future, assuming that it doesn't change the amount of data it's planning to use. (What Tier Nolan proposes sounds like it fails this point.)

(7) Growth of the blockchain shouldn't cause much (or any) need to refetch old blocks.

I've previously proposed schemes which come close but fail one of the above. (E.g. a scheme based on reservoir sampling that gives uniform selection of contiguous ranges, communicating only 64 bits of data to know what blocks a node claims to have, remaining totally uniform as the chain grows, without any need to refetch, but which needs O(height) work to figure out what blocks a peer has from the data it communicated; or another scheme based on consistent hashes that has log(height) computation, but sometimes may result in a node needing to go refetch an old block range it previously didn't store, creating re-balancing traffic.)

So far something that meets all those criteria (and/or whatever ones I'm not remembering) has not been discovered; but I don't really think much time has been spent on it. I think it's very likely possible.
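
To make a few of the criteria above concrete, here is a toy sketch in Python. It is not either of the schemes mentioned above, and it does not claim to satisfy the whole list. It assumes a hypothetical design in which the chain is cut into fixed-size contiguous ranges and a node advertises only a 64-bit seed plus a 16-bit threshold; any peer can then compute locally, without asking anyone, which ranges that node claims to keep. The names RANGE_SIZE, stores_range, ranges_held and has_block are invented for illustration. Note the limitation: the node keeps a fixed fraction of the chain rather than a fixed amount of data, so a node with a fixed storage budget would have to lower its threshold as the chain grows and re-announce it, which runs against (6).

```python
import hashlib

# Hypothetical parameter: number of blocks in each contiguous range.
RANGE_SIZE = 1000


def stores_range(seed: int, keep_per_65536: int, range_index: int) -> bool:
    """Return True if a node with this 64-bit seed keeps the given range.

    keep_per_65536 is the advertised threshold: roughly that many out of
    every 65536 ranges are kept (0 keeps nothing, 65536 keeps everything).
    """
    data = seed.to_bytes(8, "big") + range_index.to_bytes(8, "big")
    digest = hashlib.sha256(data).digest()
    bucket = int.from_bytes(digest[:2], "big")  # value in 0..65535
    return bucket < keep_per_65536


def has_block(seed: int, keep_per_65536: int, block_height: int) -> bool:
    """O(1) check: does the node claim to have the block at this height?"""
    return stores_range(seed, keep_per_65536, block_height // RANGE_SIZE)


def ranges_held(seed: int, keep_per_65536: int, height: int):
    """List (first_block, last_block) for each range the node claims,
    for a chain whose tip is at `height`. Enumerating all of them costs
    O(height / RANGE_SIZE); a single lookup via has_block is O(1)."""
    held = []
    for idx in range(height // RANGE_SIZE + 1):
        if stores_range(seed, keep_per_65536, idx):
            first = idx * RANGE_SIZE
            last = min(first + RANGE_SIZE - 1, height)
            held.append((first, last))
    return held


if __name__ == "__main__":
    seed = 0x1234567890ABCDEF      # the 64 bits a node would advertise
    threshold = 8192               # keep roughly 1/8 of all ranges
    for first, last in ranges_held(seed, threshold, 99_999):
        print(f"blocks {first}..{last}")
```

Because the decision for each range depends only on the seed and the range index, the set of old ranges a node holds never changes as the chain grows, so in this toy there is nothing to refetch and the advertisement stays the same size.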