Received: from sog-mx-4.v43.ch3.sourceforge.com ([172.29.43.194] helo=mx.sourceforge.net) by sfs-ml-1.v29.ch3.sourceforge.com with esmtp (Exim 4.76) (envelope-from ) id 1TCEd4-0004Qz-IT for bitcoin-development@lists.sourceforge.net; Thu, 13 Sep 2012 18:59:10 +0000 X-ACL-Warn: Received: from vps7135.xlshosting.net ([178.18.90.41]) by sog-mx-4.v43.ch3.sourceforge.com with esmtp (Exim 4.76) id 1TCEd3-0002Lp-EM for bitcoin-development@lists.sourceforge.net; Thu, 13 Sep 2012 18:59:10 +0000 Received: by vps7135.xlshosting.net (Postfix, from userid 1000) id 0F4DD4A44C1; Thu, 13 Sep 2012 20:59:02 +0200 (CEST) Date: Thu, 13 Sep 2012 20:59:02 +0200 From: Pieter Wuille To: Matthew Mitchell Message-ID: <20120913185900.GA393@vps7135.xlshosting.net> References: <2104FC7F-0AE0-4C55-9987-B20EFCE19FC3@godofgod.co.uk> <19ED4257-0BCA-41C5-A533-B0AB9B500181@godofgod.co.uk> <2B95CF41-4304-4D2A-9ABF-198D97B7449B@godofgod.co.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-PGP-Key: http://sipa.ulyssis.org/pubkey.asc User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.7 (/) X-Spam-Report: Spam Filtering performed by mx.sourceforge.net. See http://spamassassin.org/tag/ for more details. 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (pieter.wuille[at]gmail.com) 0.0 DKIM_ADSP_CUSTOM_MED No valid author signature, adsp_override is CUSTOM_MED -0.5 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain 1.2 NML_ADSP_CUSTOM_MED ADSP custom_med hit, and not from a mailing list X-Headers-End: 1TCEd3-0002Lp-EM Cc: "bitcoin-development@lists.sourceforge.net" Subject: Re: [Bitcoin-development] Segmented Block Relaying BIP draft. X-BeenThere: bitcoin-development@lists.sourceforge.net X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Sep 2012 18:59:10 -0000 On Thu, Sep 13, 2012 at 06:49:49PM +0100, Matthew Mitchell wrote: > A merkle tree root is found by hashing the two children together and those children are found the same way until you get to the greatest level down the tree. This means you can validate children as being correct as long as they hash together to form the root. This means you do not need all the transaction hashes to validate segments of the block, you only need the root hashes for all the segments you want. Let's say there are 8 transactions and you want to verify 4 segments (2 transactions each), The merkle tree looks like this (Might not work depending on the font): > I hope that made sense. I'm quite sure Gregory thoroughly understands how Merkle trees work and why they are useful. His question was about the use case. Let me try to answer his question, by making some assumptions about your intentions. Correct me if I'm wrong - I haven't read all details. You want to parallellize block downloads, while at the same time preventing re-download of transactions that are already known. To do so, a requesting node would first request (for example) the 8 level-3 hashes, then start 8 parallel threads to download the transactions from presumably 8 different peers. Each thread then fetches the transaction id's that correspond to its own 1/8th of the block, and requests the transactions whose txid is not yet known. Comparing this with Gregory's own suggestion (just fetch the entire txid list at first, and then (again as parallellized as needed) fetch the unknown transactions from several peers), this does indeed have an advantage: you need to download (relatively) far less data before the threaded part can start. If this is what you propose, it is certainly meaningful, but the gains aren't very large, in my opinion. However, my impression while reading your proposal was at times that you intend to process the different layers of the Merkle tree iteratively after starting the initial parallel segments. I don't think that is useful, as you'll need the actual txids anyway before deciding whether they need to be downloaded at all, it adds several round-trips, and once you have downloaded the intermediate merkle hashes for about 8 levels, you've already transferred more data than the transactions themselves. I think Gregory also assumed something like this, making him question why it's useful. Anyway, this whole paragraph is one assumption, so if it's not the case, ignore me. Can you clarify what you mean? Preferably by giving a concrete sequence of exchanged messages, with a given purpose? PS: the reason Satoshi used a Merkle tree for the transactions in a block was to allow a short piece of data (the hashes along a path in the tree) to prove a transaction belongs to it - I doubt he intended it for parallellizing downloads (though it certainly opens some opportunities here). -- Pieter