Received: from sog-mx-4.v43.ch3.sourceforge.com ([172.29.43.194] helo=mx.sourceforge.net) by sfs-ml-3.v29.ch3.sourceforge.com with esmtp (Exim 4.76) (envelope-from ) id 1TVoc6-0006l9-2z for bitcoin-development@lists.sourceforge.net; Tue, 06 Nov 2012 19:15:06 +0000 X-ACL-Warn: Received: from vps7135.xlshosting.net ([178.18.90.41]) by sog-mx-4.v43.ch3.sourceforge.com with esmtp (Exim 4.76) id 1TVoc4-00014o-VG for bitcoin-development@lists.sourceforge.net; Tue, 06 Nov 2012 19:15:06 +0000 Received: by vps7135.xlshosting.net (Postfix, from userid 1000) id 72CAE49401F; Tue, 6 Nov 2012 20:14:58 +0100 (CET) Date: Tue, 6 Nov 2012 20:14:58 +0100 From: Pieter Wuille To: Mike Hearn Message-ID: <20121106191455.GA23467@vps7135.xlshosting.net> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-PGP-Key: http://sipa.ulyssis.org/pubkey.asc User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.8 (/) X-Spam-Report: Spam Filtering performed by mx.sourceforge.net. See http://spamassassin.org/tag/ for more details. 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (pieter.wuille[at]gmail.com) 0.0 DKIM_ADSP_CUSTOM_MED No valid author signature, adsp_override is CUSTOM_MED -0.4 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain 1.2 NML_ADSP_CUSTOM_MED ADSP custom_med hit, and not from a mailing list X-Headers-End: 1TVoc4-00014o-VG Cc: Bitcoin Dev Subject: Re: [Bitcoin-development] Draft BIP for Bloom filtering X-BeenThere: bitcoin-development@lists.sourceforge.net X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Nov 2012 19:15:06 -0000 On Fri, Oct 26, 2012 at 04:01:58PM +0200, Mike Hearn wrote: > I don't feel I understand the effort required to do some kind of > partial tree encoding. Having a kind of custom compression whereby > branches are represented as varint indexes into a dictionary, I can > feel how much work that involves and maybe I can make time over the > next few weeks to implement it. Has anyone got example code for > representing partial Merkle trees? I've implemented code for efficient representation of partial merkle trees: see https://github.com/sipa/bitcoin/commits/partialmerkle The encoding/decoding algorithm uses a depth-first traversal of the tree, at each node outputting whether anything relevant is beneath, and if not, storing its hash and not descending into the children. Furthermore, thanks to some properties of the tree, some hard upper bounds on the size of the serialized structures are guaranteed. See the comments in the code for details. Unit tests are included to verify that 1) encoding and decoding a subset of transactions is an identity 2) the calculated merkle root matches the merkle root calculated by the existing merkle algorithm in the source code 3) the calculated merkle root actually depends on all hashes in the data structure. 4) serialization/deserialization is an identity 5) bounds on the size of the serialization are respected Hope it is clear enough to port to other languages. -- Pieter