summaryrefslogtreecommitdiff
path: root/47/c45a96fc388e0edccd3cd6365e459302adb693
blob: f6b8b3afe7bd3fb0193987762b117c5935dd67fd (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
Received: from sog-mx-4.v43.ch3.sourceforge.com ([172.29.43.194]
	helo=mx.sourceforge.net)
	by sfs-ml-3.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <pw@vps7135.xlshosting.net>) id 1TVoc6-0006l9-2z
	for bitcoin-development@lists.sourceforge.net;
	Tue, 06 Nov 2012 19:15:06 +0000
X-ACL-Warn: 
Received: from vps7135.xlshosting.net ([178.18.90.41])
	by sog-mx-4.v43.ch3.sourceforge.com with esmtp (Exim 4.76)
	id 1TVoc4-00014o-VG for bitcoin-development@lists.sourceforge.net;
	Tue, 06 Nov 2012 19:15:06 +0000
Received: by vps7135.xlshosting.net (Postfix, from userid 1000)
	id 72CAE49401F; Tue,  6 Nov 2012 20:14:58 +0100 (CET)
Date: Tue, 6 Nov 2012 20:14:58 +0100
From: Pieter Wuille <pieter.wuille@gmail.com>
To: Mike Hearn <mike@plan99.net>
Message-ID: <20121106191455.GA23467@vps7135.xlshosting.net>
References: <CANEZrP0XALwBFJyZTzYd5xBp4MRrjv0s_y2tOXbO7UgjWF2HzA@mail.gmail.com>
	<CAAS2fgScydOWz_eqnhWxQNVUOtzvSBwkj7tttP3_DLdW+=3CTQ@mail.gmail.com>
	<CANEZrP2sBZL=UYAxtjU2Su13Z12wB7s04LxmcyUR2hH51tcN9g@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CANEZrP2sBZL=UYAxtjU2Su13Z12wB7s04LxmcyUR2hH51tcN9g@mail.gmail.com>
X-PGP-Key: http://sipa.ulyssis.org/pubkey.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: 0.8 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	(pieter.wuille[at]gmail.com)
	0.0 DKIM_ADSP_CUSTOM_MED   No valid author signature, adsp_override is
	CUSTOM_MED
	-0.4 RP_MATCHES_RCVD Envelope sender domain matches handover relay
	domain 1.2 NML_ADSP_CUSTOM_MED    ADSP custom_med hit,
	and not from a mailing list
X-Headers-End: 1TVoc4-00014o-VG
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] Draft BIP for Bloom filtering
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Tue, 06 Nov 2012 19:15:06 -0000

On Fri, Oct 26, 2012 at 04:01:58PM +0200, Mike Hearn wrote:
> I don't feel I understand the effort required to do some kind of
> partial tree encoding. Having a kind of custom compression whereby
> branches are represented as varint indexes into a dictionary, I can
> feel how much work that involves and maybe I can make time over the
> next few weeks to implement it. Has anyone got example code for
> representing partial Merkle trees?

I've implemented code for efficient representation of partial merkle
trees: see https://github.com/sipa/bitcoin/commits/partialmerkle

The encoding/decoding algorithm uses a depth-first traversal of the tree, at
each node outputting whether anything relevant is beneath, and if not, storing
its hash and not descending into the children.

Furthermore, thanks to some properties of the tree, some hard upper bounds on
the size of the serialized structures are guaranteed. See the comments in the
code for details.

Unit tests are included to verify that
1) encoding and decoding a subset of transactions is an identity
2) the calculated merkle root matches the merkle root calculated by the existing merkle algorithm in the source code
3) the calculated merkle root actually depends on all hashes in the data structure.
4) serialization/deserialization is an identity
5) bounds on the size of the serialization are respected

Hope it is clear enough to port to other languages.

-- 
Pieter