Received: from sog-mx-3.v43.ch3.sourceforge.com ([172.29.43.193]
	helo=mx.sourceforge.net)
	by sfs-ml-4.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <mark@monetize.io>) id 1Sh30N-00083v-E0
	for bitcoin-development@lists.sourceforge.net;
	Tue, 19 Jun 2012 18:18:19 +0000
Received: from mail-ob0-f175.google.com ([209.85.214.175])
	by sog-mx-3.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128)
	(Exim 4.76) id 1Sh30H-0007H5-Qc
	for bitcoin-development@lists.sourceforge.net;
	Tue, 19 Jun 2012 18:18:19 +0000
Received: by obcva7 with SMTP id va7so2688603obc.34
	for <bitcoin-development@lists.sourceforge.net>;
	Tue, 19 Jun 2012 11:18:08 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=mime-version:x-originating-ip:in-reply-to:references:date
	:message-id:subject:from:to:cc:content-type:x-gm-message-state;
	bh=E5CB8fe3eIQaYaes+JjbO/w7oq6d0jbjWsyR8mj8iAc=;
	b=dAryCcMij9xrg2htixxPtBlN0kU+T/iQJxMkkT7F33BZbaFOYp3TVgyMkbZem7bDsZ
	A8muns1A6MYm1jyHnVknMC/xzmbP1R/XgZYxBH+D++lWCqbvsWPGoElF53O/7z6ebrx4
	HwJex9k+8+4AE/Me2kYJjJ7Bb4DjZ4fn2UYhHF4vp/E4E3n0cTfBxyqDamsy8Gdei7Bi
	rxarnxK59vmiaHUUBBzh3KnBGm5YOi2btvoJiySBqqW4W+3HdG6lcxxd1fefQ40oJC2X
	yXOg7acb1unYZmnMcMZnFe+ryq08nYjrUuMPXVFeccPq4Zebgm0EYXoRIGgq/ChghcUT
	AG1g==
MIME-Version: 1.0
Received: by 10.182.131.2 with SMTP id oi2mr20409952obb.43.1340129887943; Tue,
	19 Jun 2012 11:18:07 -0700 (PDT)
Received: by 10.76.101.15 with HTTP; Tue, 19 Jun 2012 11:18:07 -0700 (PDT)
X-Originating-IP: [128.102.238.61]
In-Reply-To: <4FE0B7EB.6000100@gmail.com>
References: <CAF7tpEyEWCbcB+jSpWOMyeZUBjQ=FbVEC8kLt3j2Yzv3YJOgiA@mail.gmail.com>
	<4FE0B7EB.6000100@gmail.com>
Date: Tue, 19 Jun 2012 11:18:07 -0700
Message-ID: <CACh7GpEehHFEJGRTtijgM7UAa2jeEWRKrQo5dym8F_YgXAEhFA@mail.gmail.com>
From: Mark Friedenbach <mark@monetize.io>
To: Alan Reiner <etotheipi@gmail.com>
Content-Type: multipart/alternative; boundary=e89a8f5028eeec7b5504c2d74db3
X-Gm-Message-State: ALoCoQm74aiP7kAJV1tIxOJx/blTkGFDJlLso6CQf2h52s6gMc/xnRTINFXNWnFyb+duGFEgpktE
X-Spam-Score: 1.0 (+)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	1.0 HTML_MESSAGE           BODY: HTML included in message
X-Headers-End: 1Sh30H-0007H5-Qc
Cc: bitcoin-development@lists.sourceforge.net
Subject: Re: [Bitcoin-development] Ultimate Blockchain Compression w/
 trust-free lite node
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Tue, 19 Jun 2012 18:18:19 -0000

--e89a8f5028eeec7b5504c2d74db3
Content-Type: text/plain; charset=UTF-8

On Tue, Jun 19, 2012 at 10:33 AM, Alan Reiner <etotheipi@gmail.com> wrote:

> I hope that someone else here would chime in on the issue raised in the
> thread, about using a tree-structure that has multiple valid
> configurations for the same set of unspent-TxOuts.  If you use any
> binary tree, you must replay the entire history of insertions and
> deletions in the correct order to get the tree structure and correct
> root.  Along those lines, using something like a red-black tree, while
> theoretically well-known, could be subject to implementation errors.
> One implementation of a red-black tree may do the rebalancing
> differently, and still work for it's intended purpose in the majority of
> applications where it doesn't matter.  One app developer updates their
> RB tree code which updated the RB-tree optimizations/rebalancing, and
> now a significant portion of the network can't agree on the correct
> root.  Not only would that be disruptive, it would be a disaster to
> track down.
>

Then use a 2-3-4 tree (aka self-balancing B-tree of order 4), which is a
generalization of RB-trees that doesn't allow for implementation choices in
balancing (assuming ordered insertion and deletion).

As gmaxwell points out, this is an trivially fixable 'problem'. Choose a
standard, mandate it, and write test cases.

If we were to use a raw trie structure, then we'd have all the above
> issues solved:  a trie has the same configuration no matter how elements
> are inserted or deleted, and accesses to elements in the tree are
> constant time -- O(1).  There is no such thing as an unbalanced trie.
> But overall space-efficiency is an issue.
>
> A PATRICIA tree/trie would be ideal, in my mind, as it also has a
> completely deterministic structure, and is an order-of-magnitude more
> space-efficient.  Insert, delete and query times are still O(1).
> However, it is not a trivial implementation.  I have occasionally looked
> for implementations, but not found any that were satisfactory.
>

No, a trie of any sort is dependent upon distribution of input data for
balancing. As Peter Todd points out, a malicious actor could construct
transaction or address hashes in such a way as to grow some segment of the
trie in an unbalanced fashion. It's not much of an attack, but in principle
exploitable under particular timing-sensitive circumstances.

Self-balancing search trees (KVL, RB, 2-3-4, whatever) don't suffer from
this problem.

Mark

--e89a8f5028eeec7b5504c2d74db3
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div class=3D"gmail_quote">On Tue, Jun 19, 2012 at 10:33 AM, Alan Reiner <s=
pan dir=3D"ltr">&lt;<a href=3D"mailto:etotheipi@gmail.com" target=3D"_blank=
">etotheipi@gmail.com</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_q=
uote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1e=
x">

I hope that someone else here would chime in on the issue raised in the<br>
thread, about using a tree-structure that has multiple valid<br>
configurations for the same set of unspent-TxOuts. =C2=A0If you use any<br>
binary tree, you must replay the entire history of insertions and<br>
deletions in the correct order to get the tree structure and correct<br>
root. =C2=A0Along those lines, using something like a red-black tree, while=
<br>
theoretically well-known, could be subject to implementation errors.<br>
One implementation of a red-black tree may do the rebalancing<br>
differently, and still work for it&#39;s intended purpose in the majority o=
f<br>
applications where it doesn&#39;t matter. =C2=A0One app developer updates t=
heir<br>
RB tree code which updated the RB-tree optimizations/rebalancing, and<br>
now a significant portion of the network can&#39;t agree on the correct<br>
root. =C2=A0Not only would that be disruptive, it would be a disaster to<br=
>
track down.<br>
</blockquote><div><br></div><div>Then use a 2-3-4 tree (aka=C2=A0self-balan=
cing=C2=A0B-tree of order 4), which is a generalization of RB-trees that do=
esn&#39;t allow for implementation choices in balancing (assuming ordered i=
nsertion and deletion).</div>
<div><br></div><div>
As gmaxwell points out, this is an trivially fixable &#39;problem&#39;. Cho=
ose a standard, mandate it, and write test cases.</div><div><br></div><bloc=
kquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #cc=
c solid;padding-left:1ex">


If we were to use a raw trie structure, then we&#39;d have all the above<br=
>
issues solved: =C2=A0a trie has the same configuration no matter how elemen=
ts<br>
are inserted or deleted, and accesses to elements in the tree are<br>
constant time -- O(1). =C2=A0There is no such thing as an unbalanced trie.<=
br>
But overall space-efficiency is an issue.<br>
<br>
A PATRICIA tree/trie would be ideal, in my mind, as it also has a<br>
completely deterministic structure, and is an order-of-magnitude more<br>
space-efficient. =C2=A0Insert, delete and query times are still O(1).<br>
However, it is not a trivial implementation. =C2=A0I have occasionally look=
ed<br>
for implementations, but not found any that were satisfactory.<br></blockqu=
ote><div><br></div><div>No, a trie of any sort is dependent upon distributi=
on of input data for balancing. As=C2=A0<span class=3D"Apple-style-span" st=
yle=3D"border-collapse:collapse;color:rgb(34,34,34);font-family:arial,sans-=
serif;font-size:13px">Peter Todd points out, a malicious actor could constr=
uct transaction or address hashes in such a way as to grow some segment of =
the trie in an unbalanced fashion. It&#39;s not much of an attack, but in p=
rinciple exploitable under particular timing-sensitive circumstances.</span=
></div>
<div><span class=3D"Apple-style-span" style=3D"border-collapse:collapse;col=
or:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px"><br></span></=
div><div><span class=3D"Apple-style-span" style=3D"border-collapse:collapse=
;color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px">Self-bala=
ncing search trees (KVL, RB, 2-3-4, whatever) don&#39;t suffer from this pr=
oblem.</span></div>
<div><span class=3D"Apple-style-span" style=3D"border-collapse:collapse;col=
or:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px"><br></span></=
div><div><span class=3D"Apple-style-span" style=3D"border-collapse:collapse=
;color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px">Mark</spa=
n></div>
</div>

--e89a8f5028eeec7b5504c2d74db3--