From: Mike Hearn
To: Gregory Maxwell
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Date: Fri, 20 Feb 2015 20:03:26 +0100
Subject: Re: [Bitcoin-development] bloom filtering, privacy

> It's a straight forward idea: there is a scriptpubkey bitmap per block
> which is committed. Users can request the map, and be SPV confident
> that they received a faithful one. If there are hits, they can request
> the block and be confident there was no censoring.

OK, I see now, thanks Gregory. You're right, the use of the UTXO set in that context was confusing me.
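To check I've got the mechanics straight, here is roughly how I picture the client side. This is only a minimal sketch: the toy commitment, the plain set standing in for the bitmap, and every name in it are made up for illustration; the real filter encoding and commitment scheme are exactly the parts that would need designing.

    import hashlib
    from dataclasses import dataclass

    def filter_commitment(scripts):
        # Toy stand-in for the committed per-block map: a hash over the sorted
        # scriptPubKeys appearing in the block. The real encoding/commitment
        # is exactly what the proposal would have to pin down.
        return hashlib.sha256(b"".join(sorted(scripts))).hexdigest()

    @dataclass
    class Header:
        block_hash: str
        script_commitment: str  # assumed verifiable from the SPV header chain

    def sync(headers, wallet_scripts, fetch_filter, fetch_block):
        """Return the blocks the wallet has to download in full."""
        hits = []
        for h in headers:
            served = fetch_filter(h.block_hash)  # a set of scriptPubKeys
            if filter_commitment(served) != h.script_commitment:
                # A node can refuse to answer, but it can't serve a filter
                # that censors our scripts without breaking the commitment.
                raise ValueError("filter doesn't match the block's commitment")
            if wallet_scripts & served:
                # A hit: fetch the whole block, so the node learns nothing
                # about which of its scripts were actually ours.
                hits.append(fetch_block(h.block_hash))
        return hits

The commitment check is what upgrades "trust whatever filter the node sends" into an SPV-grade guarantee; the issues below are about everything around that.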
If I go back to when we first did Bloom filtering and imagine the same proposal being made, I guess I would have been wondering about the following issues. Perhaps they have solutions:

1. Miners have to upgrade to generate the per-block filters. Any block that doesn't have such a filter still has to be downloaded in full, so it would have taken quite a while for the bandwidth savings to materialise.

2. If checking the filter isn't a consensus rule, any miner who builds a wrong filter breaks the entire SPV userbase. With per-node filtering, a malicious or wrong node breaks an individual sync, but if the wallet user notices they don't seem to be properly synchronised they can just press "Rescan chain" and most likely get fixed. In practice broken nodes have never been reported, but it's worth considering.

3. Downloading full blocks is still a lot of data. If you have a wallet that receives tips a couple of times per day, and you open your wallet once per week, then with 1mb blocks you would be downloading ~14mb of data each time (roughly 2 hits a day x 7 days x 1mb per block). Pretty pokey even on a desktop. Much sadness if you're on mobile.

4. Header size is constant, but per-block filter size wouldn't be. So even a null sync (one that finds no hits at all) would download more data as the network got busier. Of course Bloom filtering has the same scaling problem, but only between hard disk -> RAM -> CPU rather than across the network.

5. Is it really more private? Imagine we see a hit in block 100, so we download the full block and find our transaction. But now we must also learn when that transaction is spent, so we can keep the wallet-local UTXO set up to date. So we scan forward, see another hit in block 105, and download block 105 too. The remote node can now select all transactions in block 105 that spend transactions in block 100 and narrow down which txs are ours. If we are watching a wallet that does multi-sends, it feels like this problem gets much worse.

I'd really like to find a solution that has O(1) scaling on sync. If we see nodes just as sources of UTXO data, then shovelling the client (tx, existing merkle path) pairs keyed off script prefixes would (with one additional index) be O(1) and provide the same security guarantees as Bloom filtering today. It creates lots of other problems to solve, but at least it would scale into the foreseeable future.
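Very roughly what I have in mind, again only a sketch: the prefix length, the index layout and all the names here are placeholders rather than a concrete proposal.

    from collections import defaultdict

    PREFIX_BYTES = 2  # shorter prefix -> more collisions -> more privacy, more bandwidth

    def build_index(indexed_txs):
        """Node side: the one additional index, script prefix -> (txid, merkle path).
        indexed_txs is an iterable of (txid, merkle_path, output_scripts) tuples
        the node can already produce from its block data."""
        index = defaultdict(list)
        for txid, merkle_path, scripts in indexed_txs:
            for script in scripts:
                index[script[:PREFIX_BYTES]].append((txid, merkle_path))
        return index

    def query(index, wallet_scripts):
        """Client side: fetch every (tx, merkle path) pair that shares a prefix
        with one of our scripts. We verify each merkle path against our SPV
        headers and discard the collisions locally, just as we discard Bloom
        filter false positives today."""
        prefixes = {s[:PREFIX_BYTES] for s in wallet_scripts}
        return [entry for p in sorted(prefixes) for entry in index.get(p, [])]

The prefix length would be the knob: a shorter prefix means more unrelated transactions come back over the wire, trading bandwidth for ambiguity in much the same way the Bloom filter false positive rate does now.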