From: Adam Back
To: Mike Hearn
Date: Sat, 21 Feb 2015 14:30:15 +0000
Cc: Bitcoin Dev
Subject: Re: [Bitcoin-development] bloom filtering, privacy

If you want to be constructive and index transactions that are not
p2sh but are non-simple and contain a checksig, so the address is
visible, you could do that with a block bloom filter also.

I wasn't sure whether the comments about the need to batch requests
were about downloading headers & filters, or about transactions. There
is no harm in downloading headers & bloom filters without Tor: no
identity or addresses are revealed by doing so. So over Tor you would
just be fetching transactions that match the address.

For downloading transactions, unless you frequently receive
transactions you won't be fetching every block. Or are you assuming
bloom filters dialled up to the point of huge false positives? You
said otherwise.

Mid-term I'd say you want some basic request tunneling as part of
bitcoin, that maybe isn't Tor, to avoid sharing its fate if Tor
controversies are a risk to Tor service. Some of the bitcoin-Tor
specific weak points could maybe then be addressed.

Relatedly I think bitcoin could do with a store-and-forward message
bus with privacy and strong reliability via redundancy (but maybe less
redundancy than consensus, where all nodes must receive, agree and
store forever).
That provides an efficient store-and-forward, SPV-receivable
stealth-address solution that doesn't suck: send the recipient their
payment, and if they like it they broadcast it themselves. As a bonus,
store-and-forward message mixes are better able to provide meaningful
network privacy than interactive privacy networks. You could spend
over the same channel.

You seem to be saying at one point that Tor is useless against a
pervasive eavesdropper threat model (which I am not sure I agree with:
minimally it makes them work for the info and adds uncertainty; I have
not been paying super close attention, but I think some of the Snowden
releases suggest Tor is a net win), and secondly that other types of
attackers are uninterested (how do we know that?), or maybe that you
don't care about privacy vs them (maybe some users do!).

It would certainly be nice to get real privacy from a wider range of
attackers, but nothing (the current situation) is clearly worse; using
block bloom filters we'd make the pervasive case harder work, and the
nosy full node would learn nothing.

Adam

On 21 February 2015 at 13:28, Mike Hearn wrote:
> Let's put the UTXO commitments/anti-fraud proofs to one side for a moment. I
> would like to see them happen one day, but they aren't critical to these
> protocols and are just proving to be a distraction.
>
>
>>
>> Then they make fresh random connections to different nodes and request
>> download of the respective individual transactions from the full node.
>
>
> ...
>
>> About privacy the node can make different random connections to
>> different nodes to fetch addresses ..... The full node cant
>> correlate the addresses as belonging to the same person by correlating
>> the download requests for them, because they are made via different
>> nodes.
>
>
> Apologies for the wall of text, but I don't think this will work nor solve
> any real problem. And I must justify such a strong statement clearly.
>
> First: technical issues
>
> When you download the per-block Bloom filter and test, what you get back is
> a set of script elements (addresses, keys, OP_RETURN tags etc). But then in
> the next step you are saying that you connect to random peers and request
> individual transactions. We don't know that at this point. All we know are a
> set of addresses that possibly matched. So I think what you mean is "wallets
> connect to random peers and request transactions in block N that match a
> given set of addresses".
>
> This is what Bloom filtering already does, of course. Doing the test against
> the per-block filter first doesn't seem to buy us much because with
> thousands of transactions per block, even a very tiny FP rate will still
> trigger a match on every single one.
>
> The second problem I see is that we can't do this in parallel because of the
> following edge case: wallet contains key K and someone sends it money using
> an OP_CHECKSIG output. The input which spends this output does not contain
> any predictable data, thus we do not know what to look for in the following
> blocks to detect a spend of it until we have seen the first transaction and
> know its hash.
>
> In practice this means we must either scan through the chain in sequence and
> update our matching criteria if we see such an output (this is what the
> Bloom filtering protocol already does server-side), or we must constrain the
> user such that output scripts always force repetition of predictable data -
> this is what mostly happens today due to pay-to-address outputs, but not
> always, and correctness is more important than completeness.
>
> If we can't do it in parallel then we must suffer a node round-trip for
> every single block we traverse, because we can't request long runs of blocks
> with a single command. That latency will kill performance dead. It's a non
> starter.
>
> But let's imagine we don't care about OP_CHECKSIG outputs and are willing to
> ignore them.
> There are cases where they are the best and most efficient
> technical solution, but let's put that to one side.
>
> The primary difference after making the above changes are that no one node
> gets a filter containing all our keys and addresses. I don't think a per
> block pre-test filter would gain us much efficiency so from a privacy
> perspective this is what it boils down to - sharding of the scan.
>
> But we can already do this with the current Bloom filtering protocol.
> BitcoinJ doesn't do so because having multiple parallel scans uses up
> network IOPs which are a resource of unknown quantity, and because stepping
> through the chain in parallel with multiple peers complicates the chain sync
> implementation quite a bit.
>
> Second: this doesn't solve any real problem
>
> Who cares about collecting Bloom filters off the wire?
>
> Commercial fraudsters? Doubtful. There are much easier ways to steal money.
>
> Spies? Yes! Without a doubt NSA/GCHQ are building or have built databases of
> IP addresses to Bitcoin addresses and are correlating it via XKEYSCORE with
> other identifiable information.
>
> However, just requesting data from different nodes doesn't help with that,
> because they are doing DPI and can still see all the connections, so can
> still combine all the filters or received transactions.
>
> Ah, you say, but we're requesting everything via Tor.
>
> Yes, about that. We've implemented that already. Some wallets even use it by
> default, like Alon & Chris' Bitcoin Authenticator wallet. It's just one line
> of code to activate.
>
> Unfortunately there are severe practical problems to using Tor:
>
> If you don't have a warm consensus then booting it up is very slow. We're
> already slower than our competitors like blockchain.info and
> VISA/MasterCard, we can't make this any worse.
>
> This one is possibly not that big a deal and can be solved with more
> technical tricks.
>
> Bitcoin Core's DoS strategy means anyone can block all of Tor quite
> trivially. So we'd need some complicated fallback mechanism to disable Tor
> remotely, in case someone did this.
>
> Bitcoin wire traffic isn't encrypted or authenticated so it makes it much
> easier for trolls to tamper with lots of wire traffic at once, whereas
> without Tor it's much harder.
>
> Let's ignore the fact that the Tor project insists on poking the law
> enforcement bear with rusty nails, and has been receiving tipoffs about
> plans to seize directory authorities. How much Bitcoin wallets should rely
> on Tor sticking around is a debate for some other time.
>
> There's a much simpler way to fix all of this - add opportunistic encryption
> to the wire protocol.
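
A rough way to put numbers on the per-block filter question raised in this thread is sketched below in Python. It is not taken from either message. The class BlockBloomFilter, the helpers element_fp_rate and block_false_match_rate, and every parameter value are hypothetical. The sketch hashes with SHA-256 only to stay within the standard library (BIP 37 filters use MurmurHash3), and it treats lookups as independent, so the figures are approximate. If a single lookup has false-positive rate p and a wallet watches w addresses, the chance that a block with no relevant transaction still matches is about 1 - (1 - p)^w.

```python
import hashlib
import math


class BlockBloomFilter:
    """Toy Bloom filter over the script elements of one block (illustrative only)."""

    def __init__(self, n_bits: int, n_hashes: int):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, element: bytes):
        # Assumption: SHA-256 with a counter prefix stands in for the
        # MurmurHash3 seeds that BIP 37 filters actually use.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "little") + element).digest()
            yield int.from_bytes(digest[:8], "little") % self.n_bits

    def add(self, element: bytes) -> None:
        for pos in self._positions(element):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, element: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(element)
        )


def element_fp_rate(n_bits: int, n_hashes: int, n_elements: int) -> float:
    """Standard approximation of the false-positive rate of one lookup."""
    return (1.0 - math.exp(-n_hashes * n_elements / n_bits)) ** n_hashes


def block_false_match_rate(fp_rate: float, n_wallet_addresses: int) -> float:
    """Chance that at least one wallet address falsely matches a block filter."""
    return 1.0 - (1.0 - fp_rate) ** n_wallet_addresses


if __name__ == "__main__":
    # Hypothetical block: 2000 script elements in a 20000-bit filter.
    block_filter = BlockBloomFilter(n_bits=20000, n_hashes=7)
    for i in range(2000):
        block_filter.add(b"element-%d" % i)

    p = element_fp_rate(n_bits=20000, n_hashes=7, n_elements=2000)
    for wallet_size in (5, 50, 500):
        rate = block_false_match_rate(p, wallet_size)
        print(f"wallet of {wallet_size} addresses: {rate:.3f} of blocks falsely match")
```

Under these made-up parameters the fraction of blocks fetched needlessly grows quickly with the number of watched addresses, so how often a wallet ends up fetching a block depends heavily on the filter size and wallet size assumed.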