From: Adam Back <adam.back@gmail.com>
To: Mike Hearn
Date: Fri, 20 Feb 2015 17:59:03 +0000
Cc: Bitcoin Dev
Subject: Re: [Bitcoin-development] bloom filtering, privacy

The idea is not mine; some random guy appeared in #bitcoin-wizards one day and said something about it, and lots of people reacted, "wow, why didn't we think of that before?"

It goes something like this: each block contains a commitment to a bloom filter that has all of the addresses in the block stored in it. The user downloads the headers and the bloom data for all blocks. They know the bloom data is correct in an SPV sense because of the commitment. They can scan it offline and locally by searching for addresses from their wallet in it. I'm not sure offhand what the most efficient strategy is; it's probably pretty fast locally anyway. Now they know (modulo false positives) which of their addresses may be in the block.

So now they ask a full node for merkle paths + transactions for those addresses from the UTXO set, from the block(s) they were found in.

Separately, UTXO commitments could optionally be combined to improve security in two ways:

- the normal SPV increase: you can also see that the transaction is actually in the last block's UTXO set.
- to avoid withholding by the full node: if the UTXO commitment is a trie (sorted), they can expect merkle paths to the lexically adjacent nodes on either side of where the claimed-missing address would be, as a proof that there really are no transactions for that address in the block. (This distinguishes a false positive from node withholding.)

Adam

On 20 February 2015 at 17:43, Mike Hearn wrote:
> Ah, I see, I didn't catch that this scheme relies on UTXO commitments
> (presumably with Mark's PATRICIA tree system?).
>
> If you're doing a binary search over block contents then does that imply
> multiple protocol round trips per synced block? I'm still having trouble
> visualising how this works. Perhaps you could write down an example run
> for me.
>
> How does it interact with the need to download chains rather than
> individual transactions, and do so without round-tripping to the remote
> node for each block? Bloom filtering currently pulls down blocks in
> batches without much client/server interaction and that is useful for
> performance.
>
> Like I said, I'd rather just junk the whole notion of chain scanning and
> get to a point where clients are only syncing headers. If nodes were
> calculating a script->(outpoint, merkle branch) map in LevelDB and
> allowing range queries over it, then you could quickly pull down relevant
> UTXOs along with the paths that indicated they did at one point exist.
> Nodes can still withhold evidence that those outputs were spent, but the
> same is true today and in practice this doesn't seem to be an issue.
>
> The primary advantage of that approach is it does not require a change to
> the consensus rules. But there are lots of unanswered questions about how
> it interacts with HD lookahead and so on.
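For concreteness, the committed-filter scan described above might look something like the sketch below. Everything here is illustrative, not any deployed format: the filter size, the number of hash functions, the commitment scheme (plain SHA-256 of the filter bytes), and all function names are assumptions made for the example. The point is only the shape of the protocol: verify the filter against the header commitment, then test wallet addresses locally and privately.

```python
# Hypothetical client-side scan of committed per-block bloom filters.
# Assumes each block header commits to SHA-256(filter bytes); parameters
# and names are illustrative, not a real wire format.
import hashlib

NUM_HASHES = 8  # illustrative number of bloom hash functions

def bloom_positions(item: bytes, filter_bits: int):
    # Derive NUM_HASHES bit positions from double-SHA256 of (index || item).
    for i in range(NUM_HASHES):
        h = hashlib.sha256(hashlib.sha256(bytes([i]) + item).digest()).digest()
        yield int.from_bytes(h[:8], "big") % filter_bits

def bloom_add(filter_bytes: bytearray, item: bytes) -> None:
    # Used by the block producer when building the committed filter.
    bits = len(filter_bytes) * 8
    for p in bloom_positions(item, bits):
        filter_bytes[p // 8] |= 1 << (p % 8)

def bloom_contains(filter_bytes: bytes, item: bytes) -> bool:
    # False positives possible, false negatives not.
    bits = len(filter_bytes) * 8
    return all(filter_bytes[p // 8] >> (p % 8) & 1
               for p in bloom_positions(item, bits))

def scan_block(header_commitment: bytes, filter_bytes: bytes, wallet_addrs):
    # SPV check: the downloaded filter must match the header commitment.
    if hashlib.sha256(filter_bytes).digest() != header_commitment:
        raise ValueError("bloom data does not match header commitment")
    # Local, private scan: which of our addresses *might* be in this block.
    return [a for a in wallet_addrs if bloom_contains(filter_bytes, a)]
```

Any address returned by scan_block is then followed up with a request to a full node for the merkle paths + transactions; a false positive simply means the node has nothing to return for that address.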