From: Mike Hearn
To: Peter Todd
Cc: Bitcoin Dev
Date: Wed, 17 Jul 2013 14:29:26 +0200
Subject: Re: [Bitcoin-development] SPV bitcoind? (was: Introducing BitcoinKit.framework)
In-Reply-To: <20130717105853.GA10083@savin>

Partial UTXO sets are a neat idea. Unfortunately my intuition is that many SPV wallets only remain open for <1 minute at a time because the user wants to see they received money, or to send it. It'd be neat to get some telemetry from the Android wallet for this - I will ask Andreas to let users opt in to usage statistics.

So for anti-DoS I think smart prioritisation heuristics are the way to go again. Perhaps by letting clients have an "identity" that they provide to a node when it's load shedding. Clients that have been seen before, have a track record of not being abusive etc get priority and new clients that were never seen before get dropped. Coming up with a way to do that whilst preserving privacy sounds like an interesting cryptographic challenge.
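
To make the prioritisation idea concrete, here is a minimal sketch of what identity-based load shedding could look like. Everything in it is an assumption for illustration: the `LoadShedder` class, the opaque string identities and the score increments are hypothetical, and it ignores the privacy question entirely.

```python
class LoadShedder:
    """Hypothetical per-node record of how clients have behaved in the past."""

    def __init__(self) -> None:
        self.score: dict[str, int] = {}  # identity -> track record

    def record_good_behaviour(self, identity: str) -> None:
        self.score[identity] = self.score.get(identity, 0) + 1

    def record_abuse(self, identity: str) -> None:
        # The penalty size is arbitrary; it only has to outweigh a few good visits.
        self.score[identity] = self.score.get(identity, 0) - 10

    def clients_to_drop(self, connected: list[str], capacity: int) -> list[str]:
        """Return the clients to disconnect so that only `capacity` remain.

        Clients never seen before score 0, so they rank below any client
        with a positive track record and above known abusers.
        """
        ranked = sorted(connected, key=lambda c: self.score.get(c, 0), reverse=True)
        return ranked[capacity:]


shedder = LoadShedder()
for _ in range(3):
    shedder.record_good_behaviour("old-friend")
shedder.record_abuse("abuser")

connected = ["newcomer", "abuser", "old-friend"]
dropped_first = shedder.clients_to_drop(connected, capacity=2)
dropped_next = shedder.clients_to_drop(connected, capacity=1)
```

With room for two clients only the known abuser is dropped; with room for one, the newcomer goes as well and the client with a track record stays.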


On Wed, Jul 17, 2013 at 12:58 PM, Peter Todd <pete@petertodd.org> wrote:
On Tue, Jul 16, 2013 at 04:16:23PM +0200, Wendell wrote:
> Hello everyone,
>
> In the previous thread, I expressed interest in seeing an SPV bitcoind, further stating that I would fund such work. Mike Hearn followed up with some of Satoshi's old code for this, which is now quite broken. The offer and interest on my side still stand, as more diversity in SPV options seems like the right way to go.
>
> Time-permitting, I would really appreciate feedback from knowledgeable parties about the possible approaches to an SPV bitcoind. We at Hive ideally want to see something that could one day be merged into master, rather than a fork.

Keep in mind that SPV mode is newer than many realize: bloom filters are
a 0.8 feature, itself released only last February. As John Dillon posted
earlier this week in "Protecting Bitcoin against network-wide DoS
attack" the Bitcoin codebase will have to implement much better anti-DoS
attack defences soon, and in a decentralized system there aren't any options other than requiring peers to either do work (useful or not) or
sacrifice something of value. SPV peers can't do useful work, leaving only sacrifice - to what extent and how much is unknown. In addition SPV nodes have serious privacy issues because their peers know that any
transaction sent to them by the SPV node is guaranteed to be from the
node rather than relayed; bloom filters are only really helpful with
payment protocols that don't exist yet and don't apply to merchants.
Then you have MITM problems, vulnerability to fake blocks etc.

It'll be a while before we know how serious these issues are in practice,
and we're likely to find new issues we didn't think of too. In any case
Bitcoin is far better off if we make it easy to run a full node,
donating whatever resources you can. Fortunately there's a whole
continuum between SPV and full nodes.

The way you do this is by maintaining partial UTXO sets. The trick is
that if you have verified every block in some range i to j, every time
you see a txout created by a transaction, and not subsequently spent,
you can be sure that at height j the txout existed. If height j is the
current block, you can be sure the txout exists provided that the chain
itself is valid. Any transaction that only spends txouts in this partial set is a transaction you can fully verify and safely relay; for other
transactions you just don't know and have to wait until you see them in
a block.
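
As an illustration of the rule above, here is a minimal sketch of a partial UTXO set built from a verified block range. The `Tx` and `PartialUtxoSet` names are hypothetical, transactions are reduced to their outpoints, and "verify" here only means "all inputs are known to be unspent"; a real node would also check scripts and amounts.

```python
from dataclasses import dataclass, field

# An outpoint identifies one transaction output: (txid, output index).
Outpoint = tuple[str, int]


@dataclass
class Tx:
    txid: str
    inputs: list[Outpoint]  # outpoints this transaction spends
    n_outputs: int          # number of outputs it creates


@dataclass
class PartialUtxoSet:
    """Txouts created, and not subsequently spent, in the verified range i..j."""
    unspent: set[Outpoint] = field(default_factory=set)

    def connect_block(self, txs: list[Tx]) -> None:
        """Apply one verified block; blocks must be applied oldest first."""
        for tx in txs:
            for outpoint in tx.inputs:
                # Inputs created before height i are unknown to us; skip them.
                self.unspent.discard(outpoint)
            for index in range(tx.n_outputs):
                self.unspent.add((tx.txid, index))

    def can_fully_verify(self, tx: Tx) -> bool:
        """True only if the tx has inputs and all of them are in the partial set."""
        return bool(tx.inputs) and all(o in self.unspent for o in tx.inputs)


utxos = PartialUtxoSet()
utxos.connect_block([Tx("a", [], 2)])          # height i: "a" creates two outputs
utxos.connect_block([Tx("b", [("a", 0)], 1)])  # height j: "b" spends a:0

relayable = Tx("c", [("a", 1), ("b", 0)], 1)   # spends only known txouts
unknown = Tx("d", [("z", 0)], 1)               # input predates height i
```

Here `relayable` can be checked and relayed immediately, while `unknown` has to wait until it shows up in a block.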

So what's useful about that? Basically it means your node starts with the same security level, and usefulness to the network, as a SPV node.
But over time you keep downloading blocks as they are created, and with
whatever bandwidth you have left (out of some user-configurable
allocation) you download additional blocks going further and further
back in time. Gradually your UTXO set becomes more complete, and over
time you can verify a higher and higher % of all valid transactions.
Eventually your node becomes a full node, but in the meantime it was
still useful for the user, and still contributed to the network by
relaying blocks and an increasingly large subset of all transactions.
(optionally you can store a subset of the chain history too for other
nodes to bootstrap from) You've also got better security because you
*are* validating blocks, starting off incompletely, and increasingly
completely until you're finally validating fully. Privacy is improved,
for both you and others, by mixing your transactions with others and adding
to the overall anonymity set.

In the future we'll have miners commit a hash of the UTXO set, and that
gives us even more options to, for instance, have relayed transactions
include proof that their inputs were valid, allowing all nodes to relay
them safely.


As for specifics, you need to maintain a UTXO set, and in addition a set
of spent txouts (the STXO set) for which you haven't seen the
transaction that created the txout. As you download newer blocks you update
the UTXO set; as you download older blocks you update the UTXO set and
STXO set.
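
The bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: `PartialNodeState` and the tuple-based transaction shape are hypothetical, and the sketch also records unknown spends seen in newer blocks in the STXO set, which the text does not spell out but which seems necessary so that a later backfill does not mistake those txouts for unspent ones.

```python
Outpoint = tuple[str, int]
# A transaction is reduced to (txid, spent outpoints, number of outputs).
Tx = tuple[str, list[Outpoint], int]


class PartialNodeState:
    def __init__(self) -> None:
        self.utxo: set[Outpoint] = set()  # known unspent txouts
        self.stxo: set[Outpoint] = set()  # spent txouts whose creation we haven't seen

    def add_newer_block(self, txs: list[Tx]) -> None:
        """Extend the validated range forward by one block."""
        for txid, inputs, n_outputs in txs:
            for outpoint in inputs:
                if outpoint in self.utxo:
                    self.utxo.remove(outpoint)
                else:
                    self.stxo.add(outpoint)  # created before our oldest block
            for index in range(n_outputs):
                self.utxo.add((txid, index))

    def add_older_block(self, txs: list[Tx]) -> None:
        """Extend the validated range backward by one block."""
        # Pass 1: outputs. One already in the STXO set was spent later on.
        for txid, _inputs, n_outputs in txs:
            for index in range(n_outputs):
                outpoint = (txid, index)
                if outpoint in self.stxo:
                    self.stxo.remove(outpoint)
                else:
                    self.utxo.add(outpoint)
        # Pass 2: inputs. They spend txouts from this block or an older one.
        for _txid, inputs, _n_outputs in txs:
            for outpoint in inputs:
                if outpoint in self.utxo:
                    self.utxo.remove(outpoint)  # created and spent in this block
                else:
                    self.stxo.add(outpoint)


block1 = [("a", [], 2)]
block2 = [("b", [("a", 0)], 1)]
block3 = [("c", [("b", 0)], 1)]

node = PartialNodeState()
node.add_newer_block(block3)  # start at the tip, like an SPV node would
node.add_older_block(block2)  # then walk backwards through history
node.add_older_block(block1)
```

Once the backfill reaches the first block the STXO set is empty and the UTXO set matches what a full node would hold for the same chain.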

Nodes now advertise this new variable to their peers:

nOldestBlock - The oldest block that we've validated. (and all
subsequent blocks)

We'll also want the ability to advertise what sub-ranges of the
blockchain data we have on hand:

listArchivedBlockRanges - list of (begin, end) pairs

Nodes should drop all but the largest n pairs, say 5 or something. The
index -1 is reserved to indicate the last block to make it easy to
advertise that you have every block starting at some height to the most
recent. (reserving -n with n as the last block might be a better choice
to show intent, but still allow for specific proofs when we get node
identities)
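
A small sketch of how a node might trim its advertised ranges, assuming "largest" means the ranges covering the most blocks. The function names and the `TIP` constant are made up for illustration; only the -1 convention and the "keep n, say 5" rule come from the text above.

```python
TIP = -1  # reserved end index meaning "through the most recent block"


def range_length(block_range: tuple[int, int], tip_height: int) -> int:
    """Number of blocks in an inclusive (begin, end) range."""
    begin, end = block_range
    if end == TIP:
        end = tip_height
    return end - begin + 1


def ranges_to_advertise(
    ranges: list[tuple[int, int]], tip_height: int, n: int = 5
) -> list[tuple[int, int]]:
    """Keep only the n largest ranges, returned in ascending block order."""
    by_size = sorted(ranges, key=lambda r: range_length(r, tip_height), reverse=True)
    return sorted(by_size[:n])


archived = [(0, 99), (500, 520), (1000, 1004), (1200, 1500), (1600, 1601), (2000, TIP)]
advertised = ranges_to_advertise(archived, tip_height=2600, n=3)
```

With a tip at height 2600, the open-ended range starting at 2000 is the largest, followed by 1200-1500 and 0-99; the three small fragments are dropped.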

We probably want to define a NODE_PARTIAL service bit or something; I'll
have to re-read Pieter Wuille's proposal and think about it. Nodes
should NOT advertise NODE_NETWORK unless they have the full chain and
have verified it.

Nodes with partial peers should only relay transactions to those peers
if the transactions spend inputs the peers know about - remember how
even an SPV node has that information if it's not spending unconfirmed
inputs it didn't create. Nodes will have to update their peers
periodically as nOldestBlock changes. That said it may also be
worthwhile to simply relay all transactions in some cases too - a
reasonable way to approach this might be to set a bloom filter for tx's
that you *definitely* want, and if you are interested in everything,
just set the filter to all 1's. If someone comes up with a reasonable
micropayment or proof-of-work system even relaying txs that you haven't
validated is fine - the proof-of-work and prioritization will prevent
DoS attacks just fine.
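
One way to read the relay rule above is as a height comparison: a partial peer that has validated everything from nOldestBlock to the tip knows about any txout created at or after that height. The sketch below makes that assumption explicit. The `Peer` class and `should_relay` function are hypothetical, the `wants_everything` flag stands in for an all-ones bloom filter rather than implementing a real one, and unconfirmed inputs are conservatively treated as unknown to the peer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Peer:
    n_oldest_block: int             # oldest height the peer has validated from
    wants_everything: bool = False  # stand-in for an all-ones bloom filter


def should_relay(input_heights: list[Optional[int]], peer: Peer) -> bool:
    """Decide whether to relay a transaction to a partial peer.

    input_heights holds the confirmation height of the txout spent by each
    input, or None if that txout is still unconfirmed.
    """
    if peer.wants_everything:
        return True
    return all(
        height is not None and height >= peer.n_oldest_block
        for height in input_heights
    )


partial_peer = Peer(n_oldest_block=240_000)
greedy_peer = Peer(n_oldest_block=240_000, wants_everything=True)
```

A transaction spending outputs from heights 245000 and 246500 would go to both peers; one that also spends an output from height 100000 would only go to the peer that asked for everything.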

Remember that if you're running a partial node, it can get new blocks from any partial node, and it can retrieve historic blockchain data from any partial node that has archived the sequence of blocks you need next. On a large scale this is similar to how in BitTorrent you can serve data to your peers the moment you get it - a significant scalability
improvement for the network as a whole. Even if a large % of the network was partial nodes running for just a few hours a day the whole system
would work fine due to how partial nodes can serve each other the data
they need.
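
Finding a peer for the next piece of history could look roughly like this, assuming peers advertise (begin, end) ranges with -1 meaning the tip as described earlier. The function names and the peer table are invented for the example.

```python
TIP = -1  # reserved end index meaning "through the most recent block"


def covers(block_range: tuple[int, int], height: int, tip_height: int) -> bool:
    """True if the inclusive (begin, end) range contains the given height."""
    begin, end = block_range
    if end == TIP:
        end = tip_height
    return begin <= height <= end


def peers_with_block(
    peers: dict[str, list[tuple[int, int]]], height: int, tip_height: int
) -> list[str]:
    """Names of peers whose advertised archive ranges include `height`."""
    return [
        name
        for name, ranges in peers.items()
        if any(covers(r, height, tip_height) for r in ranges)
    ]


peers = {
    "alice": [(0, 1000)],
    "bob": [(150_000, TIP)],
    "carol": [(900, 1200), (200_000, TIP)],
}

n_oldest_block = 1001              # we have validated 1001 through the tip
next_needed = n_oldest_block - 1   # so the next block to backfill is 1000
candidates = peers_with_block(peers, next_needed, tip_height=247_000)
```

Here alice and carol can both serve block 1000, while bob can only help with recent blocks.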

On startup you can act as a SPV node temporarily, asking for
filtered blocks matching your wallet, and then go back and get the full
blocks, or just download the full blocks right away. That's a tradeoff
on how long the node has been off.

Anyway, it's a bit more code compared to pure-SPV, but it results in a
much more scalable Bitcoin, and if you can spare the modest bandwidth
requirements to keep up with the blockchain it'll result in much better
robustness against DoS attacks for you and Bitcoin in general.

--
'peter'[:-1]@petertodd.org
