From: "James O'Beirne"
Date: Wed, 3 Apr 2019 15:51:32 -0400
To: Jonas Schnelli
Cc: Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] assumeutxo and UTXO snapshots
References: <816FFA03-B4D9-4ECE-AF15-85ACBFA4BA8F@jonasschnelli.ch>
In-Reply-To: <816FFA03-B4D9-4ECE-AF15-85ACBFA4BA8F@jonasschnelli.ch>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000003ea4780585a595a9"

--0000000000003ea4780585a595a9
Content-Type: text/plain; charset="UTF-8"

Thanks for the reply, Jonas. I should've figured someone had hit the
mailing list with this one before!

In hindsight, I may have overemphasized the use of this for low-powered
mobile devices. Indeed, I think this may be a worthwhile optimization
for common hardware too.

On the margin, if a user wants to interact with Bitcoin they will
download software that allows them to do it immediately, which results
in many people defaulting to a light client. If Bitcoin were able to
initialize from scratch in a comparable amount of time and then populate
the full chain in the background, we might have many more people
*incidentally* running full nodes.

Regardless of whether or not we use UTXO snapshots per se, I'd argue
that the pattern of doing some kind of quick initialization (whether
it's with assumed-valid data, or headers-contingent data like BIP157)
and then performing full validation in the background is a good way to
ensure that we have a healthier population of full nodes than we would
otherwise.

For this reason, and (as Ethan points out) because IBD's linear setup
time is infeasible in the long term, I think this pattern is an obvious
direction for the bitcoin client to go.

> * Do we semi-trust the peer that serves the UTXO set (compared to a
> block or tx which we can validate)? What channel do we use to serve
> the snapshot?

As you note in your post from 2016, where and how we retrieve the
snapshot is more or less immaterial because we compare a hash of its
contents to a previously specified value that the code ships with (the
`assumeutxo` hash). We don't need to trust the source serving it to us,
although bandwidth DoS prevention via some kind of chunked delivery from
peers would be worth thinking about.

Regards,
James

On Wed, Apr 3, 2019 at 2:37 AM Jonas Schnelli wrote:

> Thanks James for the post.
>
> I proposed a similar idea [1] back in 2016 with the difference of
> signing the UTXO-set hash in a gitian-ish way.
>
> While the idea of UTXO-set syncs is attractive, there are probably
> still significant downsides in usability (compared to models with less
> security), mainly:
> * Assuming the UTXO set is 6 weeks old (which seems a reasonable age
>   for providing enough security), a peer using that snapshot would
>   still need to download and verify ~6048 blocks (~7.9GB at 1.3MB
>   blocks, … probably CPU-days on a phone)
> * Do we semi-trust the peer that serves the UTXO set (compared to a
>   block or tx which we can validate)? What channel do we use to serve
>   the snapshot?
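
The figures in the first bullet can be checked with a few lines of
arithmetic. This is a minimal sketch, assuming the 10-minute target
block interval (144 blocks per day) and decimal units (1 GB = 1000 MB);
the function names and parameters are hypothetical and only serve the
illustration.

```python
# Assumption: blocks arrive at the 10-minute target interval.
BLOCKS_PER_DAY = 24 * 60 // 10  # 144


def catchup_blocks(snapshot_age_weeks):
    """Blocks between a snapshot's base and the tip for a given snapshot age."""
    return snapshot_age_weeks * 7 * BLOCKS_PER_DAY


def catchup_download_gb(snapshot_age_weeks, avg_block_mb=1.3):
    """Download needed to catch up to the tip, in GB, at an average block size."""
    return catchup_blocks(snapshot_age_weeks) * avg_block_mb / 1000


print(catchup_blocks(6))                     # blocks to fetch for a 6-week-old snapshot
print(round(catchup_download_gb(6), 1))      # corresponding download in GB
```

For a 6-week-old snapshot this reproduces the ~6048 blocks and ~7.9GB
quoted above; the CPU-time estimate is not modeled here.
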
>
> If the goal is to run a full node on a consumer device that is also
> being used for other CPU-intensive operations (like a phone, etc.),
> I’m not sure if this proposal will lead to a satisfactory user
> experience.
>
> The longer I think about this problem, the more I lean towards
> accepting the fact that one needs to use dedicated hardware in one's
> own environment to perform a painless full validation.
>
> /jonas
>
> [1]
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-February/012478.html
>
> > On 02.04.2019 at 22:43, James O'Beirne via bitcoin-dev
> > <bitcoin-dev@lists.linuxfoundation.org> wrote:
> >
> > Hi,
> >
> > I'd like to discuss assumeutxo, which is an appealing and simple
> > optimization in the spirit of assumevalid[0].
> >
> > # Motivation
> >
> > To start a fully validating bitcoin client from scratch, that client
> > currently needs to perform an initial block download. To the
> > surprise of no one, IBD takes a linear amount of time based on the
> > length of the chain's history. For clients running on modest
> > hardware with limited bandwidth, say a mobile device, completing IBD
> > takes a considerable amount of time and thus poses serious usability
> > challenges.
> >
> > As a result, having fully validating clients run on such hardware is
> > rare and basically unrealistic. Clients with even moderate resource
> > constraints are encouraged to rely on the SPV trust model. Though we
> > have promising improvements to existing SPV modes pending
> > deployment[1], it's worth thinking about a mechanism that would
> > allow such clients to use trust models closer to full validation.
> >
> > The subject of this mail is a proposal for a complementary
> > alternative to SPV modes, one which is in the spirit of an existing
> > default, `assumevalid`. It may help modest clients transact under a
> > security model that closely resembles full validation within minutes
> > instead of hours or days.
> >
> > # assumeutxo
> >
> > The basic idea is to allow nodes to initialize using a serialized
> > version of the UTXO set rendered by another node at some
> > predetermined height. The initializing node syncs the headers chain
> > from the network, then obtains and loads one of these UTXO snapshots
> > (i.e. a serialized version of the UTXO set bundled with the block
> > header indicating its "base" and some other metadata).
> >
> > Based upon the snapshot, the node is able to quickly reconstruct its
> > chainstate, and compares a hash of the resulting UTXO set to a
> > preordained hash hard-coded in the software a la assumevalid. This
> > all takes ~23 minutes, not accounting for download of the 3.2GB
> > snapshot[2].
> >
> > The node then syncs to the network tip and afterwards begins a
> > simultaneous background validation (i.e., a conventional IBD) up to
> > the base height of the snapshot in order to achieve full validation.
> > Crucially, even while the background validation is happening the
> > node can validate incoming blocks and transact with the benefit of
> > the full (assumed-valid) UTXO set.
> >
> > Snapshots could be obtained from multiple separate peers in the same
> > manner as block download, but I haven't put much thought into this.
> > In concept it doesn't matter too much where the snapshots come from
> > since their validity is determined via content hash.
> >
> > # Security
> >
> > Obviously there are some security implications that are due
> > consideration. While this proposal is in the spirit of assumevalid,
> > practical attacks may become easier. Under assumevalid, a user can
> > be tricked into transacting under a false history if an attacker
> > convinces them to start bitcoind with a malicious `-assumevalid`
> > parameter, sybils their node, and then feeds them a bogus chain
> > encompassing all of the hard-coded checkpoints[3].
> >
> > The same attack is made easier in assumeutxo because, unlike in
> > assumevalid, the attacker need not construct a valid PoW chain to
> > get the victim's node into a false state; they simply need to get
> > the user to accept a bad `-assumeutxo` parameter and then supply
> > them an easily made UTXO snapshot containing, say, a false coin
> > assignment.
> >
> > For this reason, I recommend that if we were to implement
> > assumeutxo, we not allow its specification via command-line
> > argument[4].
> >
> > Beyond this risk, I can't think of material differences in security
> > relative to assumevalid, though I appeal to the list for help with
> > this.
> >
> > # More fully validating clients
> >
> > A particularly exciting use-case for assumeutxo is the possibility
> > of mobile devices functioning as fully validating nodes with access
> > to the complete UTXO set (as an alternative to SPV models). The
> > total resource burden needed to start a node from scratch based on a
> > snapshot is, at time of writing, a ~(3.2GB + blocks_to_tip * 4MB)
> > download and a few minutes of processing time, which sounds
> > manageable for many mobile devices currently in use.
> >
> > A mobile user could initialize an assumed-valid bitcoin node within
> > an hour, transact immediately, and complete a pruned full validation
> > of their assumed-valid chain over the next few days, perhaps only
> > doing the background IBD when their device has access to suitable
> > high-bandwidth connections.
> >
> > If we end up implementing an accumulator-based UTXO scaling
> > design[5][6] down the road, it's easy to imagine an analogous
> > process that would allow very fast startup using an accumulator of a
> > few kilobytes in lieu of a multi-GB snapshot.
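
As a rough illustration of the `~(3.2GB + blocks_to_tip * 4MB)` figure
in the section above: a minimal sketch, assuming decimal units (1 GB =
1000 MB) and, as the formula itself does, counting every block between
the snapshot base and the tip as a full 4MB. The function name and its
parameters are hypothetical and not part of any Bitcoin Core interface.

```python
def snapshot_bootstrap_download_gb(blocks_to_tip, snapshot_gb=3.2, block_mb=4.0):
    """Estimated download, in GB, to start a node from a UTXO snapshot.

    snapshot_gb and block_mb default to the figures quoted in the text;
    both are assumptions that change as the chain grows.
    """
    if blocks_to_tip < 0:
        raise ValueError("blocks_to_tip must be non-negative")
    return snapshot_gb + blocks_to_tip * block_mb / 1000


# Example: a snapshot whose base is six weeks (6 * 7 * 144 blocks) behind the tip.
estimate = snapshot_bootstrap_download_gb(6 * 7 * 144)
print(f"{estimate:.1f} GB")
```

Because real blocks are usually smaller than 4MB, this reads as a
pessimistic estimate; passing a smaller `block_mb` gives a figure based
on an assumed average block size instead.
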
> >
> > ---
> >
> > I've created a related issue at our GitHub repository here:
> >   https://github.com/bitcoin/bitcoin/issues/15605
> >
> > and have submitted a draft implementation of snapshot usage via RPC
> > here:
> >   https://github.com/bitcoin/bitcoin/pull/15606
> >
> > I'd like to discuss here whether this is a good fit for Bitcoin
> > conceptually. Concrete plans for deployment steps should be
> > discussed in the GitHub issue, and after all that my implementation
> > may be reviewed as a sketch of the specific software changes
> > necessary.
> >
> > Regards,
> > James
> >
> >
> > [0]: https://bitcoincore.org/en/2017/03/08/release-0.14.0/#assumed-valid-blocks
> > [1]: https://github.com/bitcoin/bips/blob/master/bip-0157.mediawiki
> > [2]: as tested at height 569895, on a 12 core Intel Xeon Silver 4116
> >      CPU @ 2.10GHz
> > [3]: https://github.com/bitcoin/bitcoin/blob/84d0fdc/src/chainparams.cpp#L145-L161
> > [4]: Marco Falke is due credit for this point
> > [5]: utreexo: https://www.youtube.com/watch?v=edRun-6ubCc
> > [6]: Boneh, Bunz, Fisch on accumulators:
> >      https://eprint.iacr.org/2018/1188
> >
> > _______________________________________________
> > bitcoin-dev mailing list
> > bitcoin-dev@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

--0000000000003ea4780585a595a9--