From: Steve Davis
Date: Fri, 24 Feb 2017 17:49:36 -0600
To: bitcoin-dev@lists.linuxfoundation.org
Message-Id: <8F096BE1-D305-43D4-AF10-2CC48837B14F@gmail.com>
Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to attacks by third-parties, not just repo maintainers
List-Id: Bitcoin Protocol Discussion

If the 20-byte SHA1 is now considered insecure (with good reason), what about RIPEMD-160, which is the foundation of Bitcoin addresses?

Is it also susceptible to such an attack vector? What does that mean for old addresses?
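[Editorial note on the numbers behind this question; the helper below is illustrative, not part of any Bitcoin library. A P2PKH address commits to RIPEMD160(SHA256(pubkey)), a 160-bit digest. For an ideal 160-bit hash, finding *any* colliding pair costs roughly 2^80 work (the birthday bound) — the kind of bound the SHA1 collision attacks undercut — while stealing funds from an existing address requires a (second) preimage, generically about 2^160 work, so a collision break alone does not directly endanger old addresses.]

```python
import math

def generic_attack_costs(bits):
    """Brute-force costs against an ideal n-bit hash: collisions fall
    to the birthday bound 2^(n/2), while a (second) preimage of a
    *given* digest still costs about 2^n evaluations."""
    return {"collision": 2 ** (bits // 2), "preimage": 2 ** bits}

# Both SHA1 and RIPEMD-160 produce 160-bit digests.
costs = generic_attack_costs(160)
print("collision ~2^%d" % math.log2(costs["collision"]))  # collision ~2^80
print("preimage  ~2^%d" % math.log2(costs["preimage"]))   # preimage  ~2^160
```

The distinction matters for "old addresses": an attacker who can only build colliding *pairs* of their own choosing cannot target a digest someone else already published.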
etc. /s

> Date: Fri, 24 Feb 2017 11:04:54 +0100
> From: Tim Ruffing
> To: bitcoin-dev@lists.linuxfoundation.org
> Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to
>     attacks by third-parties, not just repo maintainers
> Message-ID: <1487930694.1528.1.camel@mmci.uni-saarland.de>
> Content-Type: text/plain; charset="UTF-8"
>
> On Fri, 2017-02-24 at 00:57 +0100, Aymeric Vitte via bitcoin-dev wrote:
>>
>> I have not worked on this for some time, so these are just thoughts,
>> but maybe it can render things much more difficult
>> than computing two files until the same hash is found
>>
>
> You basically rely on the idea that specific collisions are more
> difficult to find. This trick or similar tricks will not help. (And
> actually, the more files you add to the hash, the more freedom you give
> the attacker.)
>
> Even if certain collisions are more difficult to find today (which is
> certainly true), the general rule is that someone will prove you wrong
> in a year.
>
> Even if we ignore security entirely, switching to a new hash function is
> much simpler than trying to fix the usage of a broken one.
>
> Relying on SHA1 is hopeless. We have to get rid of it.
>
> Best,
> Tim
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 24 Feb 2017 16:18:43 +0100
> From: Aymeric Vitte
> To: bitcoin-dev@lists.linuxfoundation.org
> Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to
>     attacks by third-parties, not just repo maintainers
> Message-ID: <15848c1b-2873-35e8-0588-c636126257df@gmail.com>
> Content-Type: text/plain; charset=utf-8
>
> I am not sure you really read what I sent in depth: the claim that
> hashing files continuously, instead of hashing the intermediate steps,
> just gives more latitude to the attacker cannot be true when the
> attacker has absolutely no control over the past files.
>
> I did not write this as a workaround to fix SHA1, which will be dead
> sooner or later, but as a general concept that could possibly help
> whatever hash function you are using for objects that are not frozen
> but still extending (the original email stating that trees might be
> worse candidates for collisions reminded me of this). Indeed it makes
> no sense to patch SHA1 or play around with it, but this kind of
> proposal could accompany its replacement.
>
> The drawback is that you have to keep the hash state when you close the
> latest hash computation in order to start the next one.
>
> Then the question is: knowing the hash state, is it as easy to find a
> collision between two files that will be computed in the next round as
> finding a collision between two files alone?
>
> Knowing that you can probably modify the hash state with some
> unpredictable patterns
>
> Most likely the answer is: no, it's (astronomically?)
> more difficult.
>
> Please take it as a suggestion that might be explored (ps: I have the
> code for this if needed) rather than an affirmation. I am still amazed,
> as shown in the few links provided (among others), that each time I
> raise this subject nobody really pays attention ("what's the use
> case?", etc.), and by the fact that it's apparently used by only one
> project in the world and not supported by any library.
>
>
> On 24/02/2017 at 11:04, Tim Ruffing via bitcoin-dev wrote:
>> On Fri, 2017-02-24 at 00:57 +0100, Aymeric Vitte via bitcoin-dev wrote:
>>> I have not worked on this for some time, so these are just thoughts,
>>> but maybe it can render things much more difficult
>>> than computing two files until the same hash is found
>>>
>> You basically rely on the idea that specific collisions are more
>> difficult to find. This trick or similar tricks will not help. (And
>> actually, the more files you add to the hash, the more freedom you give
>> the attacker.)
>>
>> Even if certain collisions are more difficult to find today (which is
>> certainly true), the general rule is that someone will prove you wrong
>> in a year.
>>
>> Even if we ignore security entirely, switching to a new hash function is
>> much simpler than trying to fix the usage of a broken one.
>>
>> Relying on SHA1 is hopeless. We have to get rid of it.
>>
>> Best,
>> Tim
>>
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
> --
> Zcash wallets made simple: https://github.com/Ayms/zcash-wallets
> Bitcoin wallets made simple: https://github.com/Ayms/bitcoin-wallets
> Get the torrent dynamic blocklist: http://peersm.com/getblocklist
> Check the 10 M passwords list: http://peersm.com/findmyass
> Anti-spies and private torrents, dynamic blocklist: http://torrent-live.org
> Peersm: http://www.peersm.com
> torrent-live: https://github.com/Ayms/torrent-live
> node-Tor: https://www.github.com/Ayms/node-Tor
> GitHub: https://www.github.com/Ayms
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 24 Feb 2017 17:30:49 +0100
> From: Tim Ruffing
> To: bitcoin-dev@lists.linuxfoundation.org
> Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to
>     attacks by third-parties, not just repo maintainers
> Message-ID: <1487953849.5148.2.camel@mmci.uni-saarland.de>
> Content-Type: text/plain; charset="UTF-8"
>
> On Fri, 2017-02-24 at 16:18 +0100, Aymeric Vitte via bitcoin-dev wrote:
>> Not sure that you really read deeply what I sent, because stating
>> that hashing files continuously instead of hashing the intermediate
>> steps just gives more latitude to the attacker can't be true when the
>> attacker has absolutely no control over the past files
> What prevents the attacker from providing different past files when
> talking to parties who are still in the initial state?
>
>> Then the question is: knowing the hash state, is it as easy to find a
>> collision between two files that will be computed in the next round
>> as finding a collision between two files only?
> With the original usage of the hash function, the hash state is always
> the initial state.
> Now the attacker even has some control over the hash state. In other
> words, if the original use of the hash function was vulnerable, then
> your scheme is vulnerable for the initial state.
>
> Concrete attack: if you can find x != y with H(x) = H(y), then you can
> also find m, x != y, with H(m||x) = H(m||y), just by setting m = "".
>
> Not sure if this is the right place to discuss that issue though...
>
> Best,
> Tim
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 24 Feb 2017 18:29:50 +0100
> From: Aymeric Vitte
> To: Tim Ruffing, Bitcoin Protocol Discussion
> Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to
>     attakcs by third-parties, not just repo maintainers
> Content-Type: text/plain; charset=windows-1252
>
> Apparently we are not discussing the same thing.
>
> Maybe I did not provide the right links (reading them again I myself
> don't find them so clear); see again
> https://github.com/whatwg/streams/issues/33#issuecomment-28045860
>
> Take chunks a - b - c - d:
>
> hash(a)
>
> hash(a+b)
>
> etc.
>
> But you are not going to rehash from the beginning; instead:
>
> update a --> keep the remaining bytes a_ (+ hash state 1) --> digest
> a = hash(a)
>
> update a_+b from hash state 1 --> keep the remaining bytes b_ (+ hash
> state 2) --> digest a_+b = hash(a+b)
>
> etc.
>
> Basically that's similar to a real-time progressive hash of the chunks
> of a file that you are streaming and therefore don't know what will
> come next (as opposed to hashing a file that you already have); this
> could apply to trees.
>
> This is different from something like:
>
> hash(a)
>
> hash(hash(a) + hash(b))
>
> etc.
>
> There is no initial state, and the attacker can't modify what was
> already hashed; to make it more difficult you can probably modify the
> hash state N.
>
>
> On 24/02/2017 at 17:30, Tim Ruffing via bitcoin-dev wrote:
>> On Fri, 2017-02-24 at 16:18 +0100, Aymeric Vitte via bitcoin-dev wrote:
>>> Not sure that you really read deeply what I sent, because stating
>>> that hashing files continuously instead of hashing the intermediate
>>> steps just gives more latitude to the attacker can't be true when
>>> the attacker has absolutely no control over the past files
>> What prevents the attacker from providing different past files when
>> talking to parties who are still in the initial state?
>>
>>> Then the question is: knowing the hash state, is it as easy to find
>>> a collision between two files that will be computed in the next
>>> round as finding a collision between two files only?
>> With the original usage of the hash function, the hash state is
>> always the initial state. Now the attacker even has some control over
>> the hash state. In other words, if the original use of the hash
>> function was vulnerable, then your scheme is vulnerable for the
>> initial state.
>>
>> Concrete attack: if you can find x != y with H(x) = H(y), then you
>> can also find m, x != y, with H(m||x) = H(m||y), just by setting
>> m = "".
>>
>> Not sure if this is the right place to discuss that issue though...
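[Editorial sketch: Aymeric's keep-the-state scheme, and Tim's objection to it, can both be seen with Python's incremental hashlib interface — SHA-256 and the chunk values here are purely for illustration. update() carries the internal state across chunks and copy() snapshots it, so you can emit hash(a), hash(a+b), ... without rehashing from the start.]

```python
import hashlib

def rolling_digests(chunks):
    """Emit hash(c1), hash(c1+c2), ... for a stream of chunks,
    carrying the internal hash state forward instead of rehashing
    each prefix from scratch."""
    h = hashlib.sha256()
    out = []
    for chunk in chunks:
        h.update(chunk)                   # state absorbs the new chunk
        out.append(h.copy().hexdigest())  # snapshot -> digest of prefix
    return out

digests = rolling_digests([b"a", b"b", b"c"])
# Same result as one-shot hashing of each prefix:
assert digests[2] == hashlib.sha256(b"abc").hexdigest()

# Tim's objection, concretely: with an empty prefix m = b"", the "kept
# state" is just the function's standard initial state, so any plain
# collision H(x) == H(y) is also a collision for the streamed scheme.
```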
>>
>> Best,
>> Tim
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>
> ------------------------------
>
> Message: 5
> Date: Fri, 24 Feb 2017 14:20:19 -0800
> From: Bram Cohen
> To: Peter Todd
> Cc: Bitcoin Protocol Discussion
> Subject: Re: [bitcoin-dev] A Better MMR Definition
> Content-Type: text/plain; charset="utf-8"
>
> So your idea is to cluster entries by entry time because newer things
> are more likely to leave, and updating multiple things near each other
> is cheaper?
>
> That can be done with my tool. Instead of using hashes for the values
> being stored, you use position entries. The first entry gets a value of
> all zeros, the next one a one followed by all zeros, then the next two
> correspond to the first two with the second bit flipped to one, then
> the next four are the first four with the third bit flipped to one,
> etc. It probably performs a little bit better to do it two bits at a
> time instead of one, so that the entries are 00, 01, 10, 11, 0001,
> 0010, 0011, 0101, 0110, 0111, 1001, etc.
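[Editorial sketch: the one-bit position-entry sequence Bram describes is insertion index written in binary and bit-reversed into a fixed-width key. The function name and width below are mine, and the two-bit variant he prefers would group bits in pairs instead.]

```python
def position_entry(k, width=4):
    """k-th position entry (one-bit variant): k in binary, bit-reversed
    into a fixed-width key. The first entry is all zeros, the next a
    one followed by zeros, then the first two with the second bit
    flipped on, then the first four with the third bit, and so on."""
    return format(k, "0%db" % width)[::-1]

print([position_entry(k) for k in range(8)])
# ['0000', '1000', '0100', '1100', '0010', '1010', '0110', '1110']
```

Keying the trie by these entries makes consecutive insertions follow the deterministic pattern above rather than landing at uniformly random hash positions.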
> If you were to really use this you'd probably want to add some
> optimizations exploiting the fact that the terminals fit in 64 bits
> instead of 256, but it mostly works unchanged, and gets whatever
> benefits there are to this clustering, plus the high-performance
> implementation tricks I've built, which I keep complaining that
> nobody's giving feedback on.
>
> I'm not sold on this being a win: the empirical access patterns are
> unknown, it requires an extra cache miss per lookup to find the entry
> number, it may be that everything is optimized well enough without it
> for there to be no meaningful gains, and it's a bunch of extra
> complexity. What should be done is that a plain vanilla UTXO set
> solution is optimized as well as it can be first, and then the
> insertion-ordering trick is tried as an optimization to see if it's an
> improvement. Without that baseline there's no meaningful basis for
> comparison, and I'm quite confident that a naive implementation which
> just allocates individual nodes will underperform the thing I've come
> up with, even without adding optimizations related to fitting in 64
> bits.
>
> On Thu, Feb 23, 2017 at 8:36 PM, Peter Todd wrote:
>
>> On Thu, Feb 23, 2017 at 07:32:43PM -0800, Bram Cohen wrote:
>>> On Thu, Feb 23, 2017 at 7:15 PM, Peter Todd wrote:
>>>
>>>> Glad we're on the same page with regard to what's possible in TXO
>>>> commitments.
>>>>
>>>> Secondly, am I correct in saying your UTXO commitments scheme
>>>> requires random access? While you describe it as a "merkle set",
>>>> obviously to be merkelized it'll have to have an ordering of some
>>>> kind. What do you propose that ordering to be?
>>>>
>>>
>>> The ordering is by the bits in the hash. Technically it's a Patricia
>>> Trie. I'm using 'merkle tree' to refer to basically anything with a
>>> hash root.
>>
>> The hash of what? The values in the set?
>>
>>>> Maybe more specifically, what exact values do you propose to be in
>>>> the set?
>>>>
>>> That is unspecified in the implementation; it just takes a 256-bit
>>> value which is presumably a hash of something. The intention is to
>>> nail down a simple format and demonstrate good performance and leave
>>> those semantics to a higher layer. The simplest thing would be to
>>> hash together the txid and output number.
>>
>> Ok, so let's assume the values in the set are the unspent outpoints.
>>
>> Since we're ordering by the hash of the values in the set, outpoints
>> will be distributed uniformly in the set, and thus the access pattern
>> of data in the set is uniform.
>>
>> Now let's fast-forward 10 years. For the sake of argument, assume that
>> for every 1 UTXO in the set that corresponds to funds in someone's
>> wallet that are likely to be spent, there are 2^12 = 4096 UTXO's that
>> have been permanently lost (and/or created in spam attacks) and thus
>> will never be spent.
>>
>> Since lost UTXO's are *also* uniformly distributed, if I'm processing
>> a new block that spends 2^12 = 4096 UTXO's, on average for each UTXO
>> spent, I'll have to update log2(4096) = 12 more digests than I would
>> have had those "dead" UTXO's not existed.
>>
>> Concretely, imagine our UTXO set had just 8 values in it, and we were
>> updating two of them:
>>
>>                #
>>               / \
>>              /   \
>>             /     \
>>            /       \
>>           /         \
>>          #           #
>>         / \         / \
>>        /   \       /   \
>>       #     .     .     #
>>      / \   / \   / \   / \
>>     .   X .   . .   . X   .
>>
>> To mark two coins as spent, we've had to update 5 inner nodes.
>>
>>
>> Now let's look at what happens in an insertion-ordered TXO commitment
>> scheme. For the sake of argument, let's assume the best possible case,
>> where every UTXO spent in that same block was recently created.
>> Since the UTXO's are recently created, chances are almost every single
>> one of those "dead" UTXO's will have been created in the past. Thus,
>> since this is an insertion-ordered data structure, those UTXO's exist
>> in an older part of the data structure that our new block doesn't need
>> to modify at all.
>>
>> Concretely, again let's imagine a TXO commitment with 8 values in it,
>> and two of them being spent:
>>
>>                #
>>               / \
>>              /   \
>>             /     \
>>            /       \
>>           /         \
>>          .           #
>>         / \         / \
>>        /   \       /   \
>>       .     .     .     #
>>      / \   / \   / \   / \
>>     .   . .   . .   . X   X
>>
>> To mark two coins as spent, we've only had to update 3 inner nodes;
>> while our tree is higher with those lost coins, those extra inner
>> nodes are amortised across all the coins we have to update.
>>
>>
>> The situation gets even better when we look at the *new* UTXO's that
>> our block creates. Suppose our UTXO set has size n. To mark a single
>> coin as spent, we have to update log2(n) inner nodes. We do get to
>> amortise this a bit at the top levels in the tree, but even if we
>> assume the amortisation is totally free, we're updating at least
>> log2(n) - log2(m) inner nodes "under" the amortised nodes at the top
>> of the tree for *each* new node.
>>
>> Meanwhile with an insertion-ordered TXO commitment, each new UTXO
>> added to the data set goes in the same place - the end. So almost none
>> of the existing data needs to be touched to add the new UTXOs.
>> Equally, the hashing required for the new UTXO's can be done in an
>> incremental fashion that's very L1/L2 cache friendly.
>>
>>
>> tl;dr: Precisely because access patterns in TXO commitments are *not*
>> uniform, I think we'll find that from a L1/L2/etc cache perspective
>> alone, TXO commitments will result in better performance than UTXO
>> commitments.
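[Editorial sketch: Peter's inner-node counts for the two 8-leaf examples can be checked mechanically. The heap-style indexing and helper name below are my choice, not from the thread.]

```python
def inner_nodes_touched(n_leaves, spent_leaves):
    """Count the distinct inner nodes (root included) lying on the
    paths from the spent leaves to the root of a complete binary tree,
    using heap indexing: root = 1, leaves = n_leaves .. 2*n_leaves-1."""
    touched = set()
    for leaf in spent_leaves:
        node = n_leaves + leaf   # 0-based leaf -> heap index
        while node > 1:
            node //= 2           # step up to the parent
            touched.add(node)
    return len(touched)

# 8-leaf examples above: uniformly spread spends (leaves 1 and 6, the
# X's in the first tree) touch 5 inner nodes; clustered spends at the
# end (leaves 6 and 7, as in the second tree) touch only 3.
print(inner_nodes_touched(8, {1, 6}))  # 5
print(inner_nodes_touched(8, {6, 7}))  # 3
```

Clustering the spends makes their root paths share ancestors, which is exactly the amortisation the insertion-ordered scheme relies on.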
>>
>>
>> Now it is true that Bitcoin's current design means we'll need a map of
>> confirmed outpoints to TXO insertion-order indexes. But it's not
>> particularly hard to add that "metadata" to transactions on the P2P
>> layer in the same way that segwit added witnesses to transactions
>> without modifying how txids were calculated; if you only connect to
>> peers who provide you with TXO index information in blocks and
>> transactions, you don't need to keep that map yourself.
>>
>> Finally, note how this makes transactions *smaller* in many
>> circumstances: it's just an 8-byte max index rather than a 40-byte
>> outpoint.
>>
>> --
>> https://petertodd.org 'peter'[:-1]@petertodd.org
>>
> -------------- next part --------------
> An HTML attachment was scrubbed...
>
> ------------------------------
>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>
> End of bitcoin-dev Digest, Vol 21, Issue 34
> *******************************************