Mimblewimble and scriptless scripts
Andrew Poelstra (Blockstream)
https://www.youtube.com/watch?v=ovCBT1gyk9c
https://twitter.com/kanzure/status/954531580848082944
We're about ready for the second session of RWC. We are going to move on to cryptocurrencies. So we have 3 talks lined up in this session. The first one is on mimblewimble and scriptless scripts, delivered by Andrew Poelstra from Blockstream.
Introduction
Thank you. Cool. Hello. My name is Andrew Poelstra. I am the research director at Blockstream and I'm here to talk about a couple of things that I've been thinking about over the past year, called mimblewimble, and also scriptless scripts. Maybe you've heard of this. I've given similar talks to this a few times before at more blockchain/bitcoin-oriented conferences. This is the first time that I'm talking to a general cryptography audience. So I'll try to tailor the talk to a more general cryptographer audience member.
What is a blockchain?
As the first speaker of the cryptocurrency sessions today, I should talk a bit about what blockchains are. A blockchain here is just a list of blocks. It's a merkleized list of blocks. Each block commits to the previous one. Each block also commits to a pile of other data, called transactions. And there are certain rules about what those transactions are.
The real innovative part of bitcoin is that there's a global consensus system on what the blockchain is. For my purposes, and the next couple of talks, we're just going to take that for granted by saying that somehow everyone has the same view of the system, and we're not going to worry too much about how they are going to achieve that.
In bitcoin, every transaction has a pile of inputs (references to previous outputs), a pile of outputs, and the essentially the only rule beyond that the transactions are valid is that each input can only be used in one transaction. You can't double spend.
If you are somebody joinin the bitcoin system, the design goal is for the system to be trustless (publicly verifiable). Anybody can download the blockchain, the series of blocks and all transactions. They validate all transactions check that each one is legitimate and has all of the necessary signatures with no conflicts. After doing htis process, they are left with a set of unspent transaction outputs, called the UTXO set. And basically every single cryptocurrency that you here about has some form of this model. In bitcoin, the system state is described by the unspent transaction outputs, those are the coins that can be spent in future transactions. In zcash or monero they use a different model where they track spent outputs because the unspent ones are not publicly visible. In ethereum it's more complex, but the idea is the same. You download all the transactions, you verify them in a row, and you're left with a state that you know came from a legitimate history.
Mimblewimble
http://diyhpl.us/wiki/transcripts/sf-bitcoin-meetup/2016-11-21-mimblewimble/
Mimblewimble was proposed a year and a half ago by this character Tom Elvis Judesor. It's a slighlty design. People can verify the state of the system without downloading all of the background data. Somehow it's possible to compress out all transactions. A lot of the data is redundant, and you don't need it to get full public verifiability.
One quick note. I said "this character Tom Elvis Judesor". We're in Switzerland. Maybe there are some French speakers that recognize this, but it's the name of Voldemort in the Harry Potter books in the French translation. It's not a real name, as far as I'm aware.
This mimblewimble system was originally proposed in the form of a text document that was published on a tor hidden service and dread-dropped on an IRC channel about a year and a half ago. That's where all of this comes from. The author just dropped it, signed off a minute later and has never come back as far as any of us are aware.
Talk outline
So I want to talk about two things. Mimblewimble and scriptless scripts. The first topic is mimblewimble, and in particular how do we get this kind of compression. And the second topic is scriptless scripts, which are given this scompression which is given by restricting the transaction structure to be very simple and highly structured-- how can we still do anything interesting? The cool part about bitcoin isn't just that you can send coins around, but also that you can attach various conditions to their spending. You can look for hash preimages, you can do multisignatures, you can do fun things that build up "smart contracts" I guess is the buzzword regarding these features.
Confidential transactions and pedersen commitments
https://www.youtube.com/watch?v=ovCBT1gyk9c&t=4m45s
Let me before I begin to explain mimblewimble, let me explain confidential transactions. It's a scheme devised by Gregory Maxwell a couple of years ago where you have a bitcoin-like system but all of the amounts are hidden. The amounts are replaced by homomorphic commitments. In particular, these pedersen commitments whose structure I've displayed here. And the nice hting about these commimtnets being homomorphic is that people who download these transactions can verify what they need to verify about these amounts using the homomorphic property. In particular, this means being able to verify that the sum of the inputs is equal to the sum of the ouputs. This tells the user that no coins were created or destroyed. Verifiers don't care about anything else, only to verify that the system is working properly.
As a second complication, and this is a sleight of hand; given a dollar value v and zed mod q, choose-- you can't take a dollar zero value to the bank and split it into a 10 dollar bill and a 0 dollar value. To make this system work, we have to restrict the amounts to relatively small values relative to the group order so that they will behave like positive integers. They won't overflow under any realistic scenario. And to prevent that from happening, we need something called a range proof, which I won't go into into too much detail except to say that it is there.
Mimblewimble transactions
Our pedersen commitments consist of the actual amount and also this other uniformly random blinding factor called "r", which is used to hide the amount. And in order to produce a transaction, you need to know the sum of the blinding factors for the input. And you can maybe see this by trying to think about how to structure this, and also knowing that the range proofs serve double duty in a practical system by also serving as a proof-of-knowledge of the blinding factor.
What this means is that you can't construct a transaction without knowing the secret. Well, in bitcoin you already have this rule that you can't construct transactions without knowing the secret. The rule is that you have to know the secret key associated to the public key that every output is labeled by. Why have these two secrets? Why not just drop the secondary one, the one in what's called the script signature (scriptSig) and just use the pedersen commitment blinding factors as your secret? This almost works but there's a problem.
Mimblewimble: Kernels
https://www.youtube.com/watch?v=ovCBT1gyk9c&t=7m15s
It almost works to just use the pedersen commitment blinding factors as your secret, but there's a little bit too much structure in here, in the sense that in a real transaction your inputs come from somebody and your outputs are presumably going to somebody else. These are two disparate distrusting parties that presumably shouldn't know each other's blinding factors. So if you were to just use the homomorphic property as I have described (sum of inputs is equal to sum of outputs), then you would have this problem which is that the output owner would know the sum of the input blinding factor and vice versa. You're not transferring anything, everyone involved in the transaction would know all of the blinding factors, and therefore they would have all the outputs.
The way that mimblewimble solves this is by adding a special zero-value output called the kernel. The kernel cannot be spent, which means that to the extent that a sum of blinding factors and a kernel blinding factor are known, that's useless to somebody trying to create a transaction, they have to know a sum of actual spendable outputs. And this prevents those kinds of attacks from happening. As a second feature, the kernel acts as a multi-signature key for all of the participants in the transaction. What it is, really, is that it's a sum of all of the blinding factors from each participant and then multiplied by some generator G here. You can think of this in two ways: as a sum of pedersen commitments (zero-valued output) and the other way is as a multi-signature key. These two things are complementary and that's sort of the magic of mimblewimble.
Mimblewimble in pictures
I'm going to switch now to pictures for the remainder of the first half of this talk to give you an idea of what the implications are. So here I have two mimblewimble transactions. They have inputs, which are references to old outputs, and I have outputs, which are new outputs. The rule for these to be valid is that rather than having each input having some sort of signature attached (scriptSig) we simply require that the sum of the inputs minus the sum of the outputs equals the kernel. These things are not attached to the kernel, they are free-floating. None of these thins sign for the other things. If I'm allowed to have a transaction with multiple kernels, I can combine the input sets and the output sets, and I can have one transaction with two kernels which is already sort of a neat feature. In bitcoin, we call this coinjoin when you combine unrelated transactions. You can do this non-interactively in mimblewimble because nothing is signing the complete transactions. But then, notice that the second transaction spends an output from the first transaction. Okay. If my rule is simply that outputs minus inputs equal kernel. I can delete input and output as long as they are the same. So now I have a transaction that is smaller than the original, but still has the same net effect as the original transaction. And critically, because the kernels are still in tact and each kernel is a multi-signature key from the two transactors, this entire aggregate transaction is still authenticated by everyone involved. So somehow you're moving money from A to B to C and the intermediate step is deleted from the perspective of any public verifier but you still retain the no-inflation property (that the total input amount equals the total output amount) and the authorization authentication property which is that anyone involved in any of the series of transactions that lead to this output set signed off on everything that happened, that no fraud is happening.
Here's where the cool stuff comes in. Suppose that we've got an entire blockchain of these mimblewimble transactions. In this picture, every single block is a transaction. The reason is that every transaction in a block can be combined in the way that I just described, so every block just becomes a single transaction with a set of inputs, a set of outputs, a set of kernels. There could be an arbitrary number of kernels. And that's what your blocks are. It's hard to describe these as transactions. They are inputs and outputs, and what you have is a "diff" of the unspent output set.
Now, suppose that I'm a new verifier trying to download the state of the chain. Normally in the bitcoin paradigm I download all the blocks and all the transactions in those blocks and I verify them all and so forth. But in mimblewimble, I can do that, but what I can do is download the transaction from every single block and combine in the same way as before because my validity requirement is just that sum property. And then observe that every single input except for the coinbase inputs which create new coins-- every single input is the output of a previous transaction. This telescoping sum property that I have described can be applied to the entire blockchain and everything can go away. The only thin left is the coinbase inputs (which in practice are usually implicit, like a +50 coins every block kind of thing) and whichever outputs have not been spent which is your final system state (the unspent outputs) and your kernels (every block has a pile of kernels). And this gives you the compression that I alluded to at the beginnin of this talk. It's just this property that my inputs and my outputs cancel out, algebraically.
This is pretty cool, but it makes smart contracts and other things more difficult because when I deleted these outputs, what I'm saying is that any new verifier does not see those outputs have ever existed. In principle they had no idea that they ever existed. And in particular, if htose outputs had weird requirements on them like n-of-m signers need to be signing or a hash preimage needs to be revealed or whatever, there's no way for the public verifiers to verify that those conditions have been met because the data isn't in the blockchain or see that those conditions had ever existed. So it seems like this is really a limited thing. When the paper was published in August 2016, that seemed to be where we would be left with, but in the second part of my talk I will explain that we can actually nonetheless do some cool things even though our verifiers don't see all the data.
Mimblewimble scaling numbers
Let me give some numbers about what we get from this compression technique in mimblewimble. In bitcoin, we have roughly 150 million transactions, 400 million outputs, 65 million UTXOs. If you download the bitcoin blockchain today and you maintain a full transaction index, that's about 180 GB of space on your disk. If you were to use confidential transactions with pedersen commitments rather than explicit values this would more or less triple because the pedersen commitments need these rangeproofs attached to them. With mimblewimble, the intermediate outputs go away and their rangeproofs too. So we would have only 18 GB of transaction kernels, one per transaction, 2 GB of final output unspents, just curvepoints that are 32 bytes each, and 45 GB of those rangeproofs. Despite adding confidential transactions ,you have a total system that is much smaller than bitcoin. Thanks to Benedikt Bunz and Dan Boneh at Stanford and someone at UCLA, we have a new efficient rangeproof that came out in the lats month called bulletproofs (see also or here). And the result of that is that the mimblewimble system with bulletproofs is much smaller than bitcoin without any rangeproofs at all, which is a really exciting development.
Scriptless scripts
Okay, so let's move on to the next part which is that, this is nice but this if this is only about unconditional coin movement then this is not particularly exciting. However, scriptless scripts, which is something that has been developed throughout 2017 and continues, a very broad research project, is a way to use these kernels and kernel signatures to attach conditions to them without modifying the system so that the verifiers need to understand new rules. What I mean by this is that a set of parties can decide on some sort of contract or protocol that they want to execute, and as a result of faithful execution they will produce a valid signature and the blockchain and its verifiers can validate that the signature is valid. The blockchain does not need to know any of the details of the original transaction.
Historically this came from mimblewimble. We wanted to be able to do cool stuff with mimblewimble, and we couldn't. But in fact, any system that supports or Schnorr signatures or some signature scheme that is linear in the data can do this scriptless script technique.
There are many reasons to do this. I described the motivation for mimblewimble, but even for bitcoin and friends this is exciting. The reason being that, the existing paradigm for doing this is to use a script-signature system where you lay out the rules and then you lay out an explicit witness that those rules were followed. And then everyone downloads all of this and then verifies the witnesses for every single transaction, which prevents the mimblewimble-style compression that we talked about, but it also means that anybody who wants to do cool stuff can only do it within the framework that the entire system is aware of, namely the consensus system where everyone agrees on whether rules were followed or implemented at all. It's very difficult to extend this to do things that require primitives that do not or do not yet exist in the consensus system. As a secondary feature, we care about privacy and fungibility for public cryptocurrencies which have every transaction published and downloadable by everyone, and this is very bad for privacy and for commercial confidentiality especially if your amounts are on the blockchain. This makes it difficult to do business, when you're revealing all of your financial transactions, and this compromises any sort of real commercial use of this system. Using scriptless scripts, we avoid revealing contracts because all that hits the chain are public keys and signatures.
Schnorr signatures support scriptless scripts
Let me get into some algebra to explain how this works. Probably most people here are familiar with Schnorr signatures or they at least saw them in school or something. Let me briefly overview how Schnorr multi-signatures work where you have multiple parties and you want to create a signature that every participant needs to contribute to produce the signatures. They all have their separate public keys. They sum these up to get a joint public key P and they want to produce a signature which validates with the key P such that all of them together would need to produce this. So they do the standard Schnorr signature thing which is that they think of a nonce R which is actually k * G and you produce the signature s = k + ex where k is your secret nonce and e is the hash of the data going into the signature. To do a multisignature you just sum everythin so everyone chooses their own R = k * G. Everybody passes around their different R values they pass them around to get a joint R value. Using the joint R value everyone computes their joint hash challenge and then they do the same thing except e is now a hash of their joint public key and something else. Your nonces are the sum of everyone's contributed nonces, and the signature is the sum of everyone's contributed signature values. Very easy. In practice, I should warn people that there are things called key cancelation attacks where people can choose their keys and nonces in adversarial ways and you need to be careful about it. But it's not an impossible problem and it wont derail anything that I'm talking about here.
Kind of a philosophical point is that these multisignatures are already kind of a scriptless script in the sense that you have a bunch of people who have all of their own independent public keys and they add them together to get a joint key. They have a joint key and joint signatures. Public verifiers that weren't party to that, won't know how many people were involved, or that there were more than one person involved. They certainly don't know the original values. You can generalize this. If you look in the literature you'll find threshold signatures, a generalization of this to m-of-n signatures using linear secret sharing and this nice property that because a multisignature came from adding up everyone's nonces and signature values, then if you put a linear secret sharing scheme on there then you can basically do the same thing where you're contributing shares of signatures instead of entire signatures and it all just sort of works- magically.
Adaptor signatures
Alright, so, I'm going to modify this multisignature protocol a little bit to produce an "adaptor signature" which will be a building block for all of the scriptless scripts stuff that I've been working on.
What we do here is that one pary, let's just do the 2-party case, it's much simpler for 2-parties. Somebody rather than giving their nonce R in the multisignature protocol instead they think of this blinding factor T and they send R + T instead. They do everything else the same way. They are also going to send T. This is kind of a weird thing to do because I'm not trying to hide the nonce or blinding it, all I'm doingg is offsetting it with some value that I know. So we do all of this, we follow the protocol in the same way that I described it. You will get a signature that is almost valid- it's valid if you offset it by some secret value little t that only one party knows. And it's easy for everyone involved in this to verify that this is happening correctly. So what you have is basically an adaptor signature where knowledge of-- if you know the adaptor signature, the knowledge of a valid signature with the same nonce (enforced by the hash being the same), is equivalent to knowledge of the value of little t here. This is a building block.
Features of adaptor signatures
We can do a lot of cool stuff with this. It's really general. When I said big T I just threw a point over the wall and that will work and will get you this system. But if you attach any other auxiliary proofs to T, or if you're doing some other complex multi-party protocol in parallel and this big T is something important to the protocol, then what you got now is an adpator signature that will let you translate correct movement of the auxiliary protocol into a valid signature. In particular, if you're doing blockchain stuff, then you can create transactions that can only be completed by correct execution of this auxiliary protocol. And this is extremely cheap to do.
That's really exciting. It means that some semi-honest systems where there are incentive problems and you have to add a lot of auxiliary proofs, maybe you can just stick a monetary incentive for people to keep working, like say I'm doing a multi-party protocol with someone and I put up coins that they can take and I first put them into a 2-of-2 multisig output with myself and my counterparty, I do the adaptor signature protocol such that the only way they can complete the protocol is by revealing this secret value which I then need for other reasons.
What I really want to emphasize to this audience is that this is a very general framework for attaching valid signatures to arbitrary discrete log based protocols or any protocols where your primitives are linear. There's a lot of open surface area for exploring this concept.
Example: Atomic cross-chain swaps
Let me give one example of this. And then I'll close for questions. Which is the atomic swap. This is your standard proto-typical smart contract that you think about when you're doing blockchain stuff. You have Alice and Bob, and Alice has A-coins on the A-coin blockchain and Bob has B-coins on the B coin blockchain and they want to exchange these. These two blockchains don't know about each other and they can't verify each other's transactions. But somehow they want to make these two transactions happen atomically on both chains: either they both go through or neither of them do, so that there's no risk that one counterparty steals the coins and the other is left trying to be honest.
There's a classical way to do this in blockchains where you use the blockchain script system to put a hash preimage challenge and you have to reveal the same preimage on both sides. Alice knows the preimage, and reveals it on her side, and then Bob copies it on to the other chain and takes his coins. This is how the atomicity works. Here, using adaptor signatures we can do something simpler where both Alice and Bob put up their coins into 2-of-2 multisig outputs on both blockchains and then Bob gives Alice adaptor signatures using each side using the same T value which means that for Bob to take his coins he needs to reveal T and for Alice to take her coins she will need to reveal T and Bob actually knows T. So they do this setup and then there's some ordering constraints which I'm not going to go into; but they do the whole setup, and in the end nobody has an opportunity to take their coins without everything being setup such that when Bob takes his coins he publishes little t and Alice can compute little t from the final signature that hits the blockchain and uses that to make another signature that gives her her own coins and you get atomicity this way. You still have this exchange-of-information but you don't have explicit hashes or preimages on the blockchain. Nobody can even see that this was happening; and the system is much smaller, and in a public validation sense, all that anyone would need to verify is that the signatures are valid. So that's pretty cool. As a bonus you get this privacy and stuff.
Open problems
I have a list of open problems. The main one that I'm thinking about most this morning is quantum resistance. Everything I have described uses these really simple linear properties of digital signatures. I think a lot of post-quantum schemes out there like NTRU and friends are using lattices which are also really linear objects. Maybe there's a way to do this in a quantum-hard setting which would be really exciting because as much as I like cyclic groups and discrete logarithms unfortunately it seems that their days may be numbered and I'll have to start doing some real math.
That's all I have. Thank you.
Q&A
https://www.youtube.com/watch?v=ovCBT1gyk9c&t=27m47s
Q: I have a question. The blinding factor R becomes my private key?
A: Yep.
Q: How do I prevent other people from coming up with the same R value?
A: R is a uniformly random point in the scalar group. It's a 256-bit number. Assuming that the discrete logarithm problem is hard, then nobody can compute R, for the same reason that nobody can compute a secret key from a public key. This is exactly the same security model here.
Q: Also I was thinking about, because of pedersen commitments, if I know the discrete log between the group generator like G and H,...
A: That's a really good question.
Q: If I know G is xH....
A: Something I skipped over, if someone knows the discrete log between G and H, then it's not a pedersen commitment but a chameleon hash. We need G and H to be uniformly random points. So usually you have a hash to point function to get uniformly random points that come out of a random oracle. And then assuming the discrete logarithm problem is hard and assuming that you actually have a random oracle, then nobody knows this discrete log and nobody can compromise these pedersen commitments.