The "crypto" in cryptocurrency: Why everything is weird and hard

Andrew Poelstra

I do not have any slides for you. This is my attempt to keep this non-technical, by depriving myself of any ability to write equations on screen. The topic of my talk is why everything is weird and hard in cryptocurrency. What is cryptography as a mathematics and field of science? And how does this work in practice?

# Encryption

Historically the purvue of cryptography is encryption, like trying to come up with something random looking. Security here is straightforward: if you have a key then you can decrypt something, and the idea of having a security model or threat model to describe that didn't really make sense or it did but it would be onery. In modern day, cryptography has come a long way, including zero-knowledge proofs which have computer programs that demonstrate that they do nothing but make statements about the inputs without leaking details about the input. Digital signatures and encryption are related. Signatures are like encryption where you have two keys, a public key and a private key. Anyone with the public key can encrypt data, and anyone with the decryption key or private key can decrypt the data. This can also mean encrypted data can be sent over an insecure channel.

# Digital signatures

A digital signature is the opposite-- only the person with the private key can produce a signature, and only someone with the public key can verify the signature. When we get into schemes like this, if you were trying to argue the security of this scheme, it's much harder to do and harder to define. I am going to talk about digital signatures in an academic setting, and then I want to go into practical issues about random number generation and then if somehow I have time then I will talk about how this generalizes to a multisig setting.

The idea is that nobody who doesn't have the secret key can forge a signature for a message and a public key. Academically formalizing this is actually pretty hard. By formalizing I mean making a well-defined statement. After many years, depending on how you measure it, many iterations of trying to figure out these definitions-- we landed on something like this: a digital signature scheme is secure if there does not exist any polynomial probabilistic time algorithm that can win the following game. We're going to categorically exclude the possibility that any algorithm exists at all (as long as it's reasonably bounded) that categorically cannot be forged. Even that, while defined, it's hard to prove the security of that. In real life, you have somebody who knows a key and that somebody should be able to produce signatures. So how can you categorically exclude everything but still have that special case?

So let's say, the adversary... let's make him stronger. The adversary is allowed to ask the challenger for a message, and the challenger has to write a signature on that message. The adversary now not only has to produce a forgery, but it is allowed to do so in a setting where we're giving it everything we possibly can. We want to give it as much information as we can get away with. So surely signatures on arbitrary messages are enough. When the adversary produces a forgery, it has to be on a new message. Given a public key, .. it has to produce a signature on a new message. Is this secure? This is actually secure under a formal notion of security, called existential unforgeability under a chosen-message attack. But in more complicated systems, this is insecure for a few reasons. One reason we found in bitcoin is that by allowing the adversary only provide a signature on a new message, we're exlcuding the possibility that an adversary might take an existing signature and tweak it while keeping its validity. This isn't captured in the model. Once a message is signed, who care if the adversary can produce a different signature for the same message?

Well, in bitcoin we were using ECDSA and txids were based on these signatures. So the attacker had an ability to change a signature, allowed them to change the txid of the transaction which would then keep the transaction valid but invalidate any transactions referring back to that txid because the txid was malleated. So we need a stronger model. We need an adversary now where we say, if it provides a signature on some message, it's allowed to be the same message, but it must always be the same signature. This is what is called "strong signature" in the literature. Is this secure? No. Suppose you have a signature scheme and your attacker-- say you are using Schnorr signatures, which are in particular vulnerable to this, such as the original 1989 Schnorr algorithm... Here, the attacker takes one of your signatures, and rather than tweaking the signature, he is going to tweak the signature in a way that he retains validity on the message but changes the public key. He is going to take a signature that validates against one public key, and makes a signature that is related to a public key related to the first public key. Using bitcoin, you make keys that are algebraically related in some way, such as bip32 hierarchical deterministically generated keys. Theoretically someone could produce a signature, someone could take that signed transaction, and create a signature on a different transaction, it's an algebraic relation attack on one of your keys for a different one of your keys. In practice, this is not actually a concern in bitcoin because bitcoin as part of its design in the data that gets signed in a bitcoin transaction you include not only all details about the transactions but the previous transactions including the public key. So the result is you have something that looks like a Schnorr signature except the public key is.... but it turns out this is secure in an even stronger sense than a "strong signature", it's a zero-knowledge signature of knowledge on the message being signed and knowledge of the same key being signed. That's much more complex than I want to get into here.

At each step, I was strengthening the idea of security, and categorically eliminating possibilities of forgeries. But somehow this categoric elimination doesn't work because my categories aren't even quite big enough. Even things like producing a signature when a signature hasn't been seen before. It's taken a while to explore this design space and look at forgeries and see what they look like in practice. For signatures, I was able to give a security game described on stage without using diagrams and arrows everywhere. Same thing gets more complex with multisignatures. When you're talking about zero-knowledge proofs, where you have a zero-knowledge proof and a simulator and all the different rules and trying to demonstrate indistinguishability of these complicated systems that are deployed from the more ideal model and so on until you get something that you can actualy say something concrete about. But nevertheless, in the real life, there is a tremendous amount of value from having provable security.

# Provable security

In my remaining time, I want to defend this concept of provable security, even though I told you it's hard. One thing I want to talk about is this notion of random number generation. I am going to switch topics from defining security to actually deploying secure systems in practice.

For ECDSA or Schnorr, producing these signatures requires the generation of uniformly random data. Uniformly random means that out of all the possible random number you can choose, which in practice is between 0 and some fixed large prime number. If you don't-- if you fail to produce random numbers, and you reuse the same number or the same nonce in multiple signatures, then your coin gets stolen. This has happened a few times, famously, like someone hacked the ps3 using this glitch. It happened in bitcoin in 2012 where some Android wallets reused nonces and had a bad random number generator. A lot of people lost their keys and their coins.

More recently, and I will be quick about this- it wasn't that the nonce was reused. You had random-looking nonces but there was some software out there that was generating nonces that were predictably biased in some way, like the first few bits were consistently 0 much more often than they should be. Even this deviation from random, given enough signatures, is enough to leak your private key. So we have these signature schemes that are proven secure in academic models, but they always require uniform randomness. We have this intuition that random can't be guessed, but here random means indistinguishable from uniform random. Just being able to distinguish it from uniform is enough to break the academic proof, and in practice you will lose your keys and coins.

# Summary

I am out of time. This stuff is difficult and subtle and if you're excited about all these new projects claiming magical things, then you should be skeptical. Bitcoin already moves really fast. We need to slow down and be careful.