The quest for practical threshold Schnorr signatures

Tim Ruffing (@real_or_random)

# Introduction

It's great to be here. My name is Tim Ruffing and I work for the research team at Blockstream. This is a talk about practical threshold Schnorr signatures and our quest for constructing them.

# Disclaimer

First of all, this is work in progress. It's pretty early work in progress, mostly based on discussions with people at Blockstream. I describe the problem we want to solve, and we have some initial ideas, but they are very initial and I have tried to put one or two sentences about those ideas into the slides but nothing is really done yet. If you think what I say doesn't make sense, or the problem is already solved and you have looked in the literature already then please interrupt and tell me.

# Threshold signatures

What are threshold signatures? Say you have a group of n peers, and the idea is that if you have any subset of size k in that set, then they should be able to produce a valid signature and if you have fewer peers then you shouldn't be able to make a signature. So there's a few properties here, like unforgeability where a smaller set cannot produce a valid signature, whereas you also have robustness where k honest peers should be able to produce a valid signatures even if the other peers are trying to stop them. n-of-n is basically just multisignature, and that's a special case.

# Why threshold or multi signatures?

When you think about smart contracts, there's always a case where some funds are controlled by a set of peers. Payment channels are typically made between two parties. The funds are locked up and you need agreement from both parties to spend them, so this is a 2-of-2 scenario. Another scenario might be two-factor authentication which is 2-of-2. Bitfinex's cold wallet is 3-of-6 threshold multisig for their cold wallet to enhance security. If you look at our own products at Blockstream, we have an 11-of-15 Liquid watchmen functionaries. To send bitcoin into the Liquid sidechain, you pay a 11-of-15 multisig federation.

# Goals

The goal in this talk is to obtain threshold signatures that look like ordinary signatures. There are some reasons for this. First, they are much more efficient than the current multisig we have in bitcoin. Second, they also offer privacy advantages.

# Taproot

https://diyhpl.us/wiki/transcripts/bitcoinops/schnorr-taproot-workshop-2019/notes/

Hopefully taproot will get deployed in bitcoin at some point. The basic idea is that your UTXO on the bitcoin blockchain reduces to a single public key and this public key has a secret key in it, but it also has a commitment to a script. This enables you to spend it in two ways. Either you can use key-path spending where you just sign by producing a signature, and nobody will ever notice that there was an extra script in there. It's great for privacy and it looks like an ordinary signature. If you really want to use the script and you don't want to use key-path spending, then you can do script-path spending by revealing g^{x} and then the script and then you can fulfill the script and whatever it requires you to do.

The typical case in most smart contracts is that all parties agree. There are lots of rules but they are mostly for cases where parties go malicious or want to disrupt, but usually you just get agreement from all parties so you might as well use key-path spending using a threshold signature. Nobody will ever notice that there was a script involved or smart contract.

# Schnorr signatures

The signature scheme for ordinary signatures that we want here is Schnorr signatures. Maybe you have seen this before maybe not. The secret key is a simple scalar, an elliptic curve group scalar. The public key is g^{x}. So this looks like ECDSA so far, so you don't have to change key generation. For signing, you take a random nonce r and you compute a public nonce g^{r} and then you compute a challenge by hashing the public key and the nonce and the message, and then you compute this s value which is the second part of the signature where you multiply the challenge with your secret key and add the nonces. This is constructed from Fiat-Shamir protocol... so the signature is the public nonce and the s value. Verification is very simple; you re-compute this hash and you check the main equation and the exponent which you can do.

There's a draft called bip-schnorr, and then there's bip-taproot and bip-tapscript. Have a look, it's on github. They are written by Pieter Wuille and myself and many others. We have a full specification, we have reference code, design rationale, and so on. Everything you would expect to see.

If you're interested, we need more eyeballs to look at this. Maybe go to this short url https://bit.do/schnorr - please have a look.

# Naive multisignatures

You could add up all the x values, and then add up all the r values. You could multiply the group points and get the sum of the exponents since everything is homomorphic here. Anyway, this is naive and it's not secure. There's multiple problems with it, and all the problems are solvable. The MuSig scheme fixes a lot of this, it was published in 2018 and concurrently there's the MSDL-pop scheme by Boneh, Drijvers, Neven in 2018 which solve the same problems.

# From multisignatures to threshold signatures

The simple way to depict this idea is to use secret sharing. They might want to use Shamir secret sharing. I won't explain how it works, because it's not important. But it gives you something like 2-of-3 multisig. So the party with a secret can make a 2-of-3 threshold secret sharing scheme. They create the shares, then send it to some parties, or maybe even himself to simplify the diagram. Since this is 2-of-3, even if one of those parties on the left or right hand side becomes unresponsive, two of the parties can reconstruct this secret value. It's not only that they can reconstruct it, they can actually do computations with it. This is how we basically use it within Schnorr signatures. This is only one of the steps to get it.

# Distributed key generation

.... So far I have described this for the secret key, and you can get a working scheme from that. We haven't done a security proof for this yet but it looks reasonable. If you do this only for the secret key for distributed key generation, then you probably get a protocol that doesn't have a constant number of rounds. So you get O(f) rounds where f is the number of disruptive parties. With 3-of-3, you don't need threshold signature scheme, you can just use a multisignature, so we're doing 2-of-3 threshold multisig here. You have to commit to the set of signers upfront, and maybe later you choose a signer that wants to go offline and then you have to restart. That's a detail that I don't have time to explain now.

The solution to that is to do distributed key generation also for the nonce. So we do the same protocol for the distributed key generation, and we have to do this whenever we have a signing session. This means that signing is an interactive process. If we do that, we can do constant O(1) rounds.

# History of DKG for discrete log

This was originally proposed by Pedersen 1991 where he used a DKG scheme for dlog, based on Feldman's verifiable secret sharing. Then Gennaro, Jarecki, Krawcyz, Rabin in 1999 said the attacker can bias the key. They proposed a better DKG scheme using Pedersen's VSS. Pedersen's scheme uses Feldman's VSS, and then they fixed it by using Pedersen's VSS. Whatever. It's confusing. Even better, three years later, they figured out the 1991 scheme is actually good enough for Schnorr threshold signatures. The attacker can bias the public key or the other one, then for Schnorr threshold signatures that doesn't matter for them. That was found in GJKR 2002.

# Trust assumption

If you work on threshold crypto, you might ask what's the point and what are you trying to do. I want to show you why those two schemes fail in practice if you really want to apply them. There are two main issues. The first issue is that there's a trust assumption. All the schemes assume that k-1 is less than half of all the peers. This is an honest majority assumption, basically. So k-1 must be < n/2. Let me explain to you why they need this assumption. I'll explain this by counterexample. Say we have a scheme without an honest majority, say 6-of-9. For unforgeability, this means that 5 malicious peers shouldn't be able to produce a signature. But robustness would say that 6 honest peers can produce a valid signature. So you can't achieve both unforgeability and robustness at the same time. But I think that's not true, it's the worst case assumption. The worst case is that there's only four good peers. But if there's only three bad participants, .... if I choose k=5 or something, it doesn't mean that 5 parties are malicious, it just means that's the maximum it can tolerate. Even if 5 people are malicious in the first scenario, then maybe I can't achieve a valid signature and this might be acceptable but what shouldn't happen is that those 5 parties can forge. I think this setting still makes sense. If you saw the talk yesterday, she talked about flexible BFT and the alive-but-corrupt nodes. I think that's exactly the setting we have here.

Can we drop the assumption where we have the honest majority assumption? Unfortunately, no. The security proof relies on the fact that the majority of peers is honest. If you're a cryptographer, the idea is that the security reduction runs all the honest peers internally and it because it has a honest majority gets the shares from-- and it can use those shares to extract all the secrets from the attacker. This is how the induction works, and you can't make it work without changing the scheme in this strong setting.

We have one idea for how to fix this, and this would be to use other commitments in Pedersen's verifiable secret sharing scheme which could enable some ability to extract some things even without having an honest majority. But it's a very early idea and we haven't deeply looked into it. Maybe it's a secure scheme, maybe not. We need to see.

# Broadcast assumption

Another issue is that the schemes assume a broadcast channel. They assume that every peer has a reliable and secure way of broadcasting messages to other peers. Reasonable assumption for robustness, but it's unreasonable assumption for unforgeability. If the network is broken, hten it shouldn't be the case that suddenly people can forge signatures. Actually, if you look at those schemes, you can see what happens. The attacker can split the view of his peers by splitting it into two worlds, where in each world he makes different claims about what peers are online or offline. What these protocols do is they reconstruct the secret nonces of the other parties.

There is also an attack on unforgeability. Say the malicious broadcast channel elarns the nonce r, and the signature (R, s). The combined secret key is x = (something - something)/c.

Malicious and offline are just two different things. If a peer is offline, you can't assume htey are malicious. This is like theory vs practice basically. Just because a peer appears offline, doesn't mean you should reconstruct his secrets in public. The simple idea to fix this is that instead of reconstructing the nonce, you could just reconstruct the partial signature that this peer is supposed to give. This looks like an easy fix and maybe it works, but maybe it doesn't work- we haven't tried to make a security proof yet but it looks okay.

# Wish list

This is my wish list for threshold signatures:

Should be a scheme that produces ordinary Schnorr signatures

No restrictions on the parameter k in k-of-n

It should be unforgeable even if the broadcast mechanism is malicious

It hsould be robust in a constant number of rounds, O(1)

We want reasonable message complexity

It should be secure in parallel sessions, which is actually a problem in the contetx of multisig even though I didn't mention it.

# Bonus list

It would also be nice to have asynchrony, deterministic nonces, look at the setup algoirthm, adaptive security, and accountability for threshold signatures.

Deterministic nonces are interesting. In ECDSA, you choose your nonce by applying your hash function to the secret key and message to make sure you cna't screw up the randomness. If you do it deterministically, you leak your private key. On another project, we're trying to solve this with a zero-knowledge proof and I'm not sure it's a practical solution but it might work.

We're also interested in looking at the setup algorithm. We've been looking mostly at signing, but for setup do we want asychrony? Maybe it can be more complex since it is run only once. We need to look at these properties.

Adaptive security means that corruptive nodes aren't fixed at the beginning.

Another interesting property is accountability. If you look at the threshold signature, then you should be able to tell which peers have signed and hwich haven't. But this depends on your setting as to whether you want this. An easy argument is that if you want your signatures to look like an ordinary Schnorr signature then you can't use this property.