Bryan Bishop

Handling Reorgs & Forks

https://twitter.com/kanzure/status/1052344700554960896

Dev++ / BC2 - October 4th-5th 2018 - Keio University, Tokyo, Japan https://bitcoinedge.org/event/keio-devplusplus-2018

schedule: https://keio-devplusplus-2018.bitcoinedge.org/#schedule

video: https://youtu.be/EUUQbveGF5E?t=7

Introduction

Good morning, my name is Bryan. I'm going to be talking about reorganizations and forks in the bitcoin blockchain. First, I want to define what a reorganization is and what a fork is.

Definitions

You might have heard that the bitcoin blockchain is a permanent data structure that records everything from all time and that is actually somewhat false. Instead, actually the bitcoin blockchain can be modified and changed, and some of these changes are called reorgs and some of these changes are called forks. In general if a change has something do with a change in the rules in the bitcoin blockchain it is called a fork. And then a reorganization, usually does not require anything to do with the rule change. Some reorganizations are actually completely valid under the bitcoin blockchain rules.

Confirmations

Before I get into that further though, I'm going to talk about some motivation for why this is interesting at all in the case of bitcoin. And that is because bitcoin transactions actually require something called a confirmation, which is basically the idea that bitcoin transactions get buried in the blockchain and as they get buried further by further work and further blocks being built on top of them these transactions are considered confirmed. So early on and a few years ago for bitcoin it was widely believed incorrectly that, well it was believed by other people, not really by developers that you can actually have zero confirmation transactions- that is actually false. You need to wait for transactions to confirm. And there is a special case, in the case of lightning network, and other similar technologies, where you can actually have zero confirmation transactions, but that will be explored later today by probably Tadge.

So anyway, to determine how long to wait for these confirmations there are some calculations that you can do that I will be describing later in this presentation This is from the bitcoin white paper section 11, I'm actually not going to recite it from memory (the note on the slide says recite from memory). The popular belief or the popular recommendation is to wait for 6 confirmations, and then after that point you consider the transaction to be completely confirmed and irrevocable. However, that is not really true either, but there does need to be a point where the community in general says if there is a problem in the blockchain that goes beyond six blocks that are removed from the blockchain then we consider that a catastrophe of some kind.

So why is this interesting? Well the reason why there is a confirmation at all is because first of all if a transaction is not confirmed, then theoretically it can be spent any number of times and that degrades the property of the currency. One of the properties of money is if you spend, it it stays spent, you can't just redo it and there needs to be a record of these transactions. If you do not have the confirmations then the transaction hasn't actually been spent and the receiver has not received their money. In certain situations, which I will explore in a moment there are something called reorganizations, which is a change in the tip or first few blocks in the blockchain. And in these situations where these replacement blocks come into existence and in certain cases transactions that were in the previous set of blocks are not going to be included in the new set of blocks. I will be showing that.

This is section 11 of the bitcoin white paper, using a poisson process you can calculate, or he calls it a gambler's rune problem. And basically it is just the probability of an attacker being able to win out against the bitcoin network by devoting some amount of energy and this probabilistic calculation to determine the rate of success of an attack against the bitcoin network. Which is that you are racing all the other miners to create your own blocks that have a special set of transactions, perhaps, including or excluding one that the miners are not including or do not know about or something like that. Maybe you want to censor a transaction, maybe you want to do a double spending attack where you previously sent money and a merchant gave you a digital coupon download kit code and know you want your money back. Things like that.

This is also section 11. I'm not going to go through the math, but there is an implementation of it and you can go and type in a form online and there are fields. If your attacker has 40% of the hashrate and you are waiting six confirmations you can calculate the attack success probability, in this case 50% of you winning and being able to get a different block into the blockchain than the one that was previously confirmed, or partially confirmed.

https://people.xiph.org/~greg/attack_success.html

Reorgs

So, in the event that there is a change in the tip of the blockchain, that can be considered a reorganization. And there is a case where it is benign, somewhat benign, such as when you have an orphan block where a miner mines a block, but then a moment later another miner found a block at the same height and thus your block is not going to be included in the blockchain for whatever reason, perhaps the other miner publishes first or had a faster connection or had better connectivity to the network and that used to be somewhat frequent situation, but it is less frequent today in bitcoin in part due to network improvements from things like BlueMatts work and other things.

This diagram is a chain of blocks, ignore the coloring here, other than the black. The black is indicative of a single chain and then these other colors you can ignore. Essentially, it is quite simple, at this point in time when you get this next block, if the blockchain you had was up to this point, and later you get three blocks here that have more work and are a better fit according to the rules of the bitcoin blockchain, if you are on this node now, this is considered to have been a reorganization event switching to these blocks.

So, again like I said, for a reorganization, it is when the tip of the blockchains are replaced with different blocks. One way to think about this is a rollback, like rolling back version history is very similar. The most common type of reorg is what I would call a shallow reorg. That is not standard terminology, that is just what I call it. The one to be really concerned about is deep reorgs, these can be pretty deep, deeper than you think. There is a chance of these happening and you just need to be careful and I will explain the implications of those.

One way to think of this is, a reorg in the case of an orphan block, is generally not considered adversarial because that is something that occurs due to the network topology of the bitcoin network or other things people really didn't intend. But technically an adversarial entity or attacker could purchase or rent bitcoin hashrate and create blocks of the structure they prefer and this has a monetary cost that you can go and calculate and it is really not as high as you might think. So if you are running a centralized service where you are accepting deposits and you have withdrawals, as an example, miners or even non-miners can rent mining hashrate and do certain attacks.

Q: Last month this actually happened.

A: There was an exchange that had an attack against it. So this is common, it does happen, there are recent cases we can discuss. There are certain things you can do to help mitigate against this if you are running a service or writing a bitcoin related protocol, which I will explain in a bit.

Reorg handling

If you are writing a wallet or a node or some centralized service or even a client for a bitcoin related protocol you have to be able to handle these reorgs that occur. The easiest way to do this is to assume that reorgs do not exist past a certain confirmation depth and you can just say after 200 blocks I'm going to assume there will never be a reorg that affects this. The problem with this though is users will be really upset with you because you are forcing them to wait 200 blocks. So you generally have to implement things to handle reorgs in a more timely way so you can have user wait less than forever. So to do this you want to get a list of all the blocks where there is a difference between the local copy and a bitcoin blockchain current best blockchain. And you want to get those lists of blocks and then within those blocks you want to determine if there are any transactions in them that had been removed that were related to your protocol or your node or user or whatever it is. And then you want to look at the new blocks and determine whether any of those transactions have been removed, are they missing, are the mutated in some way. Are the changed for example by mutation, I mean txid malleability is one example. But another example that includes a txid malleability, like if someone did fee bumping, replace by fee, where they added an input, technically that is a different transaction even though it might still have the same general structure. The transaction will include other inputs, it might include other outputs. All sorts of stuff, so you need to be able to handle those cases where in the event of a reorg the transactions look different. So you have to go and update what you previously thought was a confirmed transaction, is now completely different and you have to update your internal database and all sorts of other stuff. One way to think of this, it is very similar to sinking a node on the blockchain, and if your node is offline for a long time then you have this problem where you have a bitcoin blockchain that had a block from 6 months ago and now you have to do 6 months of catch up.

Most recent common block

How does your node know what to download. I call this, find the most recent common block. And what you can do, I call it a multipoint binary search, I don't know if that is what it should actually be called. But you could in theory just do a binary search and you can ask over RPC or the peer to peer protocol, just keep getting a block and keep going back in the block height integer until you get to a point where you have the same block hash. The reason why I do multipoint, is it is a bit faster, if you can sample multiple points at the same time then you can just get there a bit faster with less requests. What you want to avoid doing is iteratively pulling down each and every block until you find the most recent common block, instead you just want to do the block hashes really fast. Unfortunately, all the software that I have seen is very slow for getting the list of all the block hashes. So that is why you want to do the sampling approach.

Testing

If you are writing an application what do you do to ensure you have done reorg handling correctly. Obviously the answer is testing. And there are different levels of testing you can do. One is, and I suspect people are not doing this, you can do unit testing without using bitcoin really by which I mean you can just have your maps or dictionaries have a list of blocks and you map to a list of transactions and you say well that is my simulation of a blockchain and then you can test your code against that. Perhaps one level more complex than that is to have a bitcoin library in the programming language of your choice, maybe with mocks for RPC and have a mocked implementation of a bitcoin node. Then you can do real testing against a bitcoin node using regtest and then if you are more daring you can use testnet which is a bit more volatile and won't behave in the way you hope it will, but you should do it anyway. Ideally, especially if you have a company running a centralized service, I have always encouraged customers to test against testnet with the company's integration. If I had it my way I would actually insist that every customer should always do a test deposit on testnet before they are allowed to deposit for real on bitcoin mainnet. Because I do not want to be responsible if they do a real deposit on mainnet without testing first, and for whatever reason their wallet is broken or they are doing something stupid. Anyway, this was supposed to be about reorgs.

Reorg withdrawal attacks against exchanges

How do you mitigate some of the problems that occur in reorgs. In the event, like the example given earlier, where an exchange that had I guess a 51% attack against one of the altcoins. The exchange listed an altcoin and might not even be fair to call it a 51% attack, it was just some sort of hashrate attack that caused a reorg. And the exchange had credited a user on the exchange for a deposit and then they were able to make a withdrawal in another currency because they traded in the exchange and that is obviously bad, because due to the reorg fee the adversary was able to get back their original coins that they deposited, so it is double spending. They got their original coins back, but also they got a withdrawal in another currency. That same attack can also happen on the same cryptocurrency. So you do that attack and the exchange gives you a withdrawal back in the original same cryptocurrency. But they don't give you a withdrawal for the coins you deposited because perhaps your wallet is broken, so instead they give you, by broken I mean poorly designed. Instead they give you a cryptocurrency withdrawal in the same cryptocurrency but different coins. And that is really broken because now you have the original coins that you reorg'd out and also a withdrawal transaction that is still valid after the reorg for the same cryptocurrency. So now you have double your money.

Tainting

How do you prevent this. The way I would describe it is that you need to have something I call taint. Which is that you want to taint all of the withdrawals with all of the recent deposits. And the reason why this works is if you taint all the withdrawals with all the recent deposits in the event that the deposits disappear in the event of a reorganization or some sort of hashrate attack then the withdrawals are automatically invalidated because the deposits no longer existed. So the bitcoin transaction that gave the withdrawal, you have to taint with the deposit. The deposits don't exist, the transaction is no longer valid and they won't be able to steal your money. Any questions about that, that one might be a bit weird.

Q: Tainting withdrawals with deposits might be difficult if your deposit wallets are cold. So I'm wondering if there is any way you can think of to taint withdrawals with deposits on a real time basis when you need to get 5 guys into a bunker to sign whatever it is that they need to sign.

A: Unfortunately hot wallets have these problems in general. If your requirement is that you have timely access to your bitcoin, you can sign immediately or instantly or something, then you are going to have a lot of security problems in general. I don't know a good solution for that.

Forks: Soft-forks and hard-forks

Moving on from reorganizations to forks, the other part of this talk. Forks are similar to reorgs I guess but in general forks are something that occur in the bitcoin blockchain when there are different rules at play where you have different software that has been deployed or different rules have been implemented. In general a soft fork is where you have a (actually it is kind of disputed) backwards or forward compatible change to the consensus rules meaning that the new rules are actually a restriction of the previous rules. So by all the nodes that were previously deployed in the network, the new rules and the new data and the new activity or behavior is still valid according to the old rules. But the new rules that are added under a soft fork restrict that behavior to a subset of the previously allowed behavior. And this is useful for things like restricting the set of possibilities that you can do and enforcing some new rules you all agree to. The other fork type is a hard fork. This is actually an oversimplification, but hard forks are generally incompatible changes that nodes would reject because they are incompatible with the rules that were previously deployed on the nodes that were operating. Both have reorganization considerations. Something to consider, if you are deploying new software with new rules to the bitcoin network there are situations where not all of the nodes have upgraded. And furthermore not all of the miners have upgraded and you can get into these successive rollbacks that keep occurring multiple times because some nodes are upgraded and some aren't, and they keep bouncing around data to each other. This is something you have to be aware of and when you are designing forks of any kind you have to keep that in mind. For example, a fork event in, if you have poorly defined your soft fork activation trigger then it could occur multiple times and ideally you want an activation or upgrade to only happen once. And then furthermore if you have poorly designed your upgrade or the features or you haven't considered all the consensus related changes, there might be more chain forks than you expected. Hopefully we will have a talk about that later today. The reason that occurs, when we talk about the bitcoin blockchain and consensus code there is a section of code in the bitcoin client that has to be very carefully handled because if there are discrepancies between two nodes and how they interpret this consensus critical code then they will not agree on valid blocks in the blockchain. Then they will start rejecting blocks and everyone will be using different money and ideally we want everyone to be using bitcoin, like one set of rules for bitcoin.

This is my hard fork diagram, it is very similar to my reorg diagram. In this one the colors do matter. You consider this edge to be a hard fork activation, and the nodes that are hard forked purple nodes for example, in certain situations, where a hard fork is specified, it will not switch to these other blocks being produced even if they have more hashrate. Hard forks can be designed in such that these nodes will forever reject that sort of data or the other way around.

Extension blocks

There are many kinds of hard forks. I'm not going to describe all of them but they can get pretty interesting. Almost adversarial, an extension block is an interesting one to focus on where you can have a commitment to extra data within this block and this extra data also has to be validated. This was actually proposed as a block size increase mechanism. Segwit was a soft fork and that one was interesting. Basically it was a restriction on a rule used previously, that certain transaction types could be spent by anyone. And then with Segwit that was a restriction on those rules such that it could only be spent based off of the witness data. And by that way, with some extra accounting methods, you can do weight and sigops limit change and you have an effective block size increase out of Segwit.

Replay protection

So there is a further topic related to forks that you have to consider called replay protection. In the event that someone has created a fork, and this is quite a consternation for companies and wallets, because if someone forks bitcoin and makes some other version of bitcoin they call bitcoin 2 and if they make no other changes other than the fact that bitcoin 2 nodes reject bitcoin blocks and they have some other blocks that now get created, and if that is the only change, the problem is that bitcoin transactions will be possible to be replayed on the bitcoin 2 blockchain. And this is kind of bad because it causes mayhem, chaos, and confusion for users because if they make a bitcoin transaction they are expecting well I only authorized a bitcoin transaction but a fork might actually say well all bitcoin transactions are valid on our chain as well and now companies have to deal with this. Well you sent me bitcoin 2 but I asked you to deposit bitcoin, stuff like that.

Handling hard-forks that have not added replay protection

So there are some methods for doing replay protection when a fork implementation has not added replay protection on their side. The general recommendation is there is a replace by fee method that has been explained online and also a lock time method. The replace by fee method is that if you want to make sure that your coins are different, that no transaction that you make can be replayed on other blockchains, you actually intentionally make a transaction that can be replayed on both blockchains and then on one of them you hopefully get a confirmation before the other and then the other one you replace by using replace by fee or something. Now this can actually not work so well if the other blockchain might have disabled the replace by fee or nodes have disabled replace by fee. It is not a consensus rule, but the nodes on their network might not allow that. Although I think you only need it for one blockchain in this scenario where you have like a two blockchain thing going on, a two blockchain fork.

Opt-in replay protection and opt-out replay protection

Then there is this further problem in cases where replay protection has not been implemented where if you receive a deposit from a user you might be receiving a deposit from both blockchains at the same time even if you only have infrastructure to handle bitcoin. So anyway, forks without replay protection, we do not like those. We prefer forks with replay protection. Two ways to think of it actually, there has been cases of opt-in replay protection and opt-out replay protection. In general we should always prefer cases where we ask users to opt-out of replay protection. Opt-in protection basically means users are vulnerable by default and that is a vulnerability that should be avoided. With opt-in replay protection users would have to know about all the forks, but there have been 80 forks of bitcoin and I do not have time to keep track of all of those. That shouldn't be my problem. Anyway there have been a lot of proposals on how to do replay protection, usually it is to set a bit in the sighash flag which makes the transaction signature invalid on the bitcoin blockchain but valid on the alternative blockchains.

There has been some good work by Johnson Lau and Luke-Jr too for bitcoin hard forks at that url and their research has been aggregated there: https://bitcoinhardforkresearch.github.io

Cross-fork transaction replay

Just to drive the point home with replay of transactions across blockchains, in a situation with multiple forks, certain transactions, it is possible for a transaction to replay across multiple forks simultaneously so this can cause lots of mayhem and chaos. And it gets really difficult to track where are all of the transactions. Who is sending me money who is not, why is this occuring, how do we stop it, things like that so you have to be really careful around forks.

So one interesting method, and I do not recommend it, but I am going to describe it anyway. Is that you could say well there is multiple forks and maybe you are a business and you are accepting deposits you could say well I do not want to handle any of this fork stuff. So I'm going to demand that my users only deposit pre-forked coins from all of the blockchains from all of the forks and then those are the only coins that I will accept as deposits and then you could consider the unison of all of the blockchains as a valid bitcoin. This is kind of stupid, I do not recommend it, but it is certainly one way to do it, because then you could just say well either my transactions work on all of the blockchains or none of them or something. But generally, most people only care about the bitcoin blockchain. That is where bitcoin is, all of these other chains, that is not bitcoin.

Post-fork minted coins for taint as a method of replay protection

One further idea real quick is that, related to tainting is that another method of tainting to protect against replays is after a split or a fork in the blockchain you can actually use post-fork minted coins in transactions and those coins will be invalid on the other side of the fork. So that is another way of doing replay protection.

And that is all that I have. Thank you.