Incentives and Trade-offs in Transaction Selection in DAG-based Protocols

Yonatan Sompolinsky (The Hebrew University)

Commercial effort to implement DAG-based protocols. No ICO, it's a regular company.

A little bit background of DAG-based protocols. The background to this work is something we have been working at Hebrew University and Yoad. This idea of blockDAG is about layer 1 of bitcoin. And so it's a generalization of the chain structure of blockchain. As such, there are two solutions- lightning network and payment channels, and were scaling up layer 1, and you're scaling up in layer 2, it's complementary solutions.

This talk will be focused on an idea about "An inclusive blockchain protocols" from 2015. We are revisitng these ideas now because we are interested in implementing these blockDAG protocols.

Blockchain vs BlockDAG

The first difference is that in a chain you would maintain a single chain. But in a DAG you have an entire graph of blocks. In a chain, we ignore anything thta is off-chain and any data that was not in the chain. In the DAG, you use all of this information, you harvest all information from all blocks. The third implication that in a chain paradigm you are trying to supress throughput so that spontaneous forks are rare in the system. In a blockDAG system, forks are a common phenomena of the system and many forks are created all the time.

As you can imagine, this is essentially a matter of information. In the chain paradigm you ignore everything off-chain and in DAG you use that information, like for scalability and more fairness and more security. It can also be used... it doesn't mean that every blockDAG uses, so just to drop a name, ethereum's GHOST is mostly about fairness not scalability.

Scaling up layer 1 using DAGs

The first step would be to speeed up block rate. Instead of 1 block every 10 minutes or 1 block every 21 seconds, imagine 50 blocks per second or maybe more. That's step one. So we need spontaneous forks of the chain. The second step is to modifying the mining protocol where instead of extending a chain, you want to reference previous blocks and all the tips that it sees, and integrating the forks into one graph so that you have this massive cool graph whic his directed acyclic graph. Seems so simple so far.

The third step is the most crucial, though. Which is extracting a consistent set of transactions from this graph. The graph might contain conflicts and you want to extract from it and order it and make it work. This i the main challenge of a blockDAG effort. We developed the SPECTRE method to deal with this as discussed 2 years ago in Hong Kong. The consistency rule is the main challenge.

I want to clarify that blockDAG is not a solution, it's merely a framework on top of which someone can develop on top of. Consequently, not all blockDAGs are... I mentioned etheruem ues some notion of a DAG partially. You can design other variants. I like to think of DAG vs chain as chain is slow single one lane road in a small town and a DAG is a highway and the first implicaiton of this is that if you don't knw what you're doin, you'll have a traffic jam mess and you have to think carefully about what you are doin with your DAG. So you want an ordered highway, everything is clear, and high throughput. If you are very lucky, or very thoughtful, you can evolve a protocol with fast confirmations at least for some transactions, whic his the topic of this talk.

To scale up blockchains using DAGs opens up various challenges. The first challenge is the consistency rule. You also have to address confirmations time, how much storage you want miners or full nodes to store and for how long, so think of 1000 tx/second, at least ... per day.. do you want to store this forever or not? These are challenges that need to be addressed. Fee structure, fairness, how to maintain decentralization, etc.

How do you utilize the DAG fully so that the entire throughput or at least we're close to the entire throughput that the DAG can supply? Iwill show you why this is not a trivial challenge. We have two scenarios for a DAG. Imagine many blocks per second But for simplicity just say two miners. You're mining 1 and you are mining 2. We are mining 4 transactions. We both create blocks because it's real time and there's no time to coordinate and there are two scenarios. The less optimistic scenario is where you select tx 1, tx 2, and I select tx 1, tx 2 from the mempool and what happens in this case is that ecause these transactions are not cconflictin, then no harm was done in the sense that both transactions were approed by the systme. Howeer you wasted space. So tx 1 and tx 2 took the space of 4 transactions. So this is just a ery simple example and of course... the other scenario... your left hand side, would be that you somehow selected tx 1, 2, and the other miner picked tx 3, and tx 4. So this is a good outcome.

The main question of our research here is how do we make sure that we are closer to the green area than the red area? Assume we hae a good consistency rule and a good blockDAG protocol. How can we make sure that transactions are not duplicated across chains?

If you don't do anything and let miners mine greedly the top transactions in their mempool, then the DAG gains nothing in terms of throughput. We did a simple solution generated by my colleague. The green curve is this throughput of a DAG. By using a greedy mining policy where every miner selects the top transactions from the mempool. The yellow curve represents a chain structure with the same parameters. The chain throughput and the DAG throughput are almost identical. So DAG gains us nothing in terms of htroughput, but it does gain us something in terms of confirmation time. These curves are very far from optimal. We can ain much more-- DAG has more capacity if we were more wise and used a better scheme to coordinate between miners.

So here's the good news and perhaps the main thing I want to message to convey to you. The miners are incentivized to some extent to avoid cllisions and avoid tx duplication and contribute unique transactions and increase throughput. The natural incentive of a miner to enjoy the health of the system. The reason is simple. If we collide in th same system, if we hoose the same tx for our blocks, then we need to split the fee, you will get half the fee after the other portion or whatever, but we an't pay the fee twice. And we both embedded the tx in our block... only one of us will get the fee or maybe we will split it but we won't be able to extract the entire fee twice. So there's incentive to coordinate and avoid collisions.

Chicken game

And now we're headed to a more formal treatment of this problem. I hope the chicken game is known to most of you. It's a famous game to model conflicts. You have two parties driving towards each other and heading to collision. And the first one to swerve and chicke nout is the chicken and the one to stay the course is the winner and daredevil. And of course, if either you chicke nout then the other party wins. And if we both dare then there's a collision. The transaction selection game in blockDAG is quite similar. We have tx1, tx2 in the mempool and one of the transactions is more expensive (tx1) let's say it pays 2 millibitcoins and tx2 pays 50% as much. So now we have to decide how to.... tx1, 3 millibitcoin and tx2 would be 2 millibitcoin. If we both select tx1, the most high paying transaction, we will collide and need to divide the fee between us. But if we both select tx2, we need to divide a lower fee between us. So we need to coordinate and select the transactions. This wowuld be the optimal setup. The strategy in this game- you have game hteory pure strategy which is which transaction to select.. or you can use randomness to pick transactions. In a mixed strategy setup I can toss a coin and then decide which transactions to select. In game theory, using randomness youcan achieve much better results for all players. In practical terms, we are looking at setup in a DAG where miners will randomize over the mempool to select transactions. In bitcoin, a miner selects the top transactions. But in a DAG, the best thing for you to do is to randomize over the mempool according to some distribution and then use this to maximize your throughput.

How do we solve this chicken game? In game theory, there are several approaches. The top left, scenario is that there's adversarial setup. You are a mining pool, you are paranoid, you think everyone is going to sabotage your profits and they want your revenue to drop. In that scenario, your safe solution is that you just want to guarantee yourself the best response against everyone else that wants to sabotage me and lower my profits. This is quite adversarial and does not fit loving bitcoin community so we might consider more cooperative setups.. The selfish setup is a more common one, which is more similar to the Nash equilibrium model where every party pursues their own interest but somehow the invisible hand brings us to equilibrium under which nobody gains from deviating. This is a common concept. There's also an interesting equilibrium concept invented by ... he's a novelist from... and this is coordinated equilibrium where everyone is selfish but we have signal for some coordination and in minin the signal could be the randomness from previous blocks. Perhaps we can use the random beacon to more wisely accept transactions into our mempool or block. And then most optimistic, everyone is playing just for the sake of the health of the system. You can think of bitcoin's early day as a setup as where the only concern of everyone was to maximize social welfare, and miners were confirming zero-fee transactions and there were less incentives back then.

Max social welfare

We can visit several of these ideas but let's begin with max social welfare. This is the simplest and most optimistic. What would be the solution here? All the miners are cooperating, they just ask us, we don't know how they coordinate, there's a design mechanism undre which the DAG is utilized and we won't collide and the throughput is at maximum. The solution in this case is rather simple. The construction to the miners woul dbe just select a transaction in uniform distribution from your mempool, so let's say you have 6 trransactions or thousands, you do this by tossing a fair coin between them until some threshold, and you accept all transactions uniformy. The threshold is what's the capacity of the DAG, you can't overcome this capacity, and it's very famous and simple result from queueing theory that says that you won't have any ... in the system. The buckets of transactions would be sparse enough that collisions will be negligible. This setup of course is a bit naieve. There's two catches.

This is strategically unstable. We're assuming that everyone is maximizing social welfare and this is not always the case. There's a more inherent problem wit hthis approach which is that it does not allow for differential service to urgent transactions. You can't prioritize transactions by fees. So I am ont able to signal to miners that I need this tx faster, I am willing to pay more. So if they are not picking transactions by fees, then everyone waits about the same time. And some transactions will never be approved. You can think of this as the highway where everyone is stuck in the same traffic. As a system designer, is that really what you want to achieve? You want a system that has express lanes, where people are willing to pay more. Assume there's a payment for the express lane, of course. They are willing to pay more to get to their destination faster. And then you will have this curve that says the larger you pay, the faster you get inside. So this is one thing you want to design the system according to.

Other strategies

If you have high utilization of the DAG with collision avoidance, then you can't serve the express lane and can't have prioritized transactions. You nee dthem to back off from expensive transactions so that they don't collide. And on the other extreme, if you allow for fast confirmation times, then you will inevitably suffer from collisions. The function of how long number of olliions over a transaction over its function of its waiting time. Only transactions that wait a long time in the buffer suffer no collisions. Transactions that want to be confirmed fast will suffer collisions and would need a higher fee.

In the chicken dare situation, what happens in the real world? The problem here is intractable. Usually in these games, nash equilibrium is hard to find. But the spirit of nash equilibrium is something in the form of.. if I see you cooperating, then I will cooperat.e But if you are greedy and you are always picking the highest transactions for your block, then I will also be greedy and pick the highest paying transactions. I apologize for the militaristic tone, but this is tit-for-tat strategy. And it helps understand how mining pools will behave.

In a world where mining pools don't hide their identity, then other mining pools will start sabotaging that pool by colliding with their transactions. It's an interesting scenario and problem. Chicken dare is not unique to mining DAGs.

We did a more modest goal-- namely, we computed the nash equilibrium for the single... we only played once, what would be the nash equilibrium with this nice formula? The good news-- the qualitative message of the nash you can see here that we picked high paying fees with high probability. So if a use wants to embed his transaction fast, he will put a large fee, but then we will all sort of collide but not necessarily. The curve does not reach probability 1. Even if I am paying a high fee nobody will be greedy enough to pick it with certain probability. More good news for you is that nash performs quite well. The utilization of the DAG under nash equilibrium even if everyone is selfish, it wouldn't be so bad. It's around 72% on this graph. So 70% of the graph ocntains unique content and unique transactions. It's a surprising result. If everyone was selfish then we would revert back to the greedy scenario, you wouldn't be greedy even if you're selifhs because you want to randomize over your mempool to avoid collisions.

Quality of service

Can we improve utilization a bit more? Another interesting graph we must talk about is the quality of service level. As I said, the problem with the maximum social welfare solution is that you can't prioritize transactions. Here the tall grey bars represent max social welfare strategy. Everyone waits the same time. And then the blue bars represent the nash equilibrium where you approve less transactions- only those that have 3 millibitcoin and on- and those exact numbers do not matter of course, it's a dummy model. As you increase the fee, you wait less. We want this feature. We want high-paying transactions to wait less. And we have other strategies like exponential strategy... If everyone is greedy, then you would only approve a small set of transactions. This is a good illustration of the challenge.

Correlated equilibrium

Perhaps with coordination between miners we can achive better results than nash. We could use previous block's randomness. Preliminary results show that we are able to utilize the DAG better. Needs further investigation.

Scaling and incentives

In bitcoin selfish mining is a known attack and deviation on the protocol. It's risky, it only works in the long term. And so arguably the reason why we didn't see selfish mining until now or at least not in notable amounts was that playing with your mining node just to withhold your blocks and risk losing it all is too risky and miners wont engage in such behaior. Unforotunately in a DAG, mining from the mining protocol is much more available to you, you just need to select transactions in a certain way or hold your blocks for a few seconds. So if I am withholding my block for 2 bloks, that's not going to hurt me. It's a much more granular decision for how to optimize my mining, and the risks for the miner are less. The harm to the system is relatively marginal. Also lazy selfish mining is another concern- it might not have communication infrastructure either deliberately or by laziness and doesn't propagate blocks to the network. This pool might not lose too much profit and other miners might be harmed, this is an interesting topic for us so we're investigating it.

When implementing blockDAG protocols, the incentives really matter. Bitcoin has 1 block every 10 minutes. At least in the ...