From rusty at rustcorp.com.au Wed Dec 12 02:06:09 2018 From: rusty at rustcorp.com.au (Rusty Russell) Date: Wed, 12 Dec 2018 12:36:09 +1030 Subject: [Lightning-dev] Minisketch and lightning gossip Message-ID: <87pnu7xzvy.fsf@rustcorp.com.au> Hi all, In case you're bored with the limited range of improvements going into the 1.1 spec, you might like to ruminate on: https://github.com/sipa/minisketch/blob/master/README.md It's a library for efficient summaries of data, such as bitcoin transaction gossip. It has an implementation sweet-spot at 64-bits, which almost works well for our gossip messages. I've a straw-man protocol below. === 1. type: 260 (`gossip_sync`) (`option_gossip_sync`) 2. data: * [`32`:`chain_hash`] * [`32`:`latest_block_hash`] * [`minisketch_len`:`minisketch`] The `latest_block_hash` is because the whole sync is less reliable if this differs between nodes, so a node may choose to wait, or adapt accordingly if the other node is behind. Because there is some overhead in maintaining the minisketch, it'd be nice if we can have global agreement on the format so each peer need only maintain one (i.e. no seed, no changing encodings depending on block height, etc). Fitting everything we need into 64 bits is possible, with various degrees of ugliness; I've proposed one here: We currently use a short_channel_id which has 3-byte block height, 3-byte txindex, 2-byte output index. The biggest win is to combine block height & txindex into a "txnumber since block 500,000", which only needs about 27 bits per year; 40 bits for this number is sufficient for the forseeable future. 2 bits for type, 1 for channel direction, leaving 21 bits for output number and timestamp. We can encode output number as N ones followed by a zero followed by N*2+1 bits (commonly, this means 2 bits, but future mixers may make this much larger). The remaining bits are used as the lower bits of the timestamp[1]. A node announcement is encoded by using the scid of the oldest channel associated with the node, and the direction bit. Using this direct encoding (rather than a hash of values) allows us to immediately use an INV-style query, or maybe send automatically the stuff the peer doesn't have. The required size of the minisketch we send depends on the number of differences with our peer. We can use some minimum value (maybe based on past gossip rates?), and add the number of changes we've received since last time, increasing if we failed to reconstruct a previous one. There doesn't need to be consensus between peers on the minisketch size though, since truncated minisketches regrow into whole ones when tossed back into the ocean[3]. Cheers, Rusty. [1] You're supposed to refresh gossip every week, 17 bits should be sufficient. And since the originator controls timestamps they can mitigate collisions themselves.[2] [2] I wish I'd insisted we use block numbers for timestamps though [3] I may have misunderstood this part, but it's basically magic.