From rusty at rustcorp.com.au Tue Jan 8 05:23:10 2019
From: rusty at rustcorp.com.au (Rusty Russell)
Date: Tue, 08 Jan 2019 15:53:10 +1030
Subject: [Lightning-dev] Quick analysis of channel_update data
In-Reply-To:
References: <87ef9veztc.fsf@gmail.com>
Message-ID: <875zuzg1tt.fsf@rustcorp.com.au>

Fabrice Drouin writes:
> Follow-up: here's more detailed info on the data I collected and
> potential savings we could achieve:
>
> I made hourly routing table backups for 12 days, and collected routing
> information for 17 000 channel ids.
>
> There are 130 000 different channel updates: on average each channel
> has been updated 8 times. Here, "different" means that at least the
> timestamp has changed, and a node would have queried this channel
> update during its syncing process.

Side note: some implementations are also sending out updates with the
*same* timestamp. This is not allowed...

> But only 18 000 pairs of channel updates carry actual fee and/or HTLC
> value change. 85% of the time, we just queried information that we
> already had!

Note that this can happen in two legitimate cases:

1. The weekly refresh of channel_update.
2. A node updated too fast (A->B->A) and the ->A update caught up with
   the ->B update.

Fortunately, this seems fairly easy to handle: discard the newer
duplicate (unless > 1 week old). For future more advanced
reconstruction schemes (eg. INV or minisketch), we could remember the
latest timestamp of the duplicate, so we can avoid requesting it again.

> Adding a basic checksum (4 bytes for example) that covers fees and
> HTLC min/max value to our channel range queries would be a significant
> improvement and I will add this to the open BOLT 1.1 proposal to extend
> queries with timestamps.
>
> I also think that such a checksum could also be used
> - in "inventory" based gossip messages
> - in set reconciliation schemes: we could reconcile [channel id |
> timestamp | checksum] first

I think this is overkill?

Thanks,
Rusty.
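
For illustration, the duplicate-handling rule described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not code from any implementation: the ChannelUpdate record, the GossipStore class and the field names are hypothetical simplifications (not the wire format), and "unless > 1 week old" is read here as "unless the copy we already hold is more than a week older than the new one".

```python
from dataclasses import dataclass
from typing import Dict, Tuple

WEEK = 7 * 24 * 3600  # seconds; the weekly refresh interval


@dataclass(frozen=True)
class ChannelUpdate:
    # Hypothetical, simplified record (not the actual wire format).
    short_channel_id: int
    direction: int
    timestamp: int
    fee_base_msat: int
    fee_proportional_millionths: int
    htlc_minimum_msat: int
    htlc_maximum_msat: int

    def content(self) -> Tuple[int, int, int, int]:
        """Fields whose change makes an update worth propagating."""
        return (self.fee_base_msat, self.fee_proportional_millionths,
                self.htlc_minimum_msat, self.htlc_maximum_msat)


Key = Tuple[int, int]


class GossipStore:
    """Hypothetical in-memory store keyed by (channel id, direction)."""

    def __init__(self) -> None:
        self.updates: Dict[Key, ChannelUpdate] = {}
        # Latest timestamp seen per key, including discarded duplicates,
        # so a later sync scheme could avoid requesting them again.
        self.latest_seen: Dict[Key, int] = {}

    def handle(self, new: ChannelUpdate) -> bool:
        """Return True if the update is stored, False if it is dropped."""
        key = (new.short_channel_id, new.direction)
        old = self.updates.get(key)
        if old is None:
            self.updates[key] = new
            self.latest_seen[key] = new.timestamp
            return True
        if new.timestamp <= old.timestamp:
            return False  # same or older timestamp: not a valid update
        self.latest_seen[key] = max(self.latest_seen[key], new.timestamp)
        is_duplicate = new.content() == old.content()
        if is_duplicate and new.timestamp - old.timestamp <= WEEK:
            return False  # newer duplicate within a week: discard it
        self.updates[key] = new  # real change, or the weekly refresh
        return True
```

With this reading, an A->B->A flap ends with the second A update dropped (it matches what is stored if B never arrived, or replaces B if it did), while a refresh arriving more than a week after the stored copy is still accepted.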