From rusty at rustcorp.com.au  Tue Jan  8 05:23:10 2019
From: rusty at rustcorp.com.au (Rusty Russell)
Date: Tue, 08 Jan 2019 15:53:10 +1030
Subject: [Lightning-dev] Quick analysis of channel_update data
In-Reply-To: <CAL3Hpbcw-1-UEpKi36Cs1-V+b+oZn6+QRfo2EHUYZMZGC1wb8A@mail.gmail.com>
References: <CAL3HpbeYjJtXaj4RXdyqdNZMQLniayocuSiP67F5sFGe6WPR0Q@mail.gmail.com>
	<87ef9veztc.fsf@gmail.com>
	<CAL3Hpbdp9c2qF3Z40jU0peTm1h_SqaU7HOYUaGzypXRjT3AzzA@mail.gmail.com>
	<CAL3Hpbcw-1-UEpKi36Cs1-V+b+oZn6+QRfo2EHUYZMZGC1wb8A@mail.gmail.com>
Message-ID: <875zuzg1tt.fsf@rustcorp.com.au>

Fabrice Drouin <fabrice.drouin at acinq.fr> writes:
> Follow-up: here's more detailed info on the data I collected and
> potential savings we could achieve:
>
> I made hourly routing table backups for 12 days, and collected routing
> information for 17 000 channel ids.
>
> There are 130 000 different channel updates: on average each channel
> has been updated 8 times. Here, "different" means that at least the
> timestamp has changed, and a node would have queried this channel
> update during its syncing process.

Side note: some implementations are also sending out updates with the
*same* timestamp.  This is not allowed...

> But only 18 000 pairs of channel updates carry actual fee and/or HTLC
> value change. 85% of the time, we just queried information that we
> already had!

Note that this can happen in two legitimate cases:
1. The weekly refresh of channel_update.
2. A node updated too fast (A->B->A) and the ->A update caught up with the
   ->B update.
 
Fortunately, this seems fairly easy to handle: discard the newer
duplicate (unless > 1 week old).  For future, more advanced
reconstruction schemes (e.g. INV or minisketch), we could remember the
latest timestamp of the duplicate, so we can avoid requesting it again.
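
A minimal sketch of that rule, under assumptions: the field set compared
is only the fee/HTLC/CLTV values, "unless > 1 week old" is read as "the
copy we hold is more than a week older than the new one", and the names
ChannelUpdate, GossipEntry and offer() are hypothetical, not taken from
any implementation.

```python
from dataclasses import dataclass

WEEK = 7 * 24 * 3600  # seconds


@dataclass(frozen=True)
class ChannelUpdate:
    # Assumed subset of channel_update fields, for illustration only.
    timestamp: int
    cltv_expiry_delta: int
    htlc_minimum_msat: int
    htlc_maximum_msat: int
    fee_base_msat: int
    fee_proportional_millionths: int

    def content(self):
        """Everything except the timestamp."""
        return (self.cltv_expiry_delta, self.htlc_minimum_msat,
                self.htlc_maximum_msat, self.fee_base_msat,
                self.fee_proportional_millionths)


class GossipEntry:
    """Per-channel-direction state: the stored update plus the latest
    timestamp seen for a discarded duplicate of it."""

    def __init__(self, update):
        self.update = update
        self.latest_seen = update.timestamp

    def offer(self, new):
        """Return True if `new` replaces the stored update."""
        if new.timestamp <= self.latest_seen:
            return False  # stale, or a repeated timestamp
        duplicate = new.content() == self.update.content()
        recent = new.timestamp - self.update.timestamp <= WEEK
        if duplicate and recent:
            # Discard, but remember the timestamp so a later
            # INV/minisketch-style scheme need not request it again.
            self.latest_seen = new.timestamp
            return False
        self.update = new
        self.latest_seen = new.timestamp
        return True
```

Keeping latest_seen separate from the stored update is what lets the
duplicate be dropped without forgetting that it exists.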

> Adding a basic checksum (4 bytes for example) that covers fees and
> HTLC min/max value to our channel range queries would be a significant
> improvement and I will add this to the open BOLT 1.1 proposal to extend
> queries with timestamps.
>
> I also think that such a checksum could be used
> - in "inventory" based gossip messages
> - in set reconciliation schemes: we could reconcile [channel id |
> timestamp | checksum] first
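
For concreteness, a sketch of what a 4-byte checksum over fees and HTLC
min/max could look like.  CRC32, the field order, the field widths and
the name update_checksum() are assumptions made for illustration; the
quoted proposal does not fix any of them.

```python
import struct
import zlib


def update_checksum(fee_base_msat, fee_proportional_millionths,
                    htlc_minimum_msat, htlc_maximum_msat):
    """Return a 32-bit checksum over the fee and HTLC min/max fields.

    The timestamp is deliberately left out, so two updates that differ
    only in timestamp produce the same value.
    """
    # Big-endian: two 32-bit fee fields, two 64-bit HTLC limits
    # (assumed widths).
    packed = struct.pack(">IIQQ",
                         fee_base_msat, fee_proportional_millionths,
                         htlc_minimum_msat, htlc_maximum_msat)
    return zlib.crc32(packed)
```

A peer comparing [channel id | timestamp | checksum] entries could then
skip fetching an update whose checksum matches what it already holds.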

I think this is overkill?

Thanks,
Rusty.