From rusty at rustcorp.com.au Mon Feb 12 01:45:55 2018 From: rusty at rustcorp.com.au (Rusty Russell) Date: Mon, 12 Feb 2018 12:15:55 +1030 Subject: [Lightning-dev] Improving the initial gossip sync In-Reply-To: <87shaawft5.fsf@gmail.com> References: <874lmvy4gh.fsf@gmail.com> <87tvurym13.fsf@rustcorp.com.au> <87shaawft5.fsf@gmail.com> Message-ID: <878tbzugj0.fsf@rustcorp.com.au> Christian Decker <decker.christian at gmail.com> writes: > Rusty Russell <rusty at rustcorp.com.au> writes: >> Finally catching up. I prefer the simplicity of the timestamp >> mechanism, with a more ambitious mechanism TBA. > > Fabrice and I had a short chat a few days ago and decided that we'll > simulate both approaches and see what consumes less bandwidth. With > zombie channels and the chances for missing channels during a weak form > of synchronization, it's not that clear to us which one has the better > tradeoff. With some numbers behind it it may become easier to decide :-) Maybe; I think we'd be best off with an IBLT-approach similar to Fabrice's proposal. An IBLT is better than a simple hash, since if your results are similar you can just extract the differences, and they're easier to maintain. Even easier if we make the boundaries static rather than now-relative. For node_announce and channel_update you'd probably want separate IBLTs (perhaps, though not necessarily, as a separate RTT). Note that this approach fits really well as a complement to the timestamp approach: you'd use this for older pre-timestamp, where you're likely to have a similar idea of channels. >> Deployment suggestions: >> >> 1. This should be a feature bit pair. As usual, even == 'support this or >> disconnect', and odd == 'ok even if you don't understand'. > > If we add the timestamp to the end of the `init` message, instead of > introducing a new message altogether, we are forced to use the required > bit, otherwise we just made any future field appended to the `init` > message unparseable to non-supporting nodes. Let's say we add another > field to it later, that the peer supports, but it follows the timestamp > which the peer does not. The peer doesn't know how many bytes to skip > (if any) for the timestamp bit he doesn't understand, to get to the > bytes he does know how to parse. I'm slowly getting to like the extra > message more, since it also allows a number of cute tricks later. This, of course, is the nature of all appendings. You can't understand feature N+1 without understanding feature N, if they both append to the same message. You don't have to *support* feature N, of course. >> 2. This `timestamp_routing_sync`? feature overrides `initial_routing_sync`. >> That lets you decide what old nodes do, using the older >> `initial_routing_sync` option. Similarly, a future `fancy_sync` would >> override `timestamp_routing_sync`. > > So you'd set both bits, and if the peer knows `timestamp_routing_sync` > that then force-sets the `initial_routing_sync`? Sounds ok, if we allow > optional implementations, even though I'd like to avoid feature > interactions as much as possible. If we don't allow optional implementations we're breaking the spec. And we're not going to do that* [* Yeah, OK, we'll eventually do that. But we'll do it when we're pretty sure that ~0 users would break, because they'd be ancient ] >> 3. We can append an optional 4 byte `routing_sync_timestamp` field to to >> `init` without issues, since all lengths in there are explicit. If you >> don't offer the `timestamp_sync` feature, this Must Be Zero (for appending >> more stuff in future). > > That'd still require the peer to know that it has to skip 4 bytes to get > any future fields, which is why I am convinced that either forcing it to > be mandatory, or adding a new message is the better way to go, even if > now everybody implements it correctly. This is simply how we upgrade. See `open_channel` for how this is already done, for example; in fact, we originally had two different upgrades (but we broke spec instead) and they used exactly this technique. A separate message here is supremely awkward, too. >> Now, as to the proposal specifics. >> >> I dislike the re-transmission of all old channel_announcement and >> node_announcement messages, just because there's been a recent >> channel_update. Simpler to just say 'send anything >= >> routing_sync_timestamp`. > > I'm afraid we can't really omit the `channel_announcement` since a > `channel_update` that isn't preceded by a `channel_announcement` is > invalid and will be dropped by peers (especially because the > `channel_update` doesn't contain the necessary information for > validation). OTOH this is a rare corner case which will eventually be fixed by weekly channel_announce retransmission. In particular, the receiver should have already seen the channel_announce, since it preceeded the timestamp they asked for. Presumably IRL you'd ask for a timestamp sometime before you were last disconnected, say 30 minutes. "The perfect is the enemy of the good". >> Background: c-lightning internally keeps an tree of gossip in the order >> we received them, keeping a 'current' pointer for each peer. This is >> very efficient (though we don't remember if a peer sent us a gossip msg >> already, so uses twice the bandwidth it could). > > We can solve that by keeping a filter of the messages we received from > the peer, it's more of an optimization than anything, other than the > bandwidth cost, it doesn't hurt. Yes, it's on the TODO somewhere... we should do this! >> But this isn't *quite* the same as timestamp order, so we can't just set >> the 'current' pointer based on the first entry >= >> `routing_sync_timestamp`; we need to actively filter. This is still a >> simple traverse, however, skipping over any entry less than >> routing_sync_timestamp. >> >> OTOH, if we need to retransmit announcements, when do we stop >> retransmitting them? If a new channel_update comes in during this time, >> are we still to dump the announcements? Do we have to remember which >> ones we've sent to each peer? > > That's more of an implementation detail. In c-lightning we can just > remember the index at which the initial sync started, and send > announcements along until the index is larger than the initial sync > index. True. It is an implementation detail which is critical to saving bandwidth though. > A more general approach would be to have 2 timestamps, one highwater and > one lowwater mark. Anything inbetween these marks will be forwarded > together with all associated announcements (node / channel), anything > newer than that will only forward the update. The two timestamps > approach, combined with a new message, would also allow us to send > multiple `timestamp_routing_sync` messages, e.g., first sync the last > hour, then the last day, then the last week, etc. It gives the syncing > node control over what timewindow to send, inverting the current initial > sync. That would fit neatly with the more complicated bucketing approaches: you'd use this technique to ask for the entire bucket if the SHA mismatched/IBLT failed. Cheers, Rusty.