From aj at erisian.com.au Mon Sep 21 11:08:44 2015 From: aj at erisian.com.au (Anthony Towns) Date: Mon, 21 Sep 2015 21:08:44 +1000 Subject: [Lightning-dev] Ionization Protocol: Flood Routing In-Reply-To: <874mioecuq.fsf@rustcorp.com.au> References: <874mioecuq.fsf@rustcorp.com.au> Message-ID: <20150921110844.GC2696@navy> On Mon, Sep 21, 2015 at 11:46:13AM +0930, Rusty Russell wrote: > We regularly choose a dozen "beacon" nodes at random (using proximity to > the SHA of latest block hash or something). Everyone propagates the > cheapest route to & from those nodes (which is pretty efficient, similar > to your scheme). I think what you're implicitly assuming here is: a) global p2p network of lightning nodes; at a minimum each node has a connection to every node it has a channel open with b) consensus protocol for choosing an ordering of potential beacons c) potential beacons have to have committed to a beacon id prior to ordering being chosen (with spam/flooding protection) d) gossip protocol to announce potential beacons and compare against ordering, keeping the top few in memory. e) sharing routes to beacons with direct neighbours f) distributed hash to store/lookup route recommendations, keyed by beacon/endpoint g) distributed hash to lookup fees for hops keyed by hop ends I think (a) is trivial; and you already called out (b). For (c), not having a commitment means that people could generate a new node id that does well in the ordering algorithm after the fact; if it's SHA comparisons, that means miners would likely monopolise being a beacon. > 4) How to avoid fake beacons? > Require a signature attached to an unspent bitcoin TX from before > beacon selection? That TXID is actually what competes to be close > to the beacon selector. This might as well be the (an) anchor transaction. It's already in the blockchain, it's already associated with a channel. You couldn't just use the txid directly, because that wouldn't differentiate between endpoints though. This would give an advantage to nodes with lots of channels open; not sure whether that's desirable? Probably it is? For (d), once you've got the ordering, nodes tell each other about their 12 favourite beacons, and if they hear about better ones, they pass those on too. That needs to be global knowledge, but it doesn't matter too much if we have slightly different sets of 12 beacons at any point. 12 beacon ids is a fine amount of information to pass around globally, too. For (e), I don't think you want to gossip globally about routes -- that's too much information to pass around if it's not necessary; but you still have to share your routes to beacons with your neighbours in order to figure anything out. Nodes announcing their best routes to their neighbours is basically just Dijkstra's algorithm in parallel I think. But just knowing your neighbours' routes isn't enough; you need to be able to lookup a route for anyone, and that (by definition I think) means you need (f) a DHT of routes-to-beacons. Note that looking up a route has privacy implications, in that it implies you're probably going to make a payment along that route! > To receive a payment, B picks a few beacon nodes at random and sends > those routes (beacons -> B) to A. Presumably A knows its route to all > the beacon nodes and thus can choose the best. A trivial DHT would be to have each node store its routes locally, and just make a TCP/IP connection to the node directly to ask for its routes. That seems like it'd be pretty bad for privacy though. I'm a fan of being able to route to/through nodes you can't reach via IP, and this would prevent that too. Finally for (g), I don't think you want to store fees in the routes directly, since updating fees would require updating an unknown number of routes; but fees have to be queryable, so that's a separate DHT. This one doesn't need to be updated when new beacons are elected though. This DHT would want to be fairly high performance, because you're doing 2*B*L lookups everytime you want to find a route, and it has to accept and propogate updates fairly quickly if fees change. > There are some twisty details here: > 1) How many beacon nodes? > 12, and increase on a log scale with network size. That size can > be derived from previous beacons. I think it's also something you could set per-node, like the minrelaytxfee. > 2) How long between selection and a beacon becoming active? > 10 blocks. That gives time for others to connect to beacon node. Beacons can be "active" as soon as you can route through them, and that's just a DHT lookup to determine, and then a matter of comparing fees to what the old beacons give you. So I think no artificial delay is needed, and the real question is just when you expire your routes to/from the old beacons? > 3) How long before a beacon node is invalid? > No idea. Let's keep a day's worth for the moment? Sounds fine; also mostly a client side parameter. (Though if the routing DHT is non-trivial, old beacons should expire from there after some interval too to deal with nodes that disappear) > 5) How to update beacon routing? > Particularly for fee changes, this is important. Best effort, > with ratelimiting. I hope eventually we'll have reasonable traffic > models so a node can say "I'm going to ramp up/down my fees for > this long at this rate" and have a reasonable chance that it's true. I think this is reinventing DHTs, though maybe none of the existing ones work well enough for this use case? I think rate limiting decreases in fees is always safe (it won't prevent any transactions going through, it will only prevent them being started). (I'm not sure a programmed ramp-up/down of fees makes sense; though maybe it would be a good way to perform price discovery) > PS. For the immediate term, we'll use a global comms mechanism like > IRC, but that's just a prototype hack. Hmm. Counterproposal: no beacons or routing DHT, just fees by gossip protocol (or IRC channel as prototype hack). Everyone has a complete (but possibly slightly out of date) database of node-node (but not node-wallet) channel fees, and changes are propogated by gossip. Back of the envelope maths: everyone uses lightning ==> 8B wallets (5B teens/adults, 3B businesses) every wallet has ~4 channels ==> 32B node-wallet channels 100k wallet channels per node on average ==> 320k nodes 16 channels to other nodes per node ==> 2.5M node-node channels (average path length between nodes ~= 4 hops) 128b id per node, 32b fee per direction ==> ~100MB of graph data An update rate of 10kB/s allows fees to update up to once every three hours on average; 100kB/s would be once every 20 minutes or so. Those numbers aren't /great/ but don't seem completely infeasible. Probably needs some signatures or similar to avoid DoS by spam which would slow things down by maybe a factor of two. If you do know the entire graph, you don't need to give away any information about who you want to pay prior to sending the transaction. Knowing the graph is potentially interesting for commercial and academic reasons beyond wanting privacy. (Knowing the fees others charge helps you work out what fees you should charge; but just querying your neighbours' routes is probably sufficient to work that out too) You can even do it so that only nodes (not wallets) need to know the full graph, I think. If you're running a wallet without the full graph, when finding a route, you: - get a list of most/all nodes (about 5MB; maybe you already have this, or can just rsync changes) - determine how oniony you want to be, picking n nodes, including your desired destination, and one or more of the nodes you have channels open with - ask for the shortest path between each pair of those nodes - build a cheap path out of that data Even if n=2 (your node and their node), you only reveal you're paying one of 100k wallets; and with larger n, any particular node could just be being used for onion routing, so you're adding ~100k potential recipient wallets with each additional node. (Initially, it won't be "100k" per node so that doesn't work so well, but initially you could happily run a full node rather than a wallet anyway) When asking for payment, you just indicate your id, and one or more nodes you have a channel open to. You probably have to indicate the final hop fee for each node as well, since that can't be looked up in the graph. Cheers, aj