From laolu32 at gmail.com  Mon Aug 16 19:55:48 2021
From: laolu32 at gmail.com (Olaoluwa Osuntokun)
Date: Mon, 16 Aug 2021 12:55:48 -0700
Subject: [Lightning-dev] #zerobasefee
In-Reply-To: <8c14f528-03e8-495a-bf0f-27c15fe1e6b0@mattcorallo.com>
References: <20210815010037.GA3971@erisian.com.au>
	<8c14f528-03e8-495a-bf0f-27c15fe1e6b0@mattcorallo.com>
Message-ID:

Matt wrote:
> I'm frankly still very confused why we're having these conversations now

1000% this!!

This entire conversation strikes me as extremely premature and backwards
tbh. Someone experimenting with a new approach shouldn't prompt us to
immediately modify the protocol to better "fit" the approach, particularly
before any sort of comparative analysis has even been published. At this
point, to my knowledge we don't even have an independent implementation of
the algorithm that has been tightly integrated into an existing LN
implementation. We don't know in which conditions the algorithm excels,
and in which conditions the improvement may be negligible (likely when
payAmt << chanCapacity).

I think part of the difficulty here lies in the current lack of a robust
framework for comparing the efficacy of different approaches. Even in this
domain, there are a number of end traits to optimize for, including: path
finding length, total CLTV delay across all shards, the number of
resulting splits (the goal should be to consume less commitment space),
attempt iteration latency, amount/path randomization, path finding memory,
etc.

This also isn't the first time someone has attempted to adapt typical
flow-based algorithms to path finding in the LN setting. T-bast from ACINQ
initially attempted to adapt a greedy flow-based algorithm [1], but found
that a number of implementation-related edge cases (he cites the min+max
constraints, in addition to the normal fee limit most implementations
have, as barriers to adapting the algorithm) led him to go with a simpler
approach to then iterate off of. I'd be curious to hear from T-bast w.r.t.
how this new approach differs from his initial approach, and if he spots
any yet-to-be-recognized implementation-level complexities to properly
integrating flow-based algorithms into path finding.

> a) to my knowledge, no one has (yet) done any follow-on work to
> investigate pulling many of the same heuristics Rene et al use in a
> Dijkstras/A* algorithm with multiple passes or generating multiple routes
> in the same pass to see whether you can emulate the results in a faster
> algorithm without the drawbacks here,

lnd's current approach (very far from perfect, derived via satisficing)
has some similarities to the flow-based approach in its use of
probabilities to attempt to quantify the level of uncertainty of internal
network channel balances. We start by assuming a config-level a priori
probability of any given route working. We then take that probability and
the fee to route across a given link, and convert the two values into a
scalar "distance"/"weight" (mapping to an expected cost) that we can plug
into vanilla Dijkstra's [2] (rough sketch below). A fresh node uses this
value to compare routes instead of the typical hop-count distance metric.
With a cold cache this doesn't really do much, but then we'll update all
the a priori probabilities with observations we gain in the wild. If a
node is able to forward an HTLC to the next hop, we boost its probability
(conditional on the amount forwarded/failed, so there's a Bayesian
aspect).
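To make the fee + probability -> "distance" conversion above concrete,
here's a minimal, purely illustrative sketch in Go. The function name, the
attempt-cost parameter, and the exact penalty term are my own assumptions
for exposition; this is not lnd's actual weighting code:

    package main

    import (
        "fmt"
        "math"
    )

    // edgeWeight collapses a routing fee and an estimated success
    // probability into a single scalar that a vanilla Dijkstra run can
    // minimize: the fee paid on success, plus a penalty that grows as
    // the success probability shrinks, so cheap-but-unreliable edges
    // don't always win. Illustrative only.
    func edgeWeight(feeMsat, attemptCost int64, prob float64) float64 {
        if prob <= 0 {
            return math.Inf(1) // prune hopeless edges outright
        }
        // Expected-cost flavor: amortize a per-attempt overhead over
        // the probability that this attempt actually succeeds.
        return float64(feeMsat) + float64(attemptCost)/prob
    }

    func main() {
        // A cheap but unreliable edge vs. a pricier, reliable one.
        fmt.Println(edgeWeight(100, 10_000, 0.30)) // ~33433, penalized
        fmt.Println(edgeWeight(500, 10_000, 0.95)) // ~11026, preferred
    }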
Each failure results in the probabilities of nodes being affected
differently (temp chan failure vs no next node, etc). For example, if
we're able to route through the first 3 hops of the route, but the final
hop fails with a temp chan failure, we'll reward all the successful nodes
with a boosted success probability (the default right now is 95%) that
applies when the amount being carried is < that prior attempt. As we
assume balances are always changing, we then apply a half-life decay that
slowly increases a penalized probability back towards the baseline (a toy
sketch of this decay is included below). The resulting algorithm starts
with no information/memory, but then gains information with each attempt
(increasing and decreasing probabilities as a function of the amount
attempted and the time that has passed since the last attempt).

The APIs also let you mark certain nodes as having a higher a priori
probability, which can reduce the amount of bogus path exploration. This
API can be used "at scale" to create a sort of active learning system that
learns from the attempts of a fleet of nodes, wallets, trampoline nodes,
etc (some privacy caveats apply, though there're ways to fuzz things a
bit, differential-privacy style). Other knobs exist, such as the min
probability setting, which controls how high a success probability a
candidate edge needs to have before it is explored. If the algo is
exploring too many lackluster paths (and there're a lot of these on
mainnet due to normal balance imbalance), this value can be increased,
which lets it shed a large number of edges that would otherwise be
explored.

When comparing this to the discussed approach that doesn't use any prior
memory, there may be certain cases that allow this algorithm to "jump" to
the correct route and skip all the pre-processing and optimization that
may end up producing the same route, just with added latency before the
initial attempt. I'm skipping some other details, like how we handle
repeated failures of a node's channels (there's a mechanism that penalizes
them more heavily, as the algo assumes most of the node's channels aren't
well maintained/balanced, so why waste time trying the other 900
channels).

Our approach differs from this new approach more dramatically when it
comes to the question of how to split a payment, either due to a failure,
or when splitting is necessary (amt > max(chanCapacities...)). lnd takes a
very simple approach of just dividing the problem in half (divide and
conquer, so fork into two instances of amt/2) and trying again (also
sketched below). The divide-and-conquer approach typically means you'll
end up with a minimal-ish amount of fees. There's also a knob that lets
you control the largest split size, which can force the algo to split
sooner, as otherwise it only splits when no route exists (either we
explored a ton and they all failed, or the payment amt is larger than the
chan capacity).

The algo works pretty well when the probabilities, combined with well-
tuned parameters, help it naturally ignore lower-probability routes during
the edge relaxation step of Dijkstra's. However, if a client has no prior
observations (and didn't get any from, say, its wallet provider or w/e),
then it can end up exploring poor routes for a while and eventually hit
the default timeout (particularly with a slow disk, but that'll be
optimized away in lnd 0.14). One interesting area of research would be to
investigate if a small amount of flow pre-planning can help the algorithm
effectively re-summarize its current working memory to ignore more paths
we know likely won't work.
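For the half-life decay mentioned above, here's a toy illustration (again
in Go, again not lnd's code; the baseline, penalty, and half-life values
and names are assumptions) of a penalized probability drifting back toward
the a priori baseline:

    package main

    import (
        "fmt"
        "math"
        "time"
    )

    // decayedProb interpolates a penalized success probability back
    // toward the a priori baseline as time passes since the last
    // observed failure: after one halfLife half the penalty remains,
    // after two a quarter, and so on. Toy illustration only.
    func decayedProb(apriori, penalized float64,
        lastFailure time.Time, halfLife time.Duration) float64 {

        elapsed := time.Since(lastFailure)
        if elapsed < 0 {
            elapsed = 0
        }
        remaining := math.Pow(0.5, float64(elapsed)/float64(halfLife))
        return apriori - (apriori-penalized)*remaining
    }

    func main() {
        // Failed 2h ago with a 1h half-life: 3/4 of the penalty is gone.
        lastFail := time.Now().Add(-2 * time.Hour)
        fmt.Printf("%.3f\n", decayedProb(0.6, 0.05, lastFail, time.Hour))
    }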
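And for the divide-and-conquer splitting, a deliberately simplified
sketch; the attempt callback and the max-shard knob here are stand-ins,
not lnd's actual API:

    package main

    import (
        "errors"
        "fmt"
    )

    var errNoRoute = errors.New("no route found")

    // sendWithSplits tries to pay amt in one shot and, if no route
    // exists (or amt exceeds maxShardAmt), forks the problem into two
    // amt/2 attempts. attempt stands in for a real pathfinding+send
    // call; everything here is a simplified illustration.
    func sendWithSplits(amt, maxShardAmt int64, attempt func(int64) error) error {
        if amt <= maxShardAmt {
            if err := attempt(amt); err == nil {
                return nil
            } else if !errors.Is(err, errNoRoute) {
                return err
            }
        }
        if amt < 2 {
            return errNoRoute // can't split any further
        }
        half := amt / 2
        if err := sendWithSplits(half, maxShardAmt, attempt); err != nil {
            return err
        }
        return sendWithSplits(amt-half, maxShardAmt, attempt)
    }

    func main() {
        // Pretend routes only exist for shards of 250k sats or less.
        attempt := func(amt int64) error {
            if amt > 250_000 {
                return errNoRoute
            }
            fmt.Println("sent shard:", amt)
            return nil
        }
        _ = sendWithSplits(1_000_000, 400_000, attempt)
    }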
In the end, there's still so much of the design space that needs
exploring. Settling on something that appears to improve on things in
certain scenarios (so far we're going mainly off of experimental anecdotes
w/ unclear methods), advertising that everything is solved by magically
setting a particular value to zero, and morphing the protocol around it is
a premature declaration of "Mission Accomplished!". In any case, major
kudos to Rene and Stefan for examining the problem in a new light, and
re-invigorating research of related areas!

-- Laolu

[1]: https://github.com/ACINQ/eclair/pull/1427
[2]: https://github.com/lightningnetwork/lnd/blob/958119a12ab60d24c75c7681f344ceb5a450c4ad/routing/pathfind.go#L930

On Sun, Aug 15, 2021 at 5:07 PM Matt Corallo wrote:

> Thanks, AJ, for kicking off the thread.
>
> I'm frankly still very confused why we're having these conversations now.
> In one particular class of applicable routing algorithms you could use
> for lightning routing having a base fee makes the algorithm intractably
> slow, but:
>
> a) to my knowledge, no one has (yet) done any follow-on work to
> investigate pulling many of the same heuristics Rene et al use in a
> Dijkstras/A* algorithm with multiple passes or generating multiple routes
> in the same pass to see whether you can emulate the results in a faster
> algorithm without the drawbacks here,
>
> b) to my knowledge, no one has (yet) done any follow-on work to
> investigate mapping the base fee to other, more
> flow-based-routing-compatible numbers, eg you could convert the base fee
> to a minimum fee by increasing the "effective" proportional fees. From
> what others have commented, this may largely "solve" the issue.
>
> c) to my knowledge, no one has (yet) done any follow-on work to analyze
> where the proposed algorithm may be most optimal in the
> HTLC-value<->channel liquidity ratio ranges. We may find that the
> proposed algorithm only provides materially better routing when the HTLC
> value approaches X% of common network channel liquidity, allowing us to
> only use it for large-value payments where we can almost ignore the base
> fees entirely.
>
> There's real cost to distorting the fee structures on the network away
> from the costs of node operators, especially as we move towards requiring
> and using Real (tm) amounts of capital on routing nodes. If we're relying
> purely on hobbyists forever who are operating out of goodwill, we should
> just remove all fees. If we think Lightning is going to involve capital
> with real opportunity cost, matching fees to the costs is important, or
> at least important enough that we shouldn't throw it away after one
> (pretty great) paper and limited further analysis.
>
> Imagine we find some great way to address HTLC slot flooding/DoS attacks
> (or just chose to do it in a not-great way) by charging for HTLC slot
> usage, now we can't fix a critical DoS issue because the routing
> algorithms we deployed can't handle the new costing. Instead, we should
> investigate how we can apply the ideas here with the more complicated fee
> structures we have.
>
> Color me an optimist, but I'm quite confident with sufficient elbow
> grease and heuristics we can get 95% of the way there. We can and should
> revisit these conversations if such exploration is done and we find that
> its not possible, but until then this all feels incredibly premature.
>
> Matt
>
> On 8/14/21 21:00, Anthony Towns wrote:
> > Hey *,
> >
> > There's been discussions on twitter and elsewhere advocating for
> > setting the BOLT#7 fee_base_msat value [0] to zero. I'm just writing
> > this to summarise my understanding in a place that's able to easily be
> > referenced later.
> >
> > Setting the base fee to zero has a couple of benefits:
> >
> >  - it means you only have one value to optimise when trying to collect
> >    the most fees, and one-dimensional optimisation problems are
> >    obviously easier to write code for than two-dimensional optimisation
> >    problems
> >
> >  - when finding a route, if all the fees on all the channels are
> >    proportional only, you'll never have to worry about paying more fees
> >    just as a result of splitting a payment; that makes routing easier
> >    (see [1])
> >
> > So what's the cost? The cost is that there's no longer a fixed minimum
> > fee -- so if you try sending a 1sat payment you'll pay 0.1% of the fee
> > to send a 1000sat payment, and there may be fixed costs that you have
> > in routing payments that you'd like to be compensated for (eg, the
> > computational work to update channel state, the bandwith to forward the
> > tx, or the opportunity cost for not being able to accept another htlc if
> > you've hit your max htlcs per channel limit).
> >
> > But there's no need to explicitly separate those costs the way we do
> > now; instead of charging 1sat base fee and 0.02% proportional fee,
> > you can instead just set the 0.02% proportional fee and have a minimum
> > payment size of 5000 sats (htlc_minimum_msat=5e6, ~$2), since 0.02%
> > of that is 1sat. Nobody will be asking you to route without offering a
> > fee of at least 1sat, but all the optimisation steps are easier.
> >
> > You could go a step further, and have the node side accept smaller
> > payments despite the htlc minimum setting: eg, accept a 3000 sat payment
> > provided it pays the same fee that a 5000 sat payment would have. That
> > is, treat the setting as minimum_fee=1sat, rather than
> > minimum_amount=5000sat; so the advertised value is just calculated from
> > the real settings, and that nodes that want to send very small values
> > despite having to pay high rates can just invert the calculation.
> >
> > I think something like this approach also makes sense when your channel
> > becomes overloaded; eg if you have x HTLC slots available, and y channel
> > capacity available, setting a minimum payment size of something like
> > y/2/x**2 allows you to accept small payments (good for the network)
> > when you're channel is not busy, but reserves the last slots for larger
> > payments so that you don't end up missing out on profits because you
> > ran out of capacity due to low value spam.
> >
> > Two other aspects related to this:
> >
> > At present, I think all the fixed costs are also incurred even when
> > a htlc fails, so until we have some way of charging failing txs for
> > incurring those costs, it seems a bit backwards to penalise successful
> > txs who at least pay a proportional fee for the same thing. Until we've
> > got a way of handling that, having zero base fee seems at least fair.
> >
> > Lower value HTLCs don't need to be included in the commitment transaction
> > (if they're below the dust level, they definitely shouldn't be included,
> > and if they're less than 1sat they can't be included), and as such don't
> > incur all the same fixed costs that HTLCs that are committed too do.
> > Having different base fees for microtransactions that incur fewer costs
> > would be annoying; so having that be "amortised" into the proportional
> > fee might help there too.
> >
> > I think eltoo can help in two ways by reducing the fixed costs: you no
> > longer need to keep HTLC information around permanently, and if you do
> > a multilevel channel factory setup, you can probably remove the ~400
> > HTLCs per channel at any one time limit. But there's still other fixed
> > costs, so I think that would just lower the fixed costs, not remove them
> > altogether and isn't a fundamental change.
> >
> > I think the fixed costs for forwarding a HTLC are very small; something
> > like:
> >
> >    0.02sats -- cost of permanently storing the HTLC info
> >                (100 bytes, $500/TB/year, 1% discount rate)
> >    0.04sats -- compute and bandwidth cost for updating an HTLC
> >                ($40/month at linode, 1 second of compute)
> >
> > The opportunity cost of having HTLC slots or Bitcoin locked up until
> > the HTLC succeeds/fails could be much more significant, though.
> >
> > Cheers,
> > aj
> >
> > [0] https://github.com/lightningnetwork/lightning-rfc/blob/master/07-routing-gossip.md#the-channel_update-message
> > [1] https://basefee.ln.rene-pickhardt.de/
> >
> > _______________________________________________
> > Lightning-dev mailing list
> > Lightning-dev at lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
>
> _______________________________________________
> Lightning-dev mailing list
> Lightning-dev at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev