From ZmnSCPxj at protonmail.com  Mon Jan 20 01:21:46 2020
From: ZmnSCPxj at protonmail.com (ZmnSCPxj)
Date: Mon, 20 Jan 2020 01:21:46 +0000
Subject: [Lightning-dev] On Path Privacy
In-Reply-To: <tPmu0vrofH-IjJj6swN_ADX3BNyrC6_07gxxeGNYEhqQVwTTUxlNgd53MRFwklBTjzNlCL1jieTbHxSNxCEmO3Rv2crhqS9cRFYRxz_5Jag=@protonmail.com>
References: <pmZF6PMBHmqzMbp93eA_22_Ky11jQkv0i79Vxku-JuRdFXh8_uRK0EfTegOkoqqGsWELW5fEnoMwg1pN6JIkRIiwJIWfOS0CSvyF1aC5N78=@protonmail.com>
 <tPmu0vrofH-IjJj6swN_ADX3BNyrC6_07gxxeGNYEhqQVwTTUxlNgd53MRFwklBTjzNlCL1jieTbHxSNxCEmO3Rv2crhqS9cRFYRxz_5Jag=@protonmail.com>
Message-ID: <awfYFw-SH5hxocjoDmiO35xBwp_VTEZHgA3kjwrwvoyduKGCWXN9SJK0m1W8ajDw8Tk61yzVix6vkZ7PcxDPSMW_EMz6SncFmOp8OS6eAgk=@protonmail.com>

Good morning list,

Few people have responded to this topic, but when has that ever stopped me from spamming the mailinglist?

Analysis of Path Extension for Privacy
======================================

As mentioned before, increasing the path length deliberately, by any means, intuitively makes it seem that privacy is improved.
I will now show how this is not a panacea and that its cost in terms of increased fees, increased risk of stuckness, and increased worst-case stuckness might not justify its benefits, which are weaker (at least on the current network) than they might seem at first glance.

Let us first start with an example network:


    A -- B -- S1-- C -- D -- G -- H
         |  /    /   \  |  /      |
         | /    /     \ | /       |
         I -- J         E -- S2-- F

Let us suppose that `A` wishes to make a payment to `F`.
Further, let us assume for simplicity that all nodes charge the same amount for forwarding, and that they all charge 0.01 units base and 0 proportional.

Finally, let us also assume that `S1` and `S2` are two nodes run by a single surveillor, and that all the other nodes are otherwise not interested in destroying Lightning privacy.
Now, obviously `A` and `F` do not know that `S1` and `S2` are run run by a surveillor, thus cannot identify S1 and S2 as belonging to a single actor that wants to snoop their payment.


Now let us suppose that A takes the shortest path `A -> B -> S1 -> C -> E -> S2 -> F`.
Would it be improved to artifically increase the path length?

We can observe that, with the current network, due to the same hash being used in the entire route, `S1` and `S2` can easily notice when they are on the same route.

Thus, suppose we increased the path length by taking this route instead: `A -> B -> S1 -> I -> J -> C -> D -> G -> E -> S2 -> F`.
Then our surveillor gets exactly the same information as in the case `A -> B -> S1 -> C -> E -> S2 -> F`.

* An incoming payment went into S1 via B, so the payer must be A or B (taking the shortest-path heuristic pre-S1).
* An outgoing payment went out of S2 via F, so the payer must be F (taking the shortest-path heuristic post-S2).

Thus, in many cases, it is immaterial if we actually inserted greater length onto the path.
We can generally expect that surveillors can easily just buy a bunch of BTC and insert many nodes into the network to act as surveillance nodes, especially since forwarding pays fees and thus the requirement to lock funds in channels is not in fact an opportunity cost.
Once a path passes through more than one surveillor nodes, any increase in length between the two endmost cooperating surveillor nodes does not improve privacy.
Thus `A` would end up paying the costs of increased path length (higher fees, higher risk of stuckness, highest worst-case stuckness) *without* any benefit to its privacy.

Indeed, we might point out that if `A` took the path `A -> B -> S1 -> C -> D -> G -> H -> F`, then it would have avoided `S2` and the single surveillor would have a much larger set of possible destinations to analyze.
But if it instead took the longer path `A -> B -> S1 -> I -> J -> C -> D -> G -> E -> S2 -> F`, then `S2` would have helped the surveillor pin down the destination that either A or B is paying to.

Which brings the next point: longer path also means increased chances of going through *two* surveillance nodes, and thus having the payment endpoints (ultimate sender, ultimate receiver) be identified by surveillor nodes.
The increased path length also means less reliable forwarding (more nodes can fail), meaning it is now more likely that `A` will have to find another route.
This increases the chances that `S1` and `S2` will be on *some* route.
And because the same hash will *also* be used on the alternate routing attempts, then if on one attempt we went `A -> B -> S1 -> C -> D -> G -> H -> F` and on the next attempt we went `A -> I -> J -> C -> E -> S2 -> F`, then both `S1` and `S2` can still triangulate the ultimate payer and ultimate payee as with getting the shortest path in the first place!

The intuition that "sub-optimal paths means better privacy" is countered by the intuition that "the more people you tell, the less private it is".
At least on the current network, the fact that we can identify a single payment (across multiple nodes within an attempt, and across multiple attempts trying to forward the same payment) means increased path length does not buy a lot of privacy, and the significant losses in useability might not be commensurate to the mild increase in privacy it *theoretically* could get.

Fortunately, the PTLC-based path decorrelation fixes most of these, and with that, at least it is hard for `S1` and `S2` to identify if forwards going through them are the same payment, at least from the payment hash.

Digression: `permuteroute` Privacy
----------------------------------

Previously, [I discussed the `permuteroute` algorithm](https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-August/002095.html), which is basically the Path Splicing technique applied to Lightning payments.
TLDR: When a payment fails at a particular node or channel along a route, we copy the path prefix before the break, the path postfix after the break, and make a route around the break, getting better time behavior since it is likely that the area around the break has alternate routes that are still relatively short (Dijkstra algorithm in particular has a massive increase in runtime for every additional node on the shortest path, so if the shortest path is very short, the runtime for the Dijkstra run to bridge the break would be much faster than if we redid it from the source to the destination).

An advantage of `permuteroute` is also the fact that *you do not tell more people about your payment*.
The prefix of path is reused rather than looking for a completely new path, and the nodes along that prefix *already* knows about the payment and its hash and its amount and its CLTV-delta, from the previous failing payment attempt.
Telling them approximately the same details *again* in the next payment attempt is not an *additional* privacy leak at least.
(in general a `permuteroute` will either increase or give the same path length as the previous attempted path, so the first attempt will give, via the unremovable data CLTV-delta and amount, the shortest distance from a surveillor node to the destination, and subsequent `permuteroute` attempts will tend to give a *longer* distance to the destination, which does not help the surveillor to *narrow down* the destination, thus does not increase privacy leaks to the nodes on the reused prefix.)

Thus, I expect that `permuteroute` will give, not only a good improvement in practical pathfinding speed, it would also give a mild practical improvement in privacy compared to using high-randomization pathfinding techniques like the Rusty Russell Random Route Ralgorithm for *every* payment attempt.

Privacy Non-Improvement Due To Practical Limits
-----------------------------------------------

There is a practical limit to the number of nodes you can stuff onto an artificially lengthened path.
Unfortunately, it means as well that privacy improvement at the destination end translates to privacy reduction at the source end.

For instance, it is likely that wallets would want to bound the outgoing CLTV-delta at the ultimate sender.
If this CLTV-delta were, for instance, to be a year away from the current blockheight, then at the worst-case, in case of a stuck payment, the funds used in a stuck payment would be locked for a year in an HTLC.
Thus, obviously wallets will be designed to impose some maximum CLTV-delta at the ultimate sender.
For example, C-Lightning limits this, by default, to 576 blocks.

Knowing this limit, if an arbitrary forwarding node were, for instance, to note that the incoming HTLC had a CLTV that was 548 blocks in the future, then it can guess that the ultimate sender is a node within 28 CTLV-delta blocks away from it.
Of course, such a large CLTV means that the outgoing HTLC probably has a similarly large CLTV relative to now, meaning that the destination is harder to find.
But in effect, any improvement in making the destination harder to find comes with a decay in the privacy of the source.

As well, the amount can give a lesser degree of this, if the forwarding node can guess some standard amount is the final payment at the destination; C-Lightning imposes a default `maxfeepercent` limit of 0.5%.

This works with only analyzing CLTV-delta (and maybe amount, but that is much less reliable), from a single forwarding node, thus path decorrelation does *not* help with this (path decorrelation helps when multiple, but not all, forwarding nodes are controlled by a single surveillor).
CLTV-delta and amount are unremoveable data and thus this analysis is always possible, so any improvement we can use to improve destination privacy is likely to come at the cost of the source privacy.

This may still be a good tradeoff: the onion protocol used makes identifying the source very difficult, whereas the destination is tightly constrained by the CLTV-delta that must be provided to every forwarding node, so sacrificing a little source privacy to get a little destination privacy may still be worth it, but it should as well be noted that payment information is much more valuable if information about *both* the payer *and* the payee is available.

Random Walk Loops
-----------------

Increasing the path length using a naive form of Rusty Russell Random Route Ralgorithm may lead to loops being formed in the proposed path, e.g. `A - > B -> S1 -> C -> E -> G -> D -> E -> S2 -> F`.
For the node `E`, which appears twice in that route, this is not much different from the case that it is running two separate nodes `S1` and `S2`: a node it controls appears more than once on the same route, thus leading to the ability to eliminate intermediate hops from consideration.
That is, the "detour" through `G` and `D` becomes immaterial to the analysis that `E` can perform on the route.
The end result is that we pay more for including `G` and `D`, but they do not increase our privacy against the node that happens to be the pivot on a loop.

Thus, we should use path length increasing algorithms that specifically avoid loops being formed.
Of course, this can now be used as a basis for a heuristic by which a forwarding node can eliminate some nodes from its analysis by simply assuming that loops do not occur.

With path decorrelation, identifying loops becomes difficult and they will have much lower privacy loss.

Analysis of Multipath and JIT Routing for Privacy
=================================================

Multipath gives an improvement to useability, allowing payments to be distributed across multiple paths rather than requiring that the entire payment be successfully routable as a single large payment.

I would point out as well that Just-In-Time Rebalancing, more commonly known as JIT Routing, gives you most of what multipath gives, but with much simpler software design.
(Nothing prevents Just-In-Time rebalancing from being done by the ultimate payer, so as long as the payer has *some* channel that can hold the entire payment, it would be perfectly fine to initially have the ultimate payer with its funds distributed across multiple channels, which is another thing that people have been thinking would require multipath to solve: Just-in-time rebalance works with this as well.)

I will now demonstrate that JIT Routing gives better privacy than multipath, at least with the current network.

With multipath, every part of the payment has the same hash.
Thus, suppose `A` splits a single payment into two parts, each taking paths `A -> B -> S1 -> C -> D -> G -> H -> F` and `A -> I -> J -> C -> E -> S2 -> F`.
Again, this gives `S1` and `S2` the same tight information, that it is likely the ultimate sender is either `A` or `B` and the ultimate destination is `F`.
`S1` and `S2` need only compare their CLTV-deltas (data that cannot be removed from the payment) to determine which one is likely to be nearer the source and which is likely to be nearer the destination, and thereby get better triangulation of both the ultimate sender and ultimate receiver.

In fact, it is worse than that.
Multiple cooperating surveillors that find themselves on different parts of the same payment can use the unremovable CLTV-delta information to create a Venn diagram of the possible destinations, then consider that the destination is the intersection of those nodes that are within the outgoing CLTV-delta of each of them.
(Similar information as well can be extracted from multiple attempts of a single payment.)

On the other hand, an important property of JIT Routing is that a rebalance attempt triggered by a payment has a different hash from the payment.
Thus, the other nodes that would be involved in the JIT Routing are never given the important detail of the hash of the actual payment, thus preventing them from easily being correlated.
Because there is only a single path for the "real" payment, we follow the principle "telling more people makes it less private": only the nodes specified by the ultimate payer in the payment onion are told of the *actual* payment details, all the other "supporting" nodes that facilitate the payment by allowing rebalances are not told about the payment details, they only get told about rebalances, which are much less interesting (it is literally just shuffling your own money around).
In particular, they are *not* told the CLTV-delta of the "main" payment path, which is the largest single-node privacy leak in Lightning (for multi-node surveillors, the hash is an even bigger privacy leak).


Of course, path decorrelation helps reduce the issues that multipath has.
CLTV-deltas are still an issue under multipath though, and the fact that rebalances triggered by a payment do not leak the actual CLTV-deltas of the original payment is still an important advantage that JIT Routing has over even decorrelated multipath.


Analysis of Path Decerrelation for Privacy
==========================================

As noted above, path decorrelation mitigates privacy issues with both artificially-lengthened paths, and multipath.

Of note is that these only close privacy leaks regarding hashes.
CLTV-delta in particular is still not fixed by path decorrelation.

However, path decorrelation does enable some important possible techniques.

Extended Shadow Routing
-----------------------

As noted before, artificially increasing the path length makes it harder for forwarding nodes to use the shortest-path heuristic to determine te ultimate payer and ultimate payee of a payment attempt.
Increasing the path length tends to increase the number of hop nodes that are traversed, increasing costs, risk of stuck payments, and lock time of stuck payments.

We can observe in particular that when considering stuckness, every additional hop node affects our useability twice:

* Every additional hop node is another hop node that could fail between forwarding and claiming.
* Every additional hop node demands additional CLTV-delta for that hop, increasing the total amount of time we will offer on the first outgoing HTLC.

Thus, very long routes tend to greatly reduce useability.

A way to mitigate this would be to use "extended shadow routing", where instead of *actually* taking a longer path, we *pretend* to take a longer path and overpay some forwarding nodes with more fees and CLTV-delta than they demand.

For example, as noted above, the shortest path from `A` to `F` is `A -> B -> S1 -> C -> E -> S2 -> F`.
Or more precisely, if `A` has to pay 1.0 unit to `F`, and all hop nodes have a 0.01 base fee and 0 proportional fee, the route is: `A ->1.05-> B ->1.04-> S1 ->1.03-> C ->1.02-> E ->1.01-> S2 ->1.00-> F`.
(I have elided the CLTV-delta requirements, but I am sure you can imagine the CLTV-delta requirements from this; indeed, CLTV-delta is a much better way to break privacy than the amount, so consider the example as a proxy for CLTV-delta being used to analyze the path.)

With path decorrelation, `S1` and `S2` can no longer rely on a single hash along the entire route.
Even so, they could still determine the shortest path between them, and thereby know the expected difference in amount and CLTV-delta from the outgoing of `S1` to the incoming of `S2`.
>From this, they can determine with high probability that forwards of similar value and sidereal timing are actually belonging to the same single payment.

Now suppose instead `A` took the sub-optimal path `A ->1.07-> B ->1.06-> S1 ->1.05-> C ->1.04-> D ->1.03-> G ->1.02-> E ->1.01-> S2 ->1.00-> F`.
Now `S1` and `S2` can no longer use the shortest path between them to determine that this is a single payment --- the outgoing of `S1` minus the incoming of `S2` is no longer the cost of the direct path between them, so they cannot assign high probability that the two forwards they see are hops of the same payment.
But for `A`, this comes at the cost of higher fees, higher CLTV-delta, and higher risk of payment attempt failure --- now any failures at `D` and `G` will affect the payment experience.

To mitigate this, instead `A` can take this sub-optimal path that has fewer nodes but overpays the fees and CLTV-delta of `C`: `A ->1.07-> B ->1.06-> S1 ->1.05-> C ->1.02-> E ->1.01-> S2 ->1.00-> F`.
This is extended shadow routing, where not only do we overpay at the destination, we also randomly overpay intermediate hops.
We still use the shortest path in terms of hops, but the random overpaying we do at intermediate hops makes it harder for other intermediate hops to determine if the forwards belong to the same payment or not.

(Again, just to be clear: we also adjust the CLTV-delta correspondingly to overpay the CLTV-delta demanded by `C`.)

Standardized Multipart Splitting Amounts
----------------------------------------

Of course, if the only payment going around with an amount of 1.0 units is the payment between `A` and `F`, and all the other payments do not, it is still possible, even with path decorrelation, for `S1` and `S2` to correlate the forwards by looking at the payment amounts.

Fortunately, if we are able to fix the triangulation issues with multipath by implementing path decorrelation, we can subsequently use multipath to fix the amount-correlation issue.
(Nothing to be done about the CLTV-delta, though...)
This is done by specifying a set of standardized payment amounts, and forcing our algorithms to split with those amounts always, possibly overpaying the destination a little just to hide all the payment amounts of a multipath to standardized payment amounts.

For example, suppose we define standardized payment amounts 16, 8, 4, 2, 1, 0.5, 0.25, 0.125, 0.0625.
If we want to pay 1.3 units, we could split our payment into 1, 0.25, and 0.0625, leading to a total of 1.3125 units, for a 0.96% overpayment.

The advantage of this is that it strengthens increasing the path length (whether for real, or by "extended shadow routing" which just artificially increases the path length by overpaying intermediate hops).
As noted before, if the only payment going around with 1.0 units is that between `A` and `F`, then `S1` and `S2` can increase the probability they assign that the two forwards they see belong to the same payment, even if we increased the path length between them so they can no longer use the shortest-path heuristic.
But if we use standardized payment amounts, and 1.0 is one of the standard amounts, then `S1` and `S2` cannot be certain that the two forwards they individually see are the same payment --- they could be two unrelated standard-amount payments instead.
This reduces amount-based correlation.

In particular, with path decorrelation, multiple attempts of a part are now also harder to correlate --- it could be that those are unrelated attempts from a different payment.

Of course, this topic is likely to lead to a lot of debate.
For instance, powers of two might not be the best (maybe a Fibonacci sequence would be better to minimize overpayment, in much the same way they reduce unused space in `malloc` implementations), multipart receivers might not want to receive a multipart payment composed of payments below the dust threshold (because if they have to drop onchain, they may end up getting multiple incoming parts worth zilch each), more standard payment amounts just split the anonymity set (and what number of standard payment amounts is optimal?) and so on.
Since the benefit of standardized multipart splitting amounts lies in there *being* a standard, I suggest we instead just vote on *whether or not* to define such a standardized set of amounts, and if we do decide to go with this, just hold a lottery on who exactly will define the standardized amounts.
(instead replacing the long debates on what standardized payment amounts to use with long debates on how to do a trustless online-only lottery)

No matter what set of standardized amounts we define, it seems to me that there will be payments so small that splitting them up would be unnecessarily uneconomical, or the payment amount might be smaller than the smallest standardized amount.
We might accept the loss of privacy in those if we consider them to be small and not of much interest.

Amount Decorrelation By Self-Payment
------------------------------------

We already know that we can perform donations by using a self-payment, hiding the donation in the fee of the node we wish to pay.

With PTLCs, which are a requirement of path decorrelation anyway, we can increase the security of such circular payments (they can no longer be stolen by wormhole attacks), and also have a proof-of-payment arrive from the payee (by using the sum of the payee secret plus a secret known by the payer at the payee hop).
This also implements a form of stuckless payments.

If we use two standardized sets of payments with a small difference between them, it becomes possible as well to pay smaller amounts by paying using the *difference* between two standard payment amounts.
As well, the return path ends up misleading successful analysis, as it *inverts* the direction of who is the *actual* payer and who is the *actual* payee.
This could be used to mislead analysis by using a standard amount in the forward section of the path and then use a non-standard amount in the return path, allowing payment of tiny amounts that are impractical given the standard set, but misleading the *actual* amount and who is the *actual* payer and *actual* payee.

As a concrete example, suppose `A` wants to pay `I` a tiny 0.1 units, and does not care that the fees it ends up paying may be a significant fraction of that micropayment.
It could do: `A ->8.01-> B ->8.0-> I ->7.9-> S1 ->7.89-> B ->7.88-> A`.
`A` ends up paying 8.01 - 7.88 = 0.13, so it paid 0.1 to `I` and 0.03 in fees.
`S1` might end up being able to analyze the non-standard amount of `7.9-> S1 ->7.89` and determine that `I` pays `A` about 7.88 units, when in fact the actual payment is `A` paying `I` 0.1 units.

Against this technique, we should also remind of the principle "telling more people makes it less private", however.
As well, this increases the number of hop nodes that could fail the payment, meaning more payment attempts (and thus more people getting told of the payment).
More analysis on the use of circular payments when paying may be in order.

In any case, this would require explicit support in the BOLT spec as well, as we would need invoices to be payable with a direct payment, and with an indirect circular payment as well.

Regards,
ZmnSCPxj