From rusty at rustcorp.com.au  Wed Feb  3 00:55:37 2016
From: rusty at rustcorp.com.au (Rusty Russell)
Date: Wed, 03 Feb 2016 11:25:37 +1030
Subject: [Lightning-dev] Laundry list of inter-peer wire protocol changes
In-Reply-To: <1454435768.2011.28.camel@ultimatestunts.nl>
References: <87d1snvhyf.fsf@rustcorp.com.au>
	<1453923255.11915.36.camel@ultimatestunts.nl>
	<87mvrpru3e.fsf@rustcorp.com.au>
	<1454435768.2011.28.camel@ultimatestunts.nl>
Message-ID: <874mdqmx2u.fsf@rustcorp.com.au>

CJP <cjp at ultimatestunts.nl> writes:
>> > * Message confirmation: this is done manually (instead of relying on
>> > TCP), so that a node knows which messages were received / need to be
>> > re-transmitted, even after a crash + restart.
>> 
>> I think the protocol itself needs to be robust against retransmissions.
>> There's no way to know if the other side received your acknowledgement
>> before a crash, so you will always need to handle duplication on
>> re-establishment.
>
> Yes. Amiko Pay does that: It assigns a number to every message, and the
> receiving side can confirm "I've received up to this number".
> Not-yet-confirmed messages will be retransmitted, and the receiving
> sides will ignore duplicates (except it will send a confirmation again,
> in case the previous confirmation was lost).

Yes, I think packet numbers make sense.  I encoded the packet length and
order in a single value (i.e. a byte counter marking the end of this
packet), but it's too weird for too little gain IMHO.
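
To make the numbering scheme described above concrete, here is a minimal
sketch in Python.  It is not Amiko Pay's code or any proposed wire format:
the class and method names (Sender, Receiver, on_ack, retransmit) are
made up for illustration, and a real node would also have to persist this
state to disk to survive a crash + restart, which is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Hypothetical sender: numbers every message and keeps it until acked."""
    next_seq: int = 1
    unacked: dict = field(default_factory=dict)  # seq -> payload

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return (seq, payload)

    def on_ack(self, upto):
        # Cumulative confirmation: "I've received up to this number".
        for seq in [s for s in self.unacked if s <= upto]:
            del self.unacked[seq]

    def retransmit(self):
        # Everything not yet confirmed, in order, e.g. after a reconnect.
        return sorted(self.unacked.items())


@dataclass
class Receiver:
    """Hypothetical receiver: delivers in order and ignores duplicates."""
    received_upto: int = 0
    delivered: list = field(default_factory=list)

    def on_message(self, seq, payload):
        if seq == self.received_upto + 1:
            self.delivered.append(payload)
            self.received_upto = seq
        # A duplicate (seq <= received_upto) is not delivered again, but the
        # confirmation is still re-sent in case the previous one was lost.
        return self.received_upto
```

If the confirmation for message 1 is lost, the sender retransmits both 1
and 2 on re-establishment; the receiver delivers only 2 and confirms "up
to 2" again, so duplication is harmless.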

>> > * There is not only two-way communication between linked peers, but also
>> > between payer and payee. This is necessary for Amiko Pay's
>> > bi-directional routing, but also useful e.g. for transmitting meta-data
>> > that doesn't fit in a QR code. Amiko Pay transmits an arbitrary-contents
>> > "receipt" from payee to payer; in the future, this might be digitally
>> > signed by the payee, as a "proof of transfer of ownership" of
>> > non-cryptographic goods.
>> 
>> I agree.  There's room in the initial onion design for payer -> payee
>> messages, but we don't have a channel for responses.
>> 
>> I can't see an easy way to implement the payee --> payer comms reliably:
>> to be reliable it would have to be published on-chain in the commit tx.
>> (Which we could do by constructing HTLCs such that they require a blob
>> signed by the payee, but that's traceable ...).
>> 
>> Mats and Laolu wanted to add an arbitrary comms protocol layer, but I
>> think that's something we can defer.
>
> In Amiko Pay, payer <-> payee communication is done on a direct TCP
> stream between them. Note that this also reduces latency: once
> transaction locking reaches the payee, the payee knows (s)he's capable
> of claiming the money, and can tell the payer that the payment is
> completed. If reduced latency is in the interest of the payee, this is
> likely to happen.

I think for v1.0 of the protocol we'll be assuming such a channel for
simplicity; i.e. that the R hash and route are somehow known to the payer.

> On latency: what latency do you think is needed for different use cases,
> and what can we reach? Does this extra step really make a difference?
>
> My estimate is that we'll typically have 10 hops ("six degrees of
> separation" theory), and 100ms to transmit a message(*) over one hop.
...
> (*) Not counting sending the confirmation back: a node that receives a
> message can immediately forward a message on the next hop; message
> confirmation on the receiving side can occur in parallel.

That's a good point; you can offer the next hop and abort if the prior
hop fails to deliver a signature.  Nice, my estimates were double
yours...

> Without reserving, you need to traverse all hops once(**) (the locking
> operation) before payer(***)+payee know that the transaction has
> succeeded. Actual settlement on the channels happens afterwards, but is
> no longer critical for the latency as seen by payer+payee.
>
> With reserving, you need to traverse all hops three times(**), in the
> worst case that the meeting point is on one of the end points of the
> route: once for making the route and reserving funds, once for
> confirming that the route has been established and once for locking.
>
> So, instead of one second, a transaction might take three seconds. Is
> that a game changer? Maybe it is for e.g. public transport access gates,
> where passenger throughput is essential. But then, people could reduce
> latency a lot by having a direct channel with the public transport
> operator.

I worry that higher latency is a centralization pressure, and encourages
people to sacrifice privacy.  I don't know where the threshold is,
though, so currently I'm more nervous about complexity :)
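
For reference, the latency figures quoted above work out as follows.  This
is only a back-of-the-envelope sketch: the 10 hops and 100ms per hop are
CJP's estimates, the function name is made up, and the
round_trips_per_hop parameter is my own assumption about how waiting for
each confirmation would double the per-hop cost.

```python
HOPS = 10          # CJP's estimate of a typical route length
PER_HOP_MS = 100   # CJP's estimate for one message over one hop


def latency_ms(traversals, hops=HOPS, per_hop_ms=PER_HOP_MS,
               round_trips_per_hop=1):
    """Time until payer and payee know the outcome, in milliseconds.

    traversals: how many times the whole route must be walked.
    round_trips_per_hop: 1 if each node forwards immediately, 2 if it
    waits for the confirmation first (an assumption, see above).
    """
    return traversals * hops * per_hop_ms * round_trips_per_hop


without_reserving = latency_ms(1)   # locking only: 1000 ms
with_reserving = latency_ms(3)      # reserve + confirm + lock: 3000 ms
waiting_for_acks = latency_ms(1, round_trips_per_hop=2)  # 2000 ms
```

So the worst case with reserving is three seconds against one second
without, under these estimates.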

Cheers,
Rusty.