From tadge at lightning.network Mon Aug 15 15:18:24 2016 From: tadge at lightning.network (Tadge Dryja) Date: Mon, 15 Aug 2016 11:18:24 -0400 Subject: [Lightning-dev] Blinded channel observation In-Reply-To: <87twep3qra.fsf@rustcorp.com.au> References: <87a8gmpkde.fsf@rustcorp.com.au> <20160809192814.GA22477@lightning.network> <877fbpps8s.fsf@rustcorp.com.au> <20160809222938.GA25606@lightning.network> <87wpjpnzwd.fsf@rustcorp.com.au> <20160811041626.GA8114@lightning.network> <87d1leodg7.fsf@rustcorp.com.au> <20160812212034.GA29612@lightning.network> <87twep3qra.fsf@rustcorp.com.au> Message-ID: There's two approaches with encrypted vs non-encrypted: the non-encrypted design which I kindof like, is to make all the information given to the observer not mean anything on its own. With encrypted, you achieve the same result, but have some decryption key stuffed somewhere in the observed transaction to reveal meaningful data which identifies the channel, but is encrypted. Non-encrypted can be more efficient, because it's hard to squeeze down compact encrypted data (though see below for an attempt!). But most things in the channel states can be obfuscated such that even if you tell everything to the observer, they don't learn anything. (Even in the case where the observer is watching both sides of the channel, they shouldn't be able to match them... well other than timing, which is admittedly a very effective way to do it!) I skipped over HTLCs though because they didn't fit with this model. And they really don't -- unlike the updating pubkeys in the commit tx, HTLC's are passed though multiple nodes, so information about them can get to the observer pretty easily. So I think HTLCs would need to be in some kind of encrypted blob to send to the observer. I really like txid[0:16] as the truncated txid for the observer and txid[16:32] as the decryption key because it's simple and quite fast. This would allow constant-time lookups into the observer's database regarless of how many channels it's watching, which HMAC'ing the txid doesn't have. (You could hash txid[16:32] again for the decryption key if you want 32 bytes.) The non-HTLC data can be sent unencrypted -- it's pretty much just a signature and hash from the tree. If there is a new HTLC (or a few) added in that state, the node can elect to send that to the observer as well. I think the format can be something like: htlc #1 expiry (4 bytes) htlc #1 preimage (20 bytes) htlc #2 expiry (4 bytes) htlc #2 preimage (20 bytes) offset to previous blob (2 bytes) decrypt key for previous blob (16 bytes) having pointers to previous states can save a lot of space if HTLCs are added incrementally. The "blobs" can be kept in a separate data store indexed by state number, so it's quick to see that, e.g, state 471 also has an HTLC from state 465, which has HTLCs from state 442. This chained decryption may end up revealing more HTLCs than are needed (which are quick for the observer to detect and discard) but if the fraud has occurred then anonymity is gone anyway and it's no big deal if the observer learns a little more -- they already learned all the important stuff. I *think* 2 bytes is enough; it's not that an HTLC can't last more than 65K states, it's that an HTLC can't persist > 65K states with no other HTLCs being added during that period. A long-lived HTLC wouldn't be referenced directly; instead later states which still had it would point to a previous state that also pointed to it. It's a bit more work for the observer, who might end up with hundreds of extra preimages, but I think optimizing for space savings at the cost of CPU time when the fraud occurs is a good trade-off. (The fraud occurs almost never, while the state data transfers and storage happen always) This would also allow nodes to omit or include HTLCs to the observer as they see fit, which seems useful for micropayments which might outstrip the abilities of the observer. Also, yeah, padding (handwave) and timing are what make hiding the channel very tricky, especially for HTLCs. With non-HTLC updates, it can be hard to know when 2 nodes are updating a channel state, but with HTLCs there are more nodes in the mix with more points for data to leak out to the observer. That's another reason you might want to omit sending out some portion of HTLC recovery data. I will try coding some of this and see, because it seems to work in my head but that's no indication it'll work on the computer :) -Tadge -------------- next part -------------- An HTML attachment was scrubbed... URL: