From aj at erisian.com.au Tue Nov 17 21:14:36 2015 From: aj at erisian.com.au (Anthony Towns) Date: Wed, 18 Nov 2015 07:14:36 +1000 Subject: [Lightning-dev] Better privacy with SNARKs Message-ID: <20151117211436.GA17583@debian> Hi all, An obvious privacy limitation with lightning is that even with onion routing, differents hops can be associated as being part of the same transaction due to sharing a common R value. So if you see a HTLC from Alice to Bob, paying $5 to Bob on receipt of R where SHA(R)=12345..; and you see another HTLC from Carol to Dave, paying $4.95 to Bob on receipt of R under the same condition, SHA(R)=12345..., then you know it's part of the same transaction. If you could change R at each step in the route, this would go away, improving payment anonymity and making it harder to attack the system in general. That's hard, because as a forwarding node, if you receive a HTLC payable on R1, then to send a HTLC payable on R2, you need to be able to calculate R1 from R2 or you'll be out of pocket. But you also can't be able to calculate R1 *without* R2, or you could just rip off whoever's making the payment. And, of course you have to know SHA(R2) to forward the payment at all. And if you only know SHA(R1) and SHA(R2) it's hard to say anything at all about R1 and R2 because cryptographic hash functions are designed to make any structural relationships go away. BUT! I think the magic of SNARKs [0] lets you do this! With a SNARK, you can "prove" that you have some secrets (ie, R1 and R2) that satisfy some programmable condition (ie, SHA(R1)=H1 and SHA(R2)=H2 and R1=R2 XOR X), based on public inputs (H1, H2 and X), without revealing those secrets. I think that's pretty safe, because if you receive an HTLC asking for a preimage for H1, along with instructions in the onion saying ask Bob for a preimage for H2, and here's X and a proof, then either: - your forwarded HTLC will fail, and everything's fine - you'll receive R2, calculate R1=R2 XOR X and see SHA(R1)=H1 as expected, and everything's fine - you'll receive R2, calculate R1=R2 XOR X and see SHA(R1) != H1, which is only possible if the cryptography behind SNARKs are broken - you'll receive RX, such that H2=SHA(RX) but RX being too long or too short. If SNARKs aren't broken, this means that you know R2alt and someone else knows R2 that are different but hash to the same value, meaning SHA has been broken. It seems like there are research-level tools out there that actually make this practical to try out. I've had a go at implementing this using snarkfront [1]. Using it looks like: 1) initial setup of proof/verification keys $ ./test_lightning -m keygen > keygen.txt # global setup 2) generate a proof, using a 32 byte secret, and XOR key (64 hex digits) $ SECRET="the quick brown fox jumps lazily" $ XOR=$(echo "she sells sea shells" | sha256sum | head -c64) $ cat keygen.txt | ./test_lightning -m proof -s "$SECRET" -x "$XOR" > proof.txt m: proof. f: . b: . x: 5677d356ccd6ff29b296a697791d109a1703557ecbbcf52b4f38d4b680858912. F: 74686520717569636b2062726f776e20666f78206a756d7073206c617a696c79 B: 221fb676bda3964ad9b6c4e5166a7eba716c2d5ea1c9985b3c18b8d7faece56b #F: ae4d48f71fdb6f74149fab591e88f2cc07d4e696968def1aa7ca1e07096c5b85 #B: 166e4f9b8ec5895e870f0f0508327d73ba9ad9af4e9841599bafb1bf55c8a245 generate proof (6) .................................................. (5) .................................................. (4) .................................................. (3) .................................................. (2) .................................................. (1) .................................................. 3) Verify the proof: $ F=ae4d48f71fdb6f74149fab591e88f2cc07d4e696968def1aa7ca1e07096c5b85 $ B=166e4f9b8ec5895e870f0f0508327d73ba9ad9af4e9841599bafb1bf55c8a245 $ cat keygen.txt proof.txt | ./test_lightning -m verify -h "$F" -b "$B" -x "$XOR" m: verify. f: ae4d48f71fdb6f74149fab591e88f2cc07d4e696968def1aa7ca1e07096c5b85. b: 166e4f9b8ec5895e870f0f0508327d73ba9ad9af4e9841599bafb1bf55c8a245. x: 5677d356ccd6ff29b296a697791d109a1703557ecbbcf52b4f38d4b680858912. verify proof (6) (5) (4) (3) (2) (1) proof is verified 4) Verify it doesn't report a valid proof with different inputs: $ cat keygen.txt proof.txt | ./test_lightning -m verify -h "$B" -b "$F" -x "$XOR" m: verify. f: 166e4f9b8ec5895e870f0f0508327d73ba9ad9af4e9841599bafb1bf55c8a245. b: ae4d48f71fdb6f74149fab591e88f2cc07d4e696968def1aa7ca1e07096c5b85. x: 5677d356ccd6ff29b296a697791d109a1703557ecbbcf52b4f38d4b680858912. verify proof (6) (5) (4) (3) (2) proof is rejected Some results: * the proof/verification key data take up about 100MB -- in theory one set of this data can be used by everyone; the only catch is that everyone has to trust that nobody has kept the original random numbers used to generate it. * proof/verification key data takes about a minute to generate, and about 650MB of RAM. * the proof data itself (which would need to be sent to the node that's going to switch R's) is just 864 bytes; so it'd use up about 5 hops worth of onion routing at 192B per hop -- in a 4096 byte packet eg, you could have four hops, changing R each time; or you could have 9 hops, changing R only three times. * generating the proof data for a given R1,X pair takes about 10 seconds, and 260MB of RAM * verifying the proof is quick-ish -- it takes 0.5s on my laptop, and uses about 150MB of RAM. For comparison, that last point makes a SNARK verification 500+ times more expensive than an ECDH operation. If I got my maths right, you can translate 3c for a linode CPU-hour into 2.5 satoshi for a linode CPU-second (at $338/BTC), so you're probably looking at a minimum fee of a few satoshi per SNARK verification, but that's still pretty okay for transactions of 500 satoshi or more, ie anything more than about a fifth of a US cent. The 10s proof generation time is probably more of a limitation -- though you could generate them in advance easily enough and just store them until you need to use them, which would avoid lag being a problem at least. But even then it's still essentially adding up to 30c of additional costs to your transaction (ie 10s cpu time valued at up to 3c/s), which probably isn't worthwhile for transactions smaller than a dollar or two. A drawback is that you'd either (a) have to do all this on the merchant's side (not just sending SHA(R) to whoever wants to pay you, but sending SHA(R1), SHA(R2), SHA(R3), SHA(R4), X12, X23, X34, and three proofs, which would be pretty painful; or (b) you'd have to generate all the R secrets as a consumer, and you wouldn't get to use the fact that you know R as evidence that you paid the merchant. Anyway, it's obviously not ready for prime time today: SNARKs are still pretty new as a concept; I'm definitely not familiar enough with SNARK theory to be sure I'm not misusing the concept somehow; snarkfront may not have implemented the theory fully correctly; and I might not have captured everything I needed to in order for my "proof" to actually say what I want it to. So not a great idea to use this to protect real money today. But still, this seems like it's not all /that/ far from being practical, and if the crypto's not fundamentally broken, seems like it goes a long way to filling in the biggest privacy hole in lightning today [3]... Code is at https://github.com/ajtowns/snarkfront/ or more directly at: https://github.com/ajtowns/snarkfront/blob/lightning-sha/test_lightning.cpp Cheers, aj [0] https://tahoe-lafs.org/trac/tahoe-lafs/wiki/SNARKs [1] https://github.com/jancarlsson/snarkfront [2] This would also improve privacy/anonymity for other applications of HTLCs, such as atomic swaps across chains: 1 bitcoin, payable on R1 + Alice's sig or timeout + Bob's sig 100 litecoin, payable on R2 + Robert's sig or timeout + Ally's sig Alice and Bob communicate privately, agreeing to trade 1 BTC for 100 litecoin and revealing their aliases Robert and Ally; Alice generates R1, R2, and reveals SHA(R1), SHA(R2), R1^R2 and the SNARK proof and publishes the litecoin payment. Bob verifies the proof, and publishes the bitcoin payment. Alice claims the bitcoin payment, revealing R1; Bob calculates R2 and claims the litecoin payment. The swap can take place trustlessly because Bob knows the only way Alice can claim his bitcoin is by revealing enough info so he can claim the corresponding litecoin. But there isn't any on-chain information linking the two transactions, because R1 and R2 are independent (and could even be using different hash functions as well as different preimages). After the funds have been claimed, the private communication is also completely deniable, since anyone could generate R1^R2 and a corresponding SNARK proof just using the info on the blockchain.