Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id D98E7EC3 for ; Wed, 20 Apr 2016 16:32:29 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-wm0-f48.google.com (mail-wm0-f48.google.com [74.125.82.48]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 8E78E18D for ; Wed, 20 Apr 2016 16:32:28 +0000 (UTC) Received: by mail-wm0-f48.google.com with SMTP id v188so210901909wme.1 for ; Wed, 20 Apr 2016 09:32:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:subject:message-id:date:user-agent:mime-version :content-transfer-encoding; bh=s1tr5f0jwlObzVlPiER1PUFDYDXmgZ1uoCpLgDLo/dw=; b=mwRI3wkDqluoIOsl2Ve2UGgsTmOQz/ooxWK/K0TVnEZ5dJEb1LIUjnTbmnvwHF61QT xRegCD9+l11a98pnZAAIBv4cnyxh5ZULgAdxXPpn3x4EIJ7iVg7IEtwAFryA71EIEb+C SgV9d/KJ2W7vEuBz2GILfgiauzabK+0wuOnLKIcAYOfD0VIAWD3ImoO4Scq+oczTwbAy bO4g7e2KlO5QgiSFku8IpP2ksXj3tQGawztsW2jQ4fdjjW62TFff3tqvW6+6AjsQA9kL xD9Rro1jZJF12ON85o/fdXw1C/8JPTlTUBnV7Jrx2qU4blVbO9H4scAysAFhNH9dh4xR Ledg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:subject:message-id:date:user-agent :mime-version:content-transfer-encoding; bh=s1tr5f0jwlObzVlPiER1PUFDYDXmgZ1uoCpLgDLo/dw=; b=WAJ9sJ9YMAMV/lprOn9MiBbVH1/07UzTkJQEpOi+Q3GDmWaRZwYUyKhTy4SRryFTAR 752VNY7Y12ujUn+eHzkMXnRuEaIrVlVbrNdd2P3DQRAUocxnMkwmUBxcnKu3KBiJGbB2 bbyn4CEKeH9UNP11kHioIyCYUHFtTzxmdB0xu0x4k3rIEMtyBLitICc6qYCZlsvUycFZ TxXlrN5PKrxLGNcGA01VReoBHB7vqwLcYxDBzLBj54E0+os4uRbTdMVz8n/OQvUAHv1O Dg4hV7fV2ygsrZgTBel3Y77aGDMeNQMcwNLTWXPCUXXCpEL+K8V8GUQG/z0Z3isgAyFg LhZQ== X-Gm-Message-State: AOPr4FW/k7FOpmGKUn1xI7SqVrOXWRJ0jWY5gGjhxlBuo+87iRlpFl5/gVM0IpskLQLXUg== X-Received: by 10.28.142.5 with SMTP id q5mr9951090wmd.21.1461169947265; Wed, 20 Apr 2016 09:32:27 -0700 (PDT) Received: from [10.34.0.144] (nat-0-15.lam.cz. [80.92.242.254]) by smtp.googlemail.com with ESMTPSA id t4sm10461448wmf.8.2016.04.20.09.32.26 for (version=TLSv1/SSLv3 cipher=OTHER); Wed, 20 Apr 2016 09:32:26 -0700 (PDT) From: Jochen Hoenicke X-Enigmail-Draft-Status: N1110 To: bitcoin-dev@lists.linuxfoundation.org Message-ID: <5717AF19.1030102@gmail.com> Date: Wed, 20 Apr 2016 18:32:25 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org X-Mailman-Approved-At: Wed, 20 Apr 2016 17:08:40 +0000 Subject: [bitcoin-dev] Proposal to update BIP-32 X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Development Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Apr 2016 16:32:30 -0000 Hello Bitcoin Developers, I would like to make a proposal to update BIP-32 in a small way. TL;DR: BIP-32 is hard to use right (due to its requirement to skip addresses). This proposal suggests a modification such that the difficulty can be encapsulated in the library. #MOTIVATION: The current BIP-32 specifies that if for some node in the hierarchy the computed hash I_L is larger or equal to the prime or 0, then the node is invalid and should be skipped in the BIP-32 tree. This has several unfortunate consequences: - All callers of CKDpriv or CKDpub have to check for errors and handle them appropriately. This shifts the burden to the application developer instead of being able to handle it in the BIP-32 library. - It is not clear what to do if an intermediate node is missing. E.g. for the default wallet layout, if m/i_H/0 is missing should m/i_H/1 be used for external chain and m/i_H/2 for internal chain? This would make the wallet handling much more difficult. - It gets even worse with standards like BIP-44. If m/44' is missing should we use m/45' instead? If m/44'/0' is missing should we use m/44'/1' instead, using the same addresses as for testnet? One could also restart with a different seed in this case, but this wouldn't work if one later wants to support another BIP-43 proposal and still keep the same wallet. I think the first point alone is reason enough to change this. I am not aware of a BIP-32 application that handles errors like this correctly in all cases. It is also very hard to test, since it is infeasible to brute-force a BIP-32 key and a path where the node does not exists. This problem can be avoided by repeating the hashing with slightly different input data until a valid private key is found. This would be in the same spirit as RFC-6979. This way, the library will always return a valid node for all paths. Of course, in the case where the node is valid according to the current standard the behavior should be unchanged. I think the backward compatibility issues are minimal. The chance that this affects anyone is less than 10^-30. Even if it happens, it would only create some additional addresses (that are not seen if the user downgrades). The main reason for suggesting a change is that we want a similar method for different curves where a collision is much more likely. #QUESTIONS: What is the procedure to update the BIP? Is it still possible to change the existing BIP-32 even though it is marked as final? Or should I make a new BIP for this that obsoletes BIP-32? What algorithm is preferred? (bike-shedding) My suggestion: --- Change the last step of the private -> private derivation functions to: . In case parse(I_L) >= n or k_i = 0, the procedure is repeated at step 2 with I = HMAC-SHA512(Key = c_par, Data = 0x01 || I_R || ser32(i)) --- I think this suggestion is simple to implement (a bit harder to unit test) and the string to hash with HMAC-SHA512 always has the same length. I use I_R, since I_L is obviously not very random if I_L >= n. There is a minimal chance that it will lead to an infinite loop if I_R is the same in two consecutive iterations, but that has only a chance of 1 in 2^512 (if the algorithm is used for different curves that make I_L >= n more likely, the chance is still less than 1 in 2^256). In theory, this loop can be avoided by incrementing i in every iteration, but this would make an implementation error in the "hard to test" path of the program more likely. The other derivation functions should be updated in a similar matter. Also the derivation of the root node from the seed should be updated in a similar matter to avoid invalid seeds. If you followed until here, thanks for reading this long posting. Jochen