Received: from sog-mx-1.v43.ch3.sourceforge.com ([172.29.43.191]
    helo=mx.sourceforge.net)
    by sfs-ml-2.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
    (envelope-from ) id 1X7jrL-0005qe-57
    for bitcoin-development@lists.sourceforge.net;
    Thu, 17 Jul 2014 11:28:23 +0000
Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of m.gmane.org
    designates 80.91.229.3 as permitted sender) client-ip=80.91.229.3;
    envelope-from=gcbd-bitcoin-development@m.gmane.org;
    helo=plane.gmane.org;
Received: from plane.gmane.org ([80.91.229.3])
    by sog-mx-1.v43.ch3.sourceforge.com with esmtps (TLSv1:AES256-SHA:256)
    (Exim 4.76) id 1X7jrH-0002LW-8N
    for bitcoin-development@lists.sourceforge.net;
    Thu, 17 Jul 2014 11:28:23 +0000
Received: from list by plane.gmane.org with local (Exim 4.69)
    (envelope-from ) id 1X7jr8-0004Ce-Dq
    for bitcoin-development@lists.sourceforge.net;
    Thu, 17 Jul 2014 13:28:10 +0200
Received: from f052021167.adsl.alicedsl.de ([78.52.21.167])
    by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
    id 1AlnuQ-0007hv-00 for ; Thu, 17 Jul 2014 13:28:10 +0200
Received: from andreas by f052021167.adsl.alicedsl.de with local
    (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
    for ; Thu, 17 Jul 2014 13:28:10 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: bitcoin-development@lists.sourceforge.net
From: Andreas Schildbach
Date: Thu, 17 Jul 2014 13:27:57 +0200
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: f052021167.adsl.alicedsl.de
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101
    Thunderbird/24.6.0
In-Reply-To:
X-Enigmail-Version: 1.5.2
X-Spam-Score: -0.4 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
    See http://spamassassin.org/tag/ for more details.
    -1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
         sender-domain
    -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/,
         no trust [80.91.229.3 listed in list.dnswl.org]
    -0.0 SPF_HELO_PASS SPF: HELO matches SPF record
     1.1 DKIM_ADSP_ALL No valid author signature, domain signs all mail
    -0.0 SPF_PASS SPF: sender matches SPF record
    -0.0 RP_MATCHES_RCVD Envelope sender domain matches handover relay
         domain
X-Headers-End: 1X7jrH-0002LW-8N
Subject: Re: [Bitcoin-development] BIP 38 NFC normalisation issue
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 17 Jul 2014 11:28:23 -0000

Here is a good article that helped me understand what was going wrong:

http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html

Basically, Java is stuck at 16 bits per char for legacy reasons. They
admit that for a new language, they would probably use 32 (or 24?) bits
per char.

\u literals express UTF-16 code units, so each escape takes exactly four
hex digits (16 bits); any digits after that are read as ordinary
characters. I learned that for codepoint 0x010400, I could write
"\uD801\uDC00", which is the UTF-16 encoding (a surrogate pair) of that
codepoint.

Other languages have literals for codepoints. E.g. Python can use
u"\U00010400", and HTML has numeric character references such as
&#x10400;. Unfortunately, Java is missing such a construct (at least in
Java 6).


On 07/17/2014 12:59 PM, Mike Hearn wrote:
> Glad we got to the bottom of that. That's quite a nasty
> compiler/language bug I must say. Not even a warning. Still, Python
> crashes when trying to print the name of a null character. It wouldn't
> surprise me if there are other weird issues lurking. Would definitely
> sleep better with a more restricted character set.
>
> On 17 Jul 2014 00:04, "Andreas Schildbach" wrote:
>
> Please excuse me.
> I had a more thorough look at the original problem and
> found that the only problem with the original test case was that you
> cannot specify codepoints from the SMP using \u in Java. I always tried
> \u010400 but that doesn't work.
>
> Here is a fix for bitcoinj. The test now passes.
>
> https://github.com/bitcoinj/bitcoinj/pull/143
>
> We can (and probably should) still filter control chars, I'll
> have a look at that now again.
>
>
> On 07/16/2014 11:06 PM, Aaron Voisine wrote:
> > If I first remove \u0000, so the non-normalized passphrase is
> > "\u03D2\u0301\U00010400\U0001F4A9", and then NFC normalize it, it
> > becomes "\u03D3\U00010400\U0001F4A9"
> >
> > UTF-8 encoded this is: 0xcf93f0909080f09f92a9 (not the same as what
> > you got, Andreas!)
> >
> > Encoding private key:
> > 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
> > with this passphrase, I get a BIP38 key of:
> > 6PRW5o9FMb4hAYRQPmgcvVDTyDtr6R17VMXGLmvKjKVpGkYhBJ4uYuR9wZ
> >
> > I recommend that, rather than simply removing control characters from
> > the password, the spec instead require that passwords containing
> > control characters are invalid. We don't want people trying to be
> > clever and putting them in thinking they are adding to the password
> > entropy.
> >
> > For UI compatibility across many platforms, I'm also in favor of
> > disallowing any character below U+0020 (space).
> >
> > I can submit a PR once we figure out why Andreas's passphrase was
> > different than what I got.
> >
> > Aaron Voisine
> > breadwallet.com
> >
> >
> > On Wed, Jul 16, 2014 at 4:04 AM, Andreas Schildbach wrote:
> >> Damn, I just realized that I implement only the decoding side of
> >> BIP38. So I cannot propose a complete test vector.
> >> Here is what I have:
> >>
> >>
> >> Passphrase: ϓ␀𐐀💩 (\u03D2\u0301\u0000\U00010400\U0001F4A9; GREEK
> >> UPSILON WITH HOOK, COMBINING ACUTE ACCENT, NULL, DESERET CAPITAL LETTER
> >> LONG I, PILE OF POO)
> >>
> >> Passphrase bytes after removing ISO control characters and NFC
> >> normalization: 0xcf933034303066346139
> >>
> >> Bitcoin Address: 16ktGzmfrurhbhi6JGqsMWf7TyqK9HNAeF
> >>
> >> Unencrypted private key (WIF):
> >> 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
> >>
> >>
> >> Can someone calculate the encrypted key from it (using whatever
> >> implementation) and I will verify it decodes properly in bitcoinj?
> >>
> >>
> >>
> >> On 07/16/2014 12:46 PM, Andreas Schildbach wrote:
> >>> I will change the bitcoinj implementation and propose a new test
> >>> vector.
> >>>
> >>>
> >>>
> >>> On 07/16/2014 11:29 AM, Mike Hearn wrote:
> >>>> Yes sorry, you're right, the issue starts with the null code point.
> >>>> Python seems to have problems starting there too. It might work if we
> >>>> took that out.
> >>>>
> >>>>
> >>>> On Wed, Jul 16, 2014 at 11:17 AM, Andreas Schildbach wrote:
> >>>>
> >>>> Guys, you are always talking about the Unicode astral plane, but in
> >>>> fact it's a plain old (ASCII) control character where this problem
> >>>> starts and likely ends: \u0000.
> >>>>
> >>>> Let's ban/filter ISO control characters and be done with it. Most
> >>>> control characters will never be enterable by any keyboard into a
> >>>> password field. Of course I assume that Character.isISOControl()
> >>>> works consistently across platforms.
> >>>>
> >>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isISOControl%28char%29
> >>>>
> >>>>
> >>>> On 07/16/2014 12:23 AM, Aaron Voisine wrote:
> >>>> > If the user creates a password on an iOS device with an astral
> >>>> > character and then can't enter that password on a JVM wallet, that
> >>>> > sucks.
> >>>> > If JVMs really can't support unicode NFC then that's a strong
> >>>> > case to limit the spec to the subset of unicode that all popular
> >>>> > platforms can support, but it sounds like it might just be a JVM
> >>>> > string library bug that could hopefully be reported and fixed. I get
> >>>> > the same result as in the test case using apple's
> >>>> > CFStringNormalize(passphrase, kCFStringNormalizationFormC);
> >>>> >
> >>>> > Aaron Voisine
> >>>> > breadwallet.com
> >>>> >
> >>>> >
> >>>> > On Tue, Jul 15, 2014 at 11:20 AM, Mike Hearn wrote:
> >>>> >> Yes, we know, Andreas' code is indeed doing normalisation.
> >>>> >>
> >>>> >> However it appears the output bytes end up being different. What
> >>>> >> I get back is:
> >>>> >>
> >>>> >> cf930001303430300166346139
> >>>> >>
> >>>> >> vs
> >>>> >>
> >>>> >> cf9300f0909080f09f92a9
> >>>> >>
> >>>> >> from the spec.
> >>>> >>
> >>>> >> I'm not sure why. It appears this is due to the character from
> >>>> >> the astral planes. Java is old and uses 16 bit characters
> >>>> >> internally - it wouldn't surprise me if there's some weirdness
> >>>> >> that means it doesn't/won't support this kind of thing.
> >>>> >>
> >>>> >> I recommend instead that any implementation that wishes to be
> >>>> >> compatible with JVM based wallets (I suspect Android is the same)
> >>>> >> just refuse any passphrase that includes characters outside the
> >>>> >> BMP. At least unless someone can find a fix. I somehow doubt this
> >>>> >> will really hurt anyone.
> >>>> >>
> >>>> >>
> >>>> >> ------------------------------------------------------------------------------
> >>>> >> Want fast and easy access to all the code in your enterprise? Index and
> >>>> >> search up to 200,000 lines of code with a free copy of Black Duck
> >>>> >> Code Sight - the same software that powers the world's largest code
> >>>> >> search on Ohloh, the Black Duck Open Hub!
> >>>> >> Try it now.
> >>>> >> http://p.sf.net/sfu/bds
> >>>> >> _______________________________________________
> >>>> >> Bitcoin-development mailing list
> >>>> >> Bitcoin-development@lists.sourceforge.net
> >>>> >> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
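
For anyone who wants to reproduce the byte strings quoted in this thread, here is a minimal Python 3 sketch. It is only an illustration and not bitcoinj code. The helper names (is_iso_control, passphrase_bytes, utf16_escapes) are made up for this example, and is_iso_control assumes the ranges documented for Java's Character.isISOControl() (U+0000 to U+001F and U+007F to U+009F). The "misparsed" string is an assumption about what a four-digit \u escape followed by extra digits ends up denoting; it happens to reproduce the bytes reported above.

```python
import unicodedata


def is_iso_control(ch):
    """Hypothetical helper: same ranges as Java's Character.isISOControl()."""
    cp = ord(ch)
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def passphrase_bytes(passphrase, strip_controls=True):
    """Optionally drop ISO control characters, then NFC-normalise and UTF-8 encode."""
    if strip_controls:
        passphrase = "".join(ch for ch in passphrase if not is_iso_control(ch))
    return unicodedata.normalize("NFC", passphrase).encode("utf-8")


def utf16_escapes(text):
    """Spell out the UTF-16 code units of text as 16-bit escapes."""
    raw = text.encode("utf-16-be")
    return "".join("\\u%04X" % int.from_bytes(raw[i:i + 2], "big")
                   for i in range(0, len(raw), 2))


# Intended test passphrase, written with full codepoint escapes.
intended = "\u03D2\u0301\u0000\U00010400\U0001F4A9"

# Assumption: a 4-digit escape followed by stray digits denotes
# U+0001 plus the literal text "0400" / "f4a9".
misparsed = "\u03D2\u0301\u0000" + "\u0001" + "0400" + "\u0001" + "f4a9"

print(passphrase_bytes(intended).hex())                         # cf93f0909080f09f92a9
print(passphrase_bytes(misparsed).hex())                        # cf933034303066346139
print(passphrase_bytes(misparsed, strip_controls=False).hex())  # cf930001303430300166346139
print(utf16_escapes("\U00010400"))                              # \uD801\uDC00
```

The first line of output matches Aaron's UTF-8 bytes, the second and third match the bytes Andreas and Mike reported, and the last line is the surrogate pair that has to be written out by hand in a Java string literal.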