|
Received: from sog-mx-1.v43.ch3.sourceforge.com ([172.29.43.191]
helo=mx.sourceforge.net)
by sfs-ml-4.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
(envelope-from <boydb@midnightdesign.ws>) id 1X76oV-0003Pl-R4
for bitcoin-development@lists.sourceforge.net;
Tue, 15 Jul 2014 17:46:51 +0000
Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of
midnightdesign.ws designates 50.87.144.70 as permitted sender)
client-ip=50.87.144.70; envelope-from=boydb@midnightdesign.ws;
helo=gator3054.hostgator.com;
Received: from gator3054.hostgator.com ([50.87.144.70])
by sog-mx-1.v43.ch3.sourceforge.com with esmtps (TLSv1:AES256-SHA:256)
(Exim 4.76) id 1X76oU-0008VH-2Y
for bitcoin-development@lists.sourceforge.net;
Tue, 15 Jul 2014 17:46:51 +0000
Received: from [209.85.215.43] (port=55194 helo=mail-la0-f43.google.com)
by gator3054.hostgator.com with esmtpsa (TLSv1:RC4-SHA:128)
(Exim 4.82) (envelope-from <boydb@midnightdesign.ws>)
id 1X74QN-0001Ut-II for bitcoin-development@lists.sourceforge.net;
Tue, 15 Jul 2014 10:13:47 -0500
Received: by mail-la0-f43.google.com with SMTP id hr17so3978118lab.30
for <bitcoin-development@lists.sourceforge.net>;
Tue, 15 Jul 2014 08:13:44 -0700 (PDT)
X-Gm-Message-State: ALoCoQnVMnSP3FuwMyDTrksCRTDqmMZc7K6pZjHUEg6TDH2EutbbE5Xj4uV8Cu5mrQobLfHhjyyU
MIME-Version: 1.0
X-Received: by 10.112.160.105 with SMTP id xj9mr18664292lbb.2.1405437224694;
Tue, 15 Jul 2014 08:13:44 -0700 (PDT)
Received: by 10.152.30.106 with HTTP; Tue, 15 Jul 2014 08:13:44 -0700 (PDT)
In-Reply-To: <CAObn+gfbH61kyv_ttT4vsQuNFRWLB5H3xaux7GQ0co82ucO_eA@mail.gmail.com>
References: <CANEZrP3ZzCBohXWZmZxE=ofP74Df4Hd-hCLH6jYn=JKbiqNQXA@mail.gmail.com>
<CAObn+gfbH61kyv_ttT4vsQuNFRWLB5H3xaux7GQ0co82ucO_eA@mail.gmail.com>
Date: Tue, 15 Jul 2014 10:13:44 -0500
Message-ID: <CANg-TZAe2PO9nwQktmDSJFtaLsg6hogOw6mj0SaROdJJr33vog@mail.gmail.com>
From: Brooks Boyd <boydb@midnightdesign.ws>
To: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Content-Type: multipart/alternative; boundary=001a11c2ab5c884c4204fe3cda91
X-AntiAbuse: This header was added to track abuse,
please include it with any abuse report
X-AntiAbuse: Primary Hostname - gator3054.hostgator.com
X-AntiAbuse: Original Domain - lists.sourceforge.net
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - midnightdesign.ws
X-BWhitelist: no
X-Source-IP: 209.85.215.43
X-Exim-ID: 1X74QN-0001Ut-II
X-Source:
X-Source-Args:
X-Source-Dir:
X-Source-Sender: (mail-la0-f43.google.com) [209.85.215.43]:55194
X-Source-Auth: midnight
X-Email-Count: 0
X-Source-Cap: bWlkbmlnaHQ7bWlkbmlnaHQ7Z2F0b3IzMDU0Lmhvc3RnYXRvci5jb20=
X-Spam-Score: -0.5 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
See http://spamassassin.org/tag/ for more details.
-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
sender-domain
-0.0 SPF_HELO_PASS SPF: HELO matches SPF record
-0.0 SPF_PASS SPF: sender matches SPF record
1.0 HTML_MESSAGE BODY: HTML included in message
X-Headers-End: 1X76oU-0008VH-2Y
Subject: Re: [Bitcoin-development] BIP 38 NFC normalisation issue
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Tue, 15 Jul 2014 17:46:52 -0000
--001a11c2ab5c884c4204fe3cda91
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
I helped add that test vector, and I think it's a good one because it is an
extreme edge case of the current definition. If the BIP38 proposal allows any
password that can be expressed in UTF-8, NFC-normalized form, then those
characters cover the edge cases (combining characters, the null character,
the astral range) that an implementation has to handle. If yours doesn't
handle them, it can't really be said to be "BIP38-compatible/compliant",
right?

The "passphrase" in the test vector is NOT in NFC form; that's the point.
Any implementation has to assume the input is not already NFC-normalized and
must normalize it before further processing. To test your implementation for
compliance, you should not feed in the NFC-normalized bytestring as the
password; you should enter the original passphrase. My original pull request
for this change (https://github.com/bitcoin/bips/pull/29) shows a Python and
a NodeJS way to input that test vector password as intended.

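As a rough illustration of that normalization step (a minimal sketch, not
the code from the pull request), the raw code points of Test 3 can be run
through Python's standard unicodedata module. The names RAW_PASSPHRASE and
normalize_passphrase are made up for this example.

```python
import unicodedata

# Test 3 passphrase from the quoted BIP38 text, written with escapes so an
# editor or terminal cannot silently re-normalize it.
# RAW_PASSPHRASE is a hypothetical name used only in this sketch.
RAW_PASSPHRASE = "\u03D2\u0301\u0000\U00010400\U0001F4A9"


def normalize_passphrase(passphrase: str) -> bytes:
    """NFC-normalize a passphrase, then return its UTF-8 bytes."""
    return unicodedata.normalize("NFC", passphrase).encode("utf-8")


if __name__ == "__main__":
    # Prints cf9300f0909080f09f92a9, the value given in the test vector note.
    print(normalize_passphrase(RAW_PASSPHRASE).hex())
```

The first two code points (U+03D2, U+0301) compose to U+03D3, which is why
the result starts with cf93; the null, U+10400 and U+1F4A9 pass through
unchanged.
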
Some input devices may already deliver input in NFC form, which is great,
but per the BIP38 proposal that shouldn't be assumed, so that the various
implementations stay cross-compatible. If one implementation assumes the
input is already NFC, it may encode/decode the password incorrectly and lock
a user out of their wallet. Android allows different user keyboards, so I'm
guessing there's one somewhere that allows manual entry of Unicode code
points and could be used to enter a null character. With the next version of
iOS, Apple devices will get custom keyboard options too, so even if the
default Apple keyboard produces NFC properly, other developers' keyboards
may not. So while this is an extreme edge case that is not very likely to be
used as a "real password" by any user, that's what test vectors are for: to
test for the edge case that you might not have expected and handled in your
implementation.

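To make the lock-out risk concrete, here is a small sketch using the more
ordinary "Zürich" example from the quoted message. It assumes two keyboards
that emit the same visible text in different Unicode forms; the helper
passphrase_bytes and both constants are hypothetical names for this example.

```python
import unicodedata

# The same visible passphrase as two hypothetical keyboards might emit it.
COMPOSED = "Z\u00fcrich"     # precomposed U+00FC (already NFC)
DECOMPOSED = "Zu\u0308rich"  # "u" followed by combining diaeresis U+0308


def passphrase_bytes(passphrase: str, normalize: bool = True) -> bytes:
    """Return the UTF-8 bytes that would be fed to key derivation."""
    if normalize:
        passphrase = unicodedata.normalize("NFC", passphrase)
    return passphrase.encode("utf-8")


if __name__ == "__main__":
    # Without normalization the byte strings differ, so the derived keys
    # would differ too; with NFC applied first they are identical.
    print(passphrase_bytes(COMPOSED, normalize=False).hex())
    print(passphrase_bytes(DECOMPOSED, normalize=False).hex())
    print(passphrase_bytes(COMPOSED) == passphrase_bytes(DECOMPOSED))
```

An implementation that skips the normalize step would treat these two
inputs as different passwords, which is the failure described above.
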
Brooks
On Tue, Jul 15, 2014 at 8:07 AM, Eric Winer <enwiner@gmail.com> wrote:
> I don't know for sure if the test vector is correct NFC form. But for
> what it's worth, the Pile of Poo character is pretty easily accessible on
> the iPhone and Android keyboards, and in this string it's already in NFC
> form (f09f92a9 in the test result). I've certainly seen it in usernames
> around the internet, and wouldn't be surprised to see it in passphrases
> entered on smartphones, especially if the author of a BIP38-compatible app
> includes a (possibly ill-advised) suggestion to have your passphrase
> "include special characters".
>
> I haven't seen the NULL character on any smartphone keyboards, though - I
> assume the iOS and Android developers had the foresight to know how much
> havoc that would wreak on systems assuming null-terminated strings. It
> seems unlikely that NULL would be in a real-world passphrase entered by a
> sane user.
>
>
> On Tue, Jul 15, 2014 at 8:03 AM, Mike Hearn <mike@plan99.net> wrote:
>
>> [+cc aaron]
>>
>> We recently added an implementation of BIP 38 (password protected private
>> keys) to bitcoinj. It came to my attention that the third test vector may
>> be broken. It gives a hex version of what the NFC normalised version of the
>> input string should be, but this does not match the results of the Java
>> unicode normaliser, and in fact I can't even get Python to print the names
>> of the characters past the embedded null. I'm curious where this normalised
>> version came from.
>>
>> Given that "pile of poo" is not a character I think any sane user would
>> put into a passphrase, I question the value of this test vector. NFC form
>> is intended to collapse things like umlaut control characters onto their
>> prior code point, but here we're feeding the algorithm what is basically
>> garbage so I'm not totally surprised that different implementations appear
>> to disagree on the outcome.
>>
>> Proposed action: we remove this test vector as it does not represent any
>> real world usage of the spec, or if we desperately need to verify NFC
>> normalisation I suggest using a different, more realistic test string, like
>> Zürich, or something written in Thai.
>>
>>
>>
>> Test 3:
>>
>> - Passphrase ϓ␀𐐀💩 (\u03D2\u0301\u0000\U00010400\U0001F4A9; GREEK
>> UPSILON WITH HOOK <http://codepoints.net/U+03D2>, COMBINING ACUTE
>> ACCENT <http://codepoints.net/U+0301>, NULL
>> <http://codepoints.net/U+0000>, DESERET CAPITAL LETTER LONG I
>> <http://codepoints.net/U+10400>, PILE OF POO
>> <http://codepoints.net/U+1F4A9>)
>> - Encrypted key:
>> 6PRW5o9FLp4gJDDVqJQKJFTpMvdsSGJxMYHtHaQBF3ooa8mwD69bapcDQn
>> - Bitcoin Address: 16ktGzmfrurhbhi6JGqsMWf7TyqK9HNAeF
>> - Unencrypted private key (WIF):
>> 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
>> - *Note:* The non-standard UTF-8 characters in this passphrase should
>> be NFC normalized to result in a passphrase of 0xcf9300f0909080f09f92a9
>> before further processing
>>
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Want fast and easy access to all the code in your enterprise? Index and
>> search up to 200,000 lines of code with a free copy of Black Duck
>> Code Sight - the same software that powers the world's largest code
>> search on Ohloh, the Black Duck Open Hub! Try it now.
>> http://p.sf.net/sfu/bds
>> _______________________________________________
>> Bitcoin-development mailing list
>> Bitcoin-development@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>>
>>
>
>
>
>
--001a11c2ab5c884c4204fe3cda91--
|