From: Jonathan Underwood
Date: Fri, 9 Nov 2018 14:17:30 +0900
To: somber.night@protonmail.com, bitcoin-dev@lists.linuxfoundation.org
Subject: Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support

> as it seems bad design to have to fix and maintain a wordlist for every language as the checksum depends on it.

From BIP39:

> The conversion of the mnemonic sentence to a binary seed is completely independent from generating the sentence. This results in rather simple code; there are no constraints on sentence structure and clients are free to implement their own wordlists or even whole sentence generators, allowing for flexibility in wordlists for typo detection or other purposes.
>
> Although using a mnemonic not generated by the algorithm described in "Generating the mnemonic" section is possible, this is not advised and software must compute a checksum for the mnemonic sentence using a wordlist and issue a warning if it is invalid.

So BIP39 states "no constraints on sentence structure and clients are free to implement their own wordlists or even whole sentence generators" and yet at the same time one paragraph later "this is not advised and software must compute a checksum for the mnemonic sentence using a wordlist and issue a warning if it is invalid"...

My interpretation of this:

1. The ChecksumCheck function attempts to (a) find the wordlist and (b) calculate the checksum.
2. If it fails to find the wordlist, return false.
3. If the checksum doesn't match, return false.
4. If ChecksumCheck returns false, "issue a warning" but do not block seed generation. "We couldn't check if your phrase is correct... you're on your own"
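
A rough Python sketch of that reading, for what it's worth. The names (`checksum_check`, `recover`, `to_mnemonic`, `TOY_WORDLIST`) are mine and not from the BIP or any library, and the toy wordlist is a hypothetical stand-in for a real 2048-word list. The checksum rule itself (the first ENT/32 bits of SHA-256 over the entropy) is the one from BIP39:

```python
import hashlib

# Hypothetical stand-in for a real 2048-word BIP39 list (illustration only).
TOY_WORDLIST = [f"w{i:04d}" for i in range(2048)]


def to_mnemonic(entropy, wordlist):
    """Encode entropy plus its SHA-256 checksum bits as 11-bit word indices."""
    cs_len = len(entropy) * 8 // 32
    bits = "".join(f"{b:08b}" for b in entropy)
    bits += f"{hashlib.sha256(entropy).digest()[0]:08b}"[:cs_len]
    return " ".join(wordlist[int(bits[i:i + 11], 2)] for i in range(0, len(bits), 11))


def checksum_check(mnemonic, wordlists):
    """Steps 1-3: find a matching wordlist, then verify the checksum."""
    words = mnemonic.split()
    if len(words) not in (12, 15, 18, 21, 24):
        return False
    for wordlist in wordlists:
        index = {w: i for i, w in enumerate(wordlist)}
        if any(w not in index for w in words):
            continue  # this wordlist does not cover the phrase, try the next
        bits = "".join(f"{index[w]:011b}" for w in words)
        cs_len = len(bits) // 33
        entropy = int(bits[:-cs_len], 2).to_bytes((len(bits) - cs_len) // 8, "big")
        expected = f"{hashlib.sha256(entropy).digest()[0]:08b}"[:cs_len]
        if bits[-cs_len:] == expected:
            return True
    return False  # no wordlist found, or the checksum did not match


def recover(mnemonic, wordlists):
    """Step 4: warn on a failed check, but never block seed generation."""
    warning = None
    if not checksum_check(mnemonic, wordlists):
        warning = "We couldn't check if your phrase is correct... you're on your own"
    return mnemonic, warning
```

The point is that `recover` always hands the phrase back; what to do with the warning is left to the app.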

How 99.99% of implementing apps interpret it (remember, userspace error handling is done not by the BIP39 library but by the app that uses it):

1. Run ChecksumCheck.
2. If false, hard fail: do not allow seed generation.

If more apps implemented the BIP39 spec to the letter, multiple languages would make sense, but since in reality no one follows the spec (or the spec is way too open to interpretation), expecting every app to load every language is unreasonable.

Electrum actually handles BIP39 recovery the way the BIP specifies. I can restore random strings if I want, and it warns me, and I can ignore it if I wish.
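
Restoring an arbitrary string is possible because the BIP39 seed step never consults a wordlist. A minimal sketch of that step as BIP39 describes it (PBKDF2-HMAC-SHA512, 2048 iterations, salt "mnemonic" plus an optional passphrase, NFKD normalization); the function name `mnemonic_to_seed` is my own, and this is not Electrum's code:

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic, passphrase=""):
    """Derive the 64-byte BIP39 seed from any string; no wordlist involved."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# Any text yields a seed, whether or not it passes a checksum.
seed = mnemonic_to_seed("any random string I like")
print(seed.hex())
```

The checksum only matters on the way in, as a typo detector; the hash itself will accept anything.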


Anywho. The BIP39 multi-language feature is crucial for non-English speakers, especially those from Asia. Maybe northern Europeans have no problem with English word spelling, but watching a normal Japanese person write down their English mnemonic is painful.

One letter at a time, worried they wrote it wrong... they still make mistakes... and lose money because of it.

Whereas users of Copay and other wallets that support the Japanese wordlist write down their seed easily, and I have never heard of a Japanese newbie complaining "but I'm writing it just how I have it written down" about their Japanese seed... only English.

Not trying to give anyone a hard time, just telling the facts: the lack of localized words for recovery phrases causes more money loss than supporting them does. (When push comes to shove, at the very least Electrum will always support their recovery, because it lets you hash anything.)

This is all anecdotal, of course. Just sharing my experience evangelizing in Japan.

Thanks,
Jon


On Thu, 8 Nov 2018 at 21:16, SomberNight via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> Do you specifically want to support changing the language of seed
> words, while keeping the bip32 root seed they generate unchanged?
> What is the usecase for this?
>
> You mention that BIP39 already supports a few different languages.
> While this is true, many (I would guess most!) wallets only
> support the English wordlist.
> There are doubts even from the authors of the BIP whether it was
> a good idea in the first place to support multiple languages [0].
> I don't find this surprising as it seems bad design to have to fix and
> maintain a wordlist for every language as the checksum depends on it.
> The supported wordlists are effectively a part of the specification,
> and every new list would just make that specification larger.
>
> If changing the language of seeds is not a requirement, then look
> into Electrum seeds. They are language/wordlist agnostic.
>
> Mnemonic Sentence => PBKDF2 => BIP-0032 Seed
>
> The bip32 seed is derived by hashing the normalized mnemonic, and the
> checksum is derived the same way but by using a different cheaper
> hash (single round of HMAC-SHA512; generation grinds until it matches
> a pattern) [1]. For example, "9dk" is a valid segwit electrum seed.
>
> [0]: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-January/015507.html
> [1]: http://docs.electrum.org/en/latest/seedphrase.html
>
> > Date: Wed, 7 Nov 2018 00:16:41 +0800
> > From: Weiji Guo weiji.g@gmail.com
> > Subject: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language
> > support
> >
> > Hello everyone,
> >
> > I just realized that BIP-0039 is language dependent. I was assuming the
> > other way till I looked closer. The way the seed is derived from a BIP-0039
> > entropy, as is shown below, depends on which language to generate the
> > mnemonic sentence:
> >
> > Entropy <=> Mnemonic Sentence => PBKDF2 => BIP-0032 Seed
> >
> > Therefore when a user choose a non-English mnemonic code he or she is stuck
> > with that language. Meanwhile only a few native languages are supported.
> >
> > SLIP-0039 does not solve this issue in a user friendly way by providing
> > only an English wordlist. That's understandable as it aims to provide SSS
> > capability. However those users who do not speak English or recognize
> > English words will suffer.
> >
> > What I am trying to bring to attention of the community is that, no matter
> > if we make a new version of BIP-0039, or a new BIP (with SSS support), or
> > to enhance SLIP-0039, we really need to address this language issue.
> >
> > Here are what I propose:
> >
> > 1. The mnemonic code should be only a representation of underlying entropy
> >    or (pre) master secret, seed, whatever. In this way, the same seed/secret
> >    could be displayed in English or in Chinese or other languages. Then there
> >    could be 3rd party conversion tools to support translations in case any
> >    wallet software or device does not support all specified languages. Now it
> >    looks like:
> >
> >    Mnemonic Sentence <=> Entropy => PBKDF2 => BIP-0032 Seed
> >
> >
> > 2. Given that only 8 languages are supported in BIP-0039, we should allow
> > the seed/secret to be represented in decimal numbers, each ranging from 0
> > to 2047. So those who cannot find a native language support yet having
> > difficulty coping words in other languages could choose to just use numbers.
> >
> > So far I don't have a preference how this should be implemented. I'd like
> > to hear from community first.

--
-----------------
Jonathan Underwood
ビットバンク社 (Bitbank), Chief Bitcoin Officer
-----------------

If you are sending an encrypted message, please use the public key below.

Fingerprint: 0xCE5EA9476DE7D3E45EBC3FDAD998682F3590FEA3