Date: Fri, 5 Jan 2018 13:58:37 +0000
From: nullius
To: bitcoin-dev@lists.linuxfoundation.org
Message-ID: <57f5fcd8644c6f6472cd6a91144a6152@nym.zone>
Subject: [bitcoin-dev] BIP 39: Add language identifier strings for wordlists

I propose and request as an enhancement that the BIP 39 wordlist set
should specify canonical native language strings to identify each
wordlist, as well as short ASCII language codes.  At present, the
languages are identified only by their names in English.

Strings properly vetted and recommended by native speakers should
facilitate language identification in user interface options or menus.
Specification of language identifier strings would also promote
interface consistency between implementations; this may be important if
a user creates a mnemonic in Implementation A, then restores a wallet
using that mnemonic in Implementation B.

As an independent implementer who does not know *all* these different
languages, I monkey-pasted language-native strings from a popular wiki
site.  I cannot guarantee that they are all accurate, sensible, or even
non-embarrassing.

https://github.com/nym-zone/easyseed/blob/1a6e48bbdac9366d9d5d1912dc062dfc3f0db2c6/easyseed.c#L99

```
LANG(english,             u8"English",  "en",    ascii_space ),
LANG(chinese_simplified,  u8"汉语",     "zh-CN", ascii_space ),
LANG(chinese_traditional, u8"漢語",     "zh-TW", ascii_space ),
LANG(french,              u8"Français", "fr",    ascii_space ),
LANG(italian,             u8"Italiano", "it",    ascii_space ),
LANG(japanese,            u8"日本語",   "ja",    u8"\u3000"  ),
LANG(korean,              u8"한국어",   "ko",    ascii_space ),
LANG(spanish,             u8"Español",  "es",    ascii_space )
```

Per the comment at #L85 of the quoted file, I also know that my short
identifiers for Chinese, “zh-CN” and “zh-TW”, are imprecise at best,
insofar as Hong Kong uses Traditional and overseas Chinese may use
either.  For differentiating the two Chinese writing variants, are there
any appropriate standardized or customary short ASCII language IDs
similar to ISO 3166-1 alpha-2 which are purely linguistic, and not tied
to present-day political boundaries?

My general suggestion is that the specification of appropriate strings
in bitcoin:bips/bip-0039/bip-0039-wordlists.md be made part of the
process for accepting new wordlists.  My specific request is that such
strings be ascertained for the wordlists already existing, preferably
from the persons involved in the original pull requests therefor.

Should this proposal be “concept ACKed” by appropriate parties, then I
may open a pull request suggesting an appropriate format for specifying
this information in the repository.  However, I must needs leave the
vetting of appropriate strings to native speakers or experts in the
respective languages.

Prior references:  The wordlist additions at PRs #92, #130 (Japanese);
#100 (Spanish); #114 (Chinese, both variants); #152 (French); #306
(Italian); #570 (Korean); #621 (Indonesian, *proposed*, open).
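
For readers without the easyseed source at hand, the table quoted above
might be sketched as a self-contained C lookup, roughly as follows.
This is only an illustration under stated assumptions: struct
wordlist_lang, its field names, and lang_by_code() are hypothetical
names, not part of BIP 39 or of easyseed.  The native strings are copied
from the quoted table and remain unvetted; a plain " " is assumed to
stand in for ascii_space; and "\xE3\x80\x80" is the UTF-8 encoding of
U+3000.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type; the field names are illustrative only. */
struct wordlist_lang {
    const char *name;      /* English name of the wordlist */
    const char *native;    /* UTF-8 native-language string (unvetted) */
    const char *code;      /* short ASCII language code */
    const char *separator; /* UTF-8 separator placed between words */
};

/* Entries mirror the quoted table; " " is an assumed ascii_space. */
static const struct wordlist_lang wordlist_langs[] = {
    { "english",             "English",  "en",    " " },
    { "chinese_simplified",  "汉语",     "zh-CN", " " },
    { "chinese_traditional", "漢語",     "zh-TW", " " },
    { "french",              "Français", "fr",    " " },
    { "italian",             "Italiano", "it",    " " },
    { "japanese",            "日本語",   "ja",    "\xE3\x80\x80" }, /* U+3000 */
    { "korean",              "한국어",   "ko",    " " },
    { "spanish",             "Español",  "es",    " " },
};

/* Return the entry whose short code matches exactly, or NULL if none. */
const struct wordlist_lang *lang_by_code(const char *code)
{
    size_t n = sizeof wordlist_langs / sizeof wordlist_langs[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(wordlist_langs[i].code, code) == 0)
            return &wordlist_langs[i];
    }
    return NULL;
}
```

A caller could then fetch lang_by_code("ja") and use its native field
for a menu label and its separator field when joining mnemonic words.
Keeping the code and the display string in one record is merely one
possible layout; the format eventually specified in the repository may
well differ.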