From: Sjors Provoost
Date: Fri, 5 Jan 2018 17:04:10 +0100
To: nullius, Bitcoin Protocol Discussion
Cc: arachnid@notdot.net
Subject: Re: [bitcoin-dev] BIP 39: Add language identifier strings for wordlists
In-Reply-To: <57f5fcd8644c6f6472cd6a91144a6152@nym.zone>

I’m not a fan of language-specific word lists within the current BIP-39 standard. Very few wallets support anything other than English, which can lead to vendor lock-in and long-term loss of funds if a rare non-English wallet disappears.

However, because people can memorize things better in their native tongue, supporting multiple languages seems quite useful.

I would prefer a new standard where words are mapped to integers rather than to a literal string. For each language a mapping from words to integers would be published.
In addition to that, there would be a mapping from original language words to matching (in terms of integer value, not meaning) English words that people can print on a sheet of A4 paper. This would allow them to enter a mnemonic into e.g. a hardware wallet that only supports English. Such lists are more likely to be around 100 years from now than some ancient piece of software.

This would not work with the current BIP-39 (duress) password, but this feature could be replaced by appending words (with or without a checksum for that addition).

A replacement for BIP-39 would be a good opportunity to produce a better English dictionary as Nic Johnson suggested a while ago:
• all words are 4-8 characters
• all 4-character prefixes are unique (very useful for hardware wallets)
• no two words have edit distance < 2

Wallets need to be able to distinguish between the old and new standard, so un-upgraded BIP 39 wallets should consider all new mnemonics invalid. At the same time, some new wallets may not wish to support BIP39. They shouldn't be burdened with storing the old word list.

A solution is to sort the new word list such that reused words appear first. When generating a mnemonic, at least one word unique to the new list must be present. A wallet only needs to know the index of the last BIP39 overlapping word. It rejects a proposed mnemonic if none of its words has a higher index than that.

For my above point and some related ideas, see: https://github.com/satoshilabs/slips/issues/103

Sjors

> On 5 Jan 2018, at 14:58, nullius via bitcoin-dev wrote:
>
> I propose and request as an enhancement that the BIP 39 wordlist set should specify canonical native language strings to identify each wordlist, as well as short ASCII language codes. At present, the languages are identified only by their names in English.
>
> Strings properly vetted and recommended by native speakers should facilitate language identification in user interface options or menus. Specification of language identifier strings would also promote interface consistency between implementations; this may be important if a user creates a mnemonic in Implementation A, then restores a wallet using that mnemonic in Implementation B.
>
> As an independent implementer who does not know *all* these different languages, I monkey-pasted language-native strings from a popular wiki site. I cannot guarantee that they are all accurate, sensible, or even non-embarrassing.
>
> https://github.com/nym-zone/easyseed/blob/1a6e48bbdac9366d9d5d1912dc062dfc3f0db2c6/easyseed.c#L99
> ```
> LANG(english,             u8"English",  "en",    ascii_space ),
> LANG(chinese_simplified,  u8"汉语",     "zh-CN", ascii_space ),
> LANG(chinese_traditional, u8"漢語",     "zh-TW", ascii_space ),
> LANG(french,              u8"Français", "fr",    ascii_space ),
> LANG(italian,             u8"Italiano", "it",    ascii_space ),
> LANG(japanese,            u8"日本語",   "ja",    u8"\u3000"  ),
> LANG(korean,              u8"한국어",   "ko",    ascii_space ),
> LANG(spanish,             u8"Español",  "es",    ascii_space )
> ```
>
> Per the comment at #L85 of the quoted file, I also know that my short identifiers for Chinese, “zh-CN” and “zh-TW”, are imprecise at best, insofar as Hong Kong uses Traditional, and overseas Chinese may use either. For differentiating the two Chinese writing variants, are there any appropriate standardized or customary short ASCII language IDs similar to ISO 3166-1 alpha-2 which are purely linguistic, and not fit to present-day political boundaries?
>
> My general suggestion is that the specification of appropriate strings in bitcoin:bips/bip-0039/bip-0039-wordlists.md be made part of the process for accepting new wordlists.
> My specific request is that such strings be ascertained for the wordlists already existing, preferably from the persons involved in the original pull requests therefor.
>
> Should this proposal be “concept ACKed” by appropriate parties, then I may open a pull request suggesting an appropriate format for specifying this information in the repository. However, I must needs leave the vetting of appropriate strings to native speakers or experts in the respective languages.
>
> Prior references: The wordlist additions at PRs #92, #130 (Japanese); #100 (Spanish); #114 (Chinese, both variants); #152 (French); #306 (Italian); #570 (Korean); #621 (Indonesian, *proposed*, open).
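
The integer mapping and the sorted word list rule described in the reply above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the two tiny word lists, the value of LAST_OVERLAP_INDEX and all function names are made up for the example and are not part of BIP-39 or of any published proposal. Checksums and seed derivation are left out entirely.

```python
# Toy data, NOT real BIP-39 or proposed word lists (assumption for illustration).
# The new English list is sorted so that words reused from the old list come first.
NEW_WORDLIST_EN = ["able", "acid", "baby", "zebra", "quartz", "walnut"]
LAST_OVERLAP_INDEX = 2  # hypothetical index of the last word shared with the old list

# Hypothetical Dutch list; entries match the English ones by index, not by meaning.
NEW_WORDLIST_NL = ["appel", "boom", "fiets", "kaas", "molen", "tulp"]


def to_indices(words, wordlist):
    """Map each word to its integer index; raise ValueError on an unknown word."""
    lookup = {word: i for i, word in enumerate(wordlist)}
    try:
        return [lookup[w] for w in words]
    except KeyError as exc:
        raise ValueError(f"unknown word: {exc.args[0]}") from None


def is_new_standard(indices, last_overlap_index=LAST_OVERLAP_INDEX):
    """Accept only if at least one word is unique to the new list."""
    return any(i > last_overlap_index for i in indices)


def translate(words, source, target):
    """Translate by integer value, as one would with the printed sheet."""
    return [target[i] for i in to_indices(words, source)]


dutch = ["appel", "kaas", "fiets"]
english = translate(dutch, NEW_WORDLIST_NL, NEW_WORDLIST_EN)
print(english)  # ['able', 'zebra', 'baby']
print(is_new_standard(to_indices(english, NEW_WORDLIST_EN)))           # True
print(is_new_standard(to_indices(["able", "acid"], NEW_WORDLIST_EN)))  # False
```

Note that the wallet in this sketch never needs the old word list, only the single integer LAST_OVERLAP_INDEX. A generator would have to retry (or otherwise force a high-index word) whenever is_new_standard comes back False.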
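
The three dictionary rules listed in the reply above (4-8 characters, unique 4-character prefixes, no two words with edit distance < 2) could be checked with something like the sketch below. It assumes that "edit distance" means Levenshtein distance, which the message does not specify. The function names are hypothetical, and the pairwise loop is quadratic, which should be acceptable for a list of a few thousand words but is not optimized.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance, computed row by row (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def check_wordlist(words):
    """Return a list of human-readable violations of the three rules."""
    problems = []
    # Rule 1: every word is 4-8 characters long.
    for w in words:
        if not 4 <= len(w) <= 8:
            problems.append(f"length: {w}")
    # Rule 2: every 4-character prefix is unique.
    seen = {}
    for w in words:
        prefix = w[:4]
        if prefix in seen:
            problems.append(f"prefix: {seen[prefix]} / {w}")
        else:
            seen[prefix] = w
    # Rule 3: no two words have edit distance < 2.
    for a, b in combinations(words, 2):
        if edit_distance(a, b) < 2:
            problems.append(f"distance: {a} / {b}")
    return problems


print(check_wordlist(["able", "acid", "zebra"]))  # []
print(check_wordlist(["table", "tablet"]))
# ['prefix: table / tablet', 'distance: table / tablet']
```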