From: slush
Date: Thu, 24 Oct 2013 15:26:32 +0200
To: Gregory Maxwell
Cc: Bitcoin Development
Subject: Re: [Bitcoin-development] BIP39 word list

I've just pushed an updated wordlist, filtered using the similar characters taken from the visual similarity matrix in Gregory's script below.

BIP39 now considers the following character pairs similar:

        similar = (
            ('a', 'c'), ('a', 'e'), ('a', 'o'),
            ('b', 'd'), ('b', 'h'), ('b', 'p'), ('b', 'q'), ('b', 'r'),
            ('c', 'e'), ('c', 'g'), ('c', 'n'), ('c', 'o'), ('c', 'q'), ('c', 'u'),
            ('d', 'g'), ('d', 'h'), ('d', 'o'), ('d', 'p'), ('d', 'q'),
            ('e', 'f'), ('e', 'o'),
            ('f', 'i'), ('f', 'j'), ('f', 'l'), ('f', 'p'), ('f', 't'),
            ('g', 'j'), ('g', 'o'), ('g', 'p'), ('g', 'q'), ('g', 'y'),
            ('h', 'k'), ('h', 'l'), ('h', 'm'), ('h', 'n'), ('h', 'r'),
            ('i', 'j'), ('i', 'l'), ('i', 't'), ('i', 'y'),
            ('j', 'l'), ('j', 'p'), ('j', 'q'), ('j', 'y'),
            ('k', 'x'),
            ('l', 't'),
            ('m', 'n'), ('m', 'w'),
            ('n', 'u'), ('n', 'z'),
            ('o', 'p'), ('o', 'q'), ('o', 'u'), ('o', 'v'),
            ('p', 'q'), ('p', 'r'),
            ('q', 'y'),
            ('s', 'z'),
            ('u', 'v'), ('u', 'w'), ('u', 'y'),
            ('v', 'w'), ('v', 'y')
        )
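
For illustration only, here is a minimal sketch of how a pair list like the one above could be used to drop confusable words. It is not the script that produced the pushed wordlist: the names `SIMILAR`, `is_confusable` and `filter_wordlist` are hypothetical, and the greedy "keep the first word seen" strategy is an assumption. The pairs are the same 63 as above, written compactly as two-letter strings.

```python
# The 63 similar-character pairs from the list above, as two-letter strings.
SIMILAR = frozenset(
    frozenset(pair)
    for pair in (
        "ac ae ao bd bh bp bq br ce cg cn co cq cu dg dh do dp dq ef eo "
        "fi fj fl fp ft gj go gp gq gy hk hl hm hn hr ij il it iy jl jp jq jy "
        "kx lt mn mw nu nz op oq ou ov pq pr qy sz uv uw uy vw vy"
    ).split()
)


def is_confusable(word1, word2):
    """True if the words have equal length, are not identical, and every
    position where they differ is a similar-character pair."""
    if len(word1) != len(word2) or word1 == word2:
        return False
    return all(
        a == b or frozenset((a, b)) in SIMILAR
        for a, b in zip(word1, word2)
    )


def filter_wordlist(words):
    """Greedy filter (an assumption, not the actual BIP39 procedure):
    keep a word only if it is not confusable with a word already kept."""
    kept = []
    for word in words:
        if not any(is_confusable(word, other) for other in kept):
            kept.append(word)
    return kept


if __name__ == "__main__":
    sample = ["car", "ear", "cat", "eat", "gear", "year", "value", "valve"]
    print(filter_wordlist(sample))
```

Run on the example words from jan's message, the sketch keeps the first word of each confusable pair and drops the second. A real filter would probably want a better rule for choosing which word of a pair to keep.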

Feel free to review and comment on the current wordlist, but I think we're slowly moving toward the final list.

slush


On Sat, Oct 19, 2013 at 1:58 AM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
some fairly old wordlist solver code of mine:

https://people.xiph.org/~greg/wordlist.visual.py

it has a 52x52 letter visual similarity matrix in it (along with a citation)

On Fri, Oct 18, 2013 at 4:52 PM, jan <jan.marecek@gmail.com> wrote:
>
> The words 'public', 'private' and 'secret' could be confusing when
> encoding public and private keys, e.g. a private key that begins with
> the word 'public'.
>
> I think avoiding words that could look similar when written down would
> be a good idea as well. I searched for words that differ only by the
> letters c & e, g & y, u & v and found the following:
>
> car ear
> cat eat
> gear year
> value valve
>
> Other combinations could potentially be problematic depending on the
> handwriting style: ft, ao, ij, vy, possibly even lt and il?
>
> I've included the search utility I used below.
>
>
> #include <stdbool.h>
> #include <string.h>
> #include <stdio.h>
>
> char *similar_char_pairs[] = { "ce", "gy", "uv", NULL };
>
> bool is_similar_char(char c1, char c2)
> {
>   char **pairs = similar_char_pairs;
>   do {
>     char *p = *pairs;
>     if ((c1 == p[0] && c2 == p[1]) ||
>         (c1 == p[1] && c2 == p[0]))
>       return true;
>   } while (*++pairs);
>
>   return false;
> }
>
> bool print_words_if_similar(char *word1, char *word2)
> {
>   /* reject words of different lengths */
>   if (strlen(word1) != strlen(word2))
>     return false;
>
>   size_t i, similarcount = 0;
>
>   for (i = 0; i < strlen(word1); i++) {
>     /* skip identical letters */
>     if (word1[i] == word2[i])
>       continue;
>
>     /* reject words that don't match */
>     if (is_similar_char(word1[i], word2[i]) == false)
>       return false;
>
>     similarcount++;
>   }
>
>   /* reject words with more than 1 different letter */
>   //if (similarcount > 1)
>   //  return false;
>
>   printf("%s %s\n", word1, word2);
>
>   return true;
> }
>
> int main(void)
> {
>   /* english.txt is assumed to exist in the working directory
>      download from:
>      https://github.com/trezor/python-mnemonic/blob/master/mnemonic/wordlist/english.txt */
>   FILE* f = fopen("english.txt", "r");
>   if (!f) {
>     fprintf(stderr, "failed to open english.txt\n");
>     return 1;
>   }
>
>   /* read in word list, assumes one word per line */
>   #define MAXWORD 16
>   char wordlist[2048][MAXWORD];
>   int word = 0;
>   while (fgets(wordlist[word], MAXWORD, f)) {
>     /* strip trailing whitespace, assumes no leading whitespace */
>     char *ch = strpbrk(wordlist[word], " \n\t");
>     if (ch)
>       *ch = '\0';
>     word++;
>   }
>
>   if (word != 2048) {
>     fprintf(stderr, "word list incorrect length\n");
>     return 1;
>   }
>
>   /* check each word for similarity against every other word */
>   int i, j, count = 0;
>   for (i = 0; i < 2048; i++) {
>     for (j = i+1; j < 2048; j++) {
>       if (print_words_if_similar(wordlist[i], wordlist[j]))
>         count++;
>     }
>   }
>
>   printf("%d matches\n", count);
>
>   return 0;
> }
>
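
For readers who prefer Python, the pairwise search in the quoted C utility can be sketched roughly as follows. This is an illustrative port, not jan's code: `count_similar_diffs`, `find_similar` and the `max_diffs` parameter are hypothetical names, with `max_diffs=1` standing in for the commented-out `similarcount > 1` check. Unlike the C version, identical words are skipped rather than reported.

```python
from itertools import combinations

# The three letter pairs used by the quoted C utility.
PAIRS = {frozenset("ce"), frozenset("gy"), frozenset("uv")}


def count_similar_diffs(word1, word2):
    """Return how many positions differ if every difference is a similar
    pair; return None if the lengths differ or any difference is not."""
    if len(word1) != len(word2):
        return None
    diffs = 0
    for a, b in zip(word1, word2):
        if a == b:
            continue
        if frozenset((a, b)) not in PAIRS:
            return None
        diffs += 1
    return diffs


def find_similar(words, max_diffs=None):
    """Return (word1, word2) tuples for every confusable pair of words.
    max_diffs=None places no limit on the number of differing letters."""
    matches = []
    for w1, w2 in combinations(words, 2):
        diffs = count_similar_diffs(w1, w2)
        if diffs is None or diffs == 0:
            continue
        if max_diffs is not None and diffs > max_diffs:
            continue
        matches.append((w1, w2))
    return matches


if __name__ == "__main__":
    words = ["car", "ear", "cat", "eat", "gear", "year", "value", "valve"]
    for w1, w2 in find_similar(words):
        print(w1, w2)
```

To run it against the real list, `words` could be read from `english.txt` with one word per line, as the C program assumes.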
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development

