Return-Path: Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) by lists.linuxfoundation.org (Postfix) with ESMTP id 6B2C4C002B for ; Thu, 23 Feb 2023 16:37:14 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 3275761796 for ; Thu, 23 Feb 2023 16:37:14 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 3275761796 Authentication-Results: smtp3.osuosl.org; dkim=pass (2048-bit key) header.d=blockstream-com.20210112.gappssmtp.com header.i=@blockstream-com.20210112.gappssmtp.com header.a=rsa-sha256 header.s=20210112 header.b=t9AW+1Yi X-Virus-Scanned: amavisd-new at osuosl.org X-Spam-Flag: NO X-Spam-Score: -1.899 X-Spam-Level: X-Spam-Status: No, score=-1.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iOQYdGthFVkC for ; Thu, 23 Feb 2023 16:37:12 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.8.0 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 3A76D60F4E Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) by smtp3.osuosl.org (Postfix) with ESMTPS id 3A76D60F4E for ; Thu, 23 Feb 2023 16:37:11 +0000 (UTC) Received: by mail-pj1-x1036.google.com with SMTP id pt11so14293774pjb.1 for ; Thu, 23 Feb 2023 08:37:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blockstream-com.20210112.gappssmtp.com; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=3OIkd5ACCI+NnlMs83zBoyWh+L+F4Yael5fGyaRWLLI=; b=t9AW+1Yi4xQjspBHLYv2WBKasQ3S6SnViHjbitYtDsAi9KEPd8StOzC824FEXRTEms g0AUSdRIEfrZx5bWcL+eiQY4kuASxxx9Y+kfmzH+xHWj0QLL64hi2NfgmFZpXFkBoMBD MCIh/Ezggog42hDBPY9yc/LTlUi607Z8dfZ/Xc+Sd3VWiXVBZqQilL/BcZ3Aeob66LZa kumB+6Qy5ISaSL7cYtwztz/M/D1emI9eqLbsIDoKxWvjU0LhO02mVtJmHUK+AB2B5fpW EwD7E0urtwgjN6lRAyLvdTlyiSD18fp2Kv7BOobb9ofqHYDFdjSTB7useHy6aKY1ikEt JL3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=3OIkd5ACCI+NnlMs83zBoyWh+L+F4Yael5fGyaRWLLI=; b=eW/kNP57uz55L0aiFyVtmmQmUbZjrNQtjOEJ0oK/DdbrXSUYOj4O6lmnNbSjH07H28 ZB/+Wics969hb3Ju7RpsNAYxdsNv0+0Lcj/2XNxeM3JoiEDKZIim9yubrLlxrgIwFZ3o KnZczJ8CXSydquB+5DFB6RZr9NB9H1o1k7dJWz4l/Ba2Q5+c+IshxYLth5JB7S+TdL1F CEXEaDWHTsnSqbkfoap7vHorz69tuhVbHlErpV48P5BCjOqI15fuvpNw6YMuni4AJ5f+ Qx7FxNefEPmRWQ1Be7d3dxYO1pEFgziqYmJgnKEsLK62Xs/bkVB20yZMbdQHqZBC++CO n61Q== X-Gm-Message-State: AO0yUKUnz7oDTX1tA/6DsR/2+sivJFqz5dhDp+cW4NSHV/7oxR1gfnqx 3e5hZoq3squQi5swOocNxlvpMWPbOSaXb73mgy7m7AMqrj5WEw== X-Google-Smtp-Source: AK7set/vSil6XirgZ9yznDa6ru59h9aVkYMpoxnlmrZ4ncoQk8zaV8UjjfeKpjKrnyRxnCbPOvJ0prz2pRvqzKCyTvc= X-Received: by 2002:a17:90b:109:b0:233:fa52:828e with SMTP id p9-20020a17090b010900b00233fa52828emr576005pjz.1.1677170230767; Thu, 23 Feb 2023 08:37:10 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: "Russell O'Connor" Date: Thu, 23 Feb 2023 11:36:59 -0500 Message-ID: To: Bitcoin Protocol Discussion Content-Type: multipart/alternative; boundary="000000000000d5566e05f5609fe6" Subject: Re: [bitcoin-dev] Codex32 X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Feb 2023 16:37:14 -0000 --000000000000d5566e05f5609fe6 Content-Type: text/plain; charset="UTF-8" Sorry for the repeated replies, but I would like to make one more remark regarding the 1 character "quick check". Because the 1 character "quick check" state is so small, the procedure becomes simplified to just using a single table. You start with the specified initial state, which would be the bech32 character '9', and then you have a lookup mapping (, ) -> . You go through the table for each character after the prefix, updating the state as you go along. ('9','2') -> '0', then ('0','N') -> '4', and so on until you reach the final state which should be '5'. If you like volvelles, one could be designed to implement this lookup table. However, I do want to note that an adjustment could be made to the codex32 generator so that this 1 character "quick check" table would become identical to the Bech32 addition table. In other words the 1 character quick check would become the same as adding up all the characters and checking that you get the required final constant. If this change were free to make, I would probably make it. However such an adjustment would come at a cost. The current generator was chosen to have three identical coefficients in a row (you can find the generator in the appendix of the draft BIP). This special property slightly eases the computation needed when creating checksums by hand (no compromise to the quality of the checksum itself). If we made the above adjustment to the codex32 generator, we would lose this property of having three identical coefficients in a row. Therefore, I am pretty hesitant to make this adjustment. Firstly the 1 character quick check is simply too small to be safely used. While it does guarantee to detect single character errors, it has a 1 in 32 chance of failing to detect more errors. I think a 3% failure rate is pretty bad, and would definitely recommend people performing quick checks use a minimum size of 2 (which has a 0.1% failure rate). Secondly the difference between using the addition table/volvelle versus a specific table/volvelle for the purpose of performing 1 character quick checks (which you ought not to be doing anyways) is pretty minimal. The addition table is possibly slightly less error prone to use because it is symmetric, but other than that the amount of work to do is pretty much the same either way. My conclusion is that it isn't worth compromising the ease of hand computation for the sake of possibly making a too-low-quality-checksum-that-no-one-should-be-using slightly more convenient, but I thought I should mention it at least. On Wed, Feb 22, 2023 at 10:30 PM Russell O'Connor wrote: > After some consultation, I now see that generators for all degree 2 BCH > codes, such as ours, are smooth and factor into quadratic and linear > components. > > Anyhow the upshot of all this is that you can perform a "quickcheck" > verification of the codex32 strings for whatever size of verification you > want to do, 1 character, 2 characters, 3 characters, upto the full 13 > characters. Each of these partial verifications will have BCH properties. > A 1 character quickchecksum will guarantee to detect any 1 character > error. A 3 character quickchecksum will guarantee to detect any 2 > character error, etc. There remains a 1 in 32^n chance of failing to > detect larger numbers of errors where n is the size of your quickcheck. > > To illustrate, let's consider a quickcheck of size 2. This can detect any > 1 character error and will only have a 1/1024 chance of failing to detect > other random errors. Let's take the second test vector as our example: " > MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM" > > You start in a specified initial state with a pair of bech32 characters. > For MS1 strings and a size 2 quickcheck it would be the pair of Bech32 > characters 'AS'. > > Next we "add" the first character after the prefix, which is '2' by using > the addition volvelle or lookup table. "Adding" '2' to 'S' yields '6' and > our state becomes 'A6'. > > Next we have to look up 'A6' in a lookup table. This table is too big to > fit in the margin of this email, so I will have to omit it. But it would > have an entry mapping 'A6' -> 'QM'. Our state becomes 'QM' > > From this point we have an even number of remaining characters in the > input string and we can work 2 characters at a time. We "add" the next two > data characters from our string, which are 'NA'. Again, using the volvelle > or lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to > 'M' yields 'X'. So our state is now 'NX' > > Next we look up 'NX' in this table I haven't given you and we will find an > entry mapping 'NX' -> 'DX', making 'DX' our new state. > > We keep repeating this process alternating between adding pairs of > characters and using this unstated lookup table all the way until the end > where we will reach a final state which will be 'H9'. > > If you follow this procedure with any string (upto 400 bit master seeds) > you will always end up in the state 'H9'. > > A specialized worksheet would help guide the process making the process > easier to follow. > > > This process is somewhat close to Peter Todd's suggestion of using a > lookup table on "words", which in this case would be pairs of bech32 > characters, and adding values together. The catch is that the addition is > done with Bech32 addition rather than calculator addition, which I accept > is a moderately large catch. > > Anyhow, the point is that you can do this sort of partial verification by > hand to whatever degree you like, if you are in a rush and are willing to > accept larger chances of failing to catch random errors. > > > On Wed, Feb 22, 2023 at 2:01 PM Russell O'Connor > wrote: > >> After some poking around at the math, I do see that the 13 character >> generator (for regular sized shares) is reasonably "smooth", having roots >> at T{11}, S{16}, and C{24}. >> >> This means we could build a "quick check" worksheet to evaluate the >> string modulo (x - T) to verify a 5 bit checksum, whose operation would be >> similar to the existing checksum worksheet in structure but significantly >> less work. >> >> Perhaps 5 bits is too short, and it is more reasonable working modulo (x >> - T)*(x - S) to get a 10 bit checksum. A worksheet for a 15 bit checksum >> is also an option, and possibly others well depending on the size of the >> other factors. I think this process is would be about as simple as any >> other comparable hand operated checksum over the bech32 alphabet would be. >> >> I haven't looked into the smoothness of the long generator for seeds that >> are greater than 400 bits. If it isn't smooth and smoother options are >> available, perhaps it should be changed. >> >> On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev < >> bitcoin-dev@lists.linuxfoundation.org> wrote: >> >>> On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via >>> bitcoin-dev wrote: >>> > > What really did catch my attention, but which was kind of buried in >>> the >>> > > project documentation, is the ability to verify the integrity of each >>> > > share independently without using a computer. For example, if I >>> store a >>> > > share with some relative who lives thousands of kilometers away, >>> I'll be >>> > > able to take that share out of its tamper-evident bag on my annual >>> > > holiday visit, verify that I can still read it accurately by >>> validating >>> > > its checksum, and put it into a new bag for another year. For this >>> > > procedure, I don't need to bring copies of any of my other shares, >>> > > allowing them (and my seed) to stay safe. >>> > > >>> > >>> > This is good feedback. I strongly agree that this is the big selling >>> > point for this -- that you can vet shares/seeds which *aren't* being >>> > actively used, without exposing them to the sorts of threats associated >>> > with active use. >>> > >>> > We should make this more prominent in the BIP motivation. >>> >>> I don't think that use-case is a good selling point. The checksum that >>> Codex32 >>> uses is much more complex than necessary if you are simply verifying a >>> share by >>> itself. >>> >>> A *much* simpler approach would be to use a simple mod N = 0 checksum, >>> either >>> by creating the seed such that each share passes, or by just storing an >>> additional word/symbol with the seed in such a way that sum(words) mod N >>> = 0 >>> passes. This approach is not only possible to compute by hand with a >>> word/symbol->number lookup table, and pen and paper or a calculator. >>> It's so >>> simple they could probably *remember* how to do it themselves. >>> >>> >>> Secondly, if all shares have mod N checksums, it may be sufficient for >>> everyone >>> to write down the checksums of the *other* shares, to verify they are the >>> correct ones and a different (otherwise correct) share hasn't >>> accidentally been >>> substituted. >>> >>> Indeed, with some brute forcing and small checksums, I'd expect it to be >>> mathematically possible to generate Shamir's secret sharing shards such >>> that >>> every shard can share the *same* checksum. In which case the share >>> verification >>> procedure would be to simply ask every share holder to verify the >>> checksum >>> manually using the mod N procedure, and then verify that each share >>> holder has >>> the same checksum. This would be less error prone in terms of leaking >>> information accidentally if the checksum was obviously *not* part of the >>> share: >>> eg by encoding the share with words, and the checksum with a number. >>> >>> Obviously, small checksums aren't fool proof. But we're probably better >>> off >>> creating a relatively easy procedure with a 1-in-1000 chance of an error >>> going >>> undetected than a complex procedure that people don't actually do at all. >>> >>> -- >>> https://petertodd.org 'peter'[:-1]@petertodd.org >>> _______________________________________________ >>> bitcoin-dev mailing list >>> bitcoin-dev@lists.linuxfoundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >>> >> --000000000000d5566e05f5609fe6 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Sorry for the repeated replies, but I would like to m= ake one more remark regarding the 1 character "quick check".

Because the 1 character "quick check" state = is so small, the procedure becomes simplified to just using a single table.= =C2=A0 You start with the specified initial state, which would be the bech3= 2 character '9', and then you have a lookup mapping (<current st= ate>, <next input character>) -> <next state>.=C2=A0 You = go through the table for each character after the prefix, updating the stat= e as you go along. ('9','2') -> '0', then ('= 0','N') -> '4', and so on until you reach the final = state which should be '5'.=C2=A0 If you like volvelles, one could b= e designed to implement this lookup table.

However= , I do want to note that an adjustment could be made to the codex32 generat= or so that this 1 character "quick check" table would become iden= tical to the Bech32 addition table.=C2=A0 In other words the 1 character qu= ick check would become the same as adding up all the characters and checkin= g that you get the required final constant.

If thi= s change were free to make, I would probably make it.=C2=A0 However such an= adjustment would come at a cost.=C2=A0 The current generator was chosen to= have three identical coefficients in a row (you can find the generator in = the appendix of the draft BIP).=C2=A0 This special property slightly eases = the computation needed when creating checksums by hand (no compromise to th= e quality of the checksum itself).=C2=A0 If we made the above adjustment to= the codex32 generator, we would lose this property of having three identic= al coefficients in a row.

Therefore, I am pretty h= esitant to make this adjustment.=C2=A0 Firstly the 1 character quick check = is simply too small to be safely used.=C2=A0 While it does guarantee to det= ect single character errors, it has a 1 in 32 chance of failing to detect m= ore errors.=C2=A0 I think a 3% failure rate is pretty bad, and would defini= tely recommend people performing quick checks use a minimum size of 2 (whic= h has a 0.1% failure rate).=C2=A0 Secondly the difference between using the= addition table/volvelle versus a specific table/volvelle for the purpose o= f performing 1 character quick checks (which you ought not to be doing anyw= ays) is pretty minimal.=C2=A0 The addition table is possibly slightly less = error prone to use because it is symmetric, but other than that the amount = of work to do is pretty much the same either way.

= My conclusion is that it isn't worth compromising the ease of hand comp= utation for the sake of possibly making a too-low-quality-checksum-that-no-= one-should-be-using slightly more convenient, but I thought I should mentio= n it at least.


On Wed, Feb 22, 2023 at 10:30 PM Rus= sell O'Connor <roconnor@blockstream.com> wrote:
After some consultat= ion, I now see that generators for all degree 2 BCH codes, such as ours, ar= e smooth and factor into quadratic and linear components.

Anyhow the upshot of all this is that you can perform a "quick= check" verification of the codex32 strings for whatever size of verifi= cation you want to do, 1 character, 2 characters, 3 characters, upto the fu= ll 13 characters.=C2=A0 Each of these partial verifications will have BCH p= roperties.=C2=A0 A 1 character quickchecksum will guarantee to detect any 1= character error.=C2=A0 A 3 character quickchecksum will guarantee to detec= t any 2 character error, etc.=C2=A0 There remains a 1 in 32^n chance of fai= ling to detect larger numbers of errors where n is the size of your quickch= eck.

To illustrate, let's consider a quickchec= k of size 2.=C2=A0 This can detect any 1 character error and will only have= a 1/1024 chance of failing to detect other random errors.=C2=A0 Let's = take the second test vector as our example: "MS12NAMEA320Z= YXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM"<= /span>

You start in a specified initial state with= a pair of bech32 characters.=C2=A0 For MS1 strings and a size 2 quickcheck= it would be the pair of Bech32 characters 'AS'.

Next we "add" the first character after the prefix, which = is '2' by using the addition volvelle or lookup table.=C2=A0 "= Adding" '2' to 'S' yields '6' and our state be= comes 'A6'.

Next we have to look up 'A= 6' in a lookup table.=C2=A0 This table is too big to fit in the margin = of this email, so I will have to omit it.=C2=A0 But it would have an entry = mapping 'A6' -> 'QM'.=C2=A0 Our state becomes 'QM= 9;

From this point we have an even number of r= emaining characters in the input string and we can work 2 characters at a t= ime. We "add" the next two data characters from our string, which= are 'NA'.=C2=A0 Again, using the volvelle or lookup table we get t= hat adding 'N' to 'Q' yields 'N', and adding 'A= ' to 'M' yields 'X'.=C2=A0 So our state is now 'NX&= #39;

Next we look up 'NX' in this tabl= e I haven't given you and we will find an entry mapping 'NX' -&= gt; 'DX', making 'DX' our new state.

We keep repeating this process alternating between adding pairs of chara= cters and using this unstated lookup table all the way until the end where = we will reach a final state which will be 'H9'.

If you follow this procedure with any string (upto 400 bit master see= ds) you will always end up in the state 'H9'.

<= div>A specialized worksheet would help guide the process making the process= easier to follow.


This process= is somewhat close to Peter Todd's suggestion of using a lookup table o= n "words", which in this case would be pairs of bech32 characters= , and adding values together.=C2=A0 The catch is that the addition is done = with Bech32 addition rather than calculator addition, which I accept is a m= oderately large catch.

Anyhow, the point is th= at you can do this sort of partial verification by hand to whatever degree = you like, if you are in a rush and are willing to accept larger chances of = failing to catch random errors.


On Wed, Feb 22= , 2023 at 2:01 PM Russell O'Connor <roconnor@blockstream.com> wrote:
After some poking around at the math, I do see that the 13 character gener= ator (for regular sized shares) is reasonably "smooth", having ro= ots at T{11}, S{16}, and C{24}.

This means we coul= d build a "quick check" worksheet to evaluate the string modulo (= x - T) to verify a 5 bit checksum, whose operation would be similar to the = existing checksum worksheet in structure but significantly less work.
=

Perhaps 5 bits is too short, and it is more reasonable = working modulo (x - T)*(x - S) to get a 10 bit checksum.=C2=A0 A worksheet = for a 15 bit checksum is also an option, and possibly others well depending= on the size of the other factors.=C2=A0 I think this process is would be a= bout as simple as any other comparable hand operated checksum over the bech= 32 alphabet would be.

I haven't looked int= o the smoothness of the long generator for seeds that are greater than 400 = bits.=C2=A0 If it isn't smooth and smoother options are available, perh= aps it should be changed.

On Wed, Feb 22, 2023 at= 11:29 AM Peter Todd via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.= org> wrote:
On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via bitcoin-de= v wrote:
> > What really did catch my attention, but which was kind of buried = in the
> > project documentation, is the ability to verify the integrity of = each
> > share independently without using a computer.=C2=A0 For example, = if I store a
> > share with some relative who lives thousands of kilometers away, = I'll be
> > able to take that share out of its tamper-evident bag on my annua= l
> > holiday visit, verify that I can still read it accurately by vali= dating
> > its checksum, and put it into a new bag for another year.=C2=A0 F= or this
> > procedure, I don't need to bring copies of any of my other sh= ares,
> > allowing them (and my seed) to stay safe.
> >
>
> This is good feedback. I strongly agree that this is the big selling > point for this -- that you can vet shares/seeds which *aren't* bei= ng
> actively used, without exposing them to the sorts of threats associate= d
> with active use.
>
> We should make this more prominent in the BIP motivation.

I don't think that use-case is a good selling point. The checksum that = Codex32
uses is much more complex than necessary if you are simply verifying a shar= e by
itself.

A *much* simpler approach would be to use a simple mod N =3D 0 checksum, ei= ther
by creating the seed such that each share passes, or by just storing an
additional word/symbol with the seed in such a way that sum(words) mod N = =3D 0
passes. This approach is not only possible to compute by hand with a
word/symbol->number lookup table, and pen and paper or a calculator. It&= #39;s so
simple they could probably *remember* how to do it themselves.


Secondly, if all shares have mod N checksums, it may be sufficient for ever= yone
to write down the checksums of the *other* shares, to verify they are the correct ones and a different (otherwise correct) share hasn't accidenta= lly been
substituted.

Indeed, with some brute forcing and small checksums, I'd expect it to b= e
mathematically possible to generate Shamir's secret sharing shards such= that
every shard can share the *same* checksum. In which case the share verifica= tion
procedure would be to simply ask every share holder to verify the checksum<= br> manually using the mod N procedure, and then verify that each share holder = has
the same checksum. This would be less error prone in terms of leaking
information accidentally if the checksum was obviously *not* part of the sh= are:
eg by encoding the share with words, and the checksum with a number.

Obviously, small checksums aren't fool proof. But we're probably be= tter off
creating a relatively easy procedure with a 1-in-1000 chance of an error go= ing
undetected than a complex procedure that people don't actually do at al= l.

--
http= s://petertodd.org 'peter'[:-1]@petertodd.org
_______________________________________________
bitcoin-dev mailing list
= bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mail= man/listinfo/bitcoin-dev
--000000000000d5566e05f5609fe6--