From: "Russell O'Connor"
Date: Wed, 15 Feb 2023 21:16:02 -0500
To: Bitcoin Protocol Discussion
Subject: [bitcoin-dev] Codex32

I've been asked by Dr.
Curr and Professor Snead to forward this message to this mailing list, as it may be of general interest to Bitcoin users.

Dear Colleague:

In 1967, during excavation for the construction of a new shopping center in Monroeville, Pennsylvania, workers uncovered a vault containing a cache of ancient scrolls[1]. Most were severely damaged, but those that could be recovered confirmed the existence of a secret society long suspected to have been active in the region around the year 200 BC.

Based on a translation of these documents, we now know that the society, the Cult of the Bound Variable, was devoted to the careful study of computation, over two millennia before the invention of the digital computer.

While the Monroeville scrolls make reference to computing machines made of sandstone, most researchers believed this to be a poetic metaphor and that the "computers" were in fact the initiates themselves, carrying out the unimaginably tedious steps of their computations with reed pens on parchment.

Within the vault, a collection of sandstone wheels marked in a language consisting of 32 glyphs was found. After 15 years of study, we have successfully completed the translation of what is known as "Codex32," a document that describes the functions of the wheels. It was discovered that the wheels operate a system of cryptographic computations that was used by cult members to safeguard their most valuable secrets.

The Codex32 system allows secrets to be carved into multiple tablets and scattered to the far corners of the earth. When a sufficient number of tablets are brought together the stone wheels are manipulated in a manner to recover the secrets. This finding may be of particular interest to the Bitcoin community.

Below we provide a summary of the cult's secret sharing system, which is graciously hosted at <https://github.com/apoelstra/bips/blob/2023-02--volvelles/bip-0000.mediawiki>. We are requesting a record assignment in the Bibliography of Immemorial Philosophy (BIP) repository.

Thank you for your consideration.

Dr. Leon O. Curr and Professor Pearlwort Snead
Department of Archaeocryptography
Harry Q. Bovik Institute for the Advancement

[1] http://www.boundvariable.org/task.shtml

-----BEGIN BIP-----
  BIP: ????
  Layer: Applications
  Title: codex32
  Author: Leon Olsson Curr and Pearlwort Sneed <pearlwort@wpsoftware.net>
  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-????
  Status: Draft
  Type: ????
  Created: 2023-02-13
  License: BSD-3-Clause
  Post-History: FIXME
==Introduction==

===Abstract===

This document describes a standard for backing up and restoring the master seed of a [https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032] hierarchical deterministic wallet, using Shamir's secret sharing. It includes an encoding format, a BCH error-correcting checksum, and algorithms for share generation and secret recovery. Secret data can be split into up to 31 shares. A minimum threshold of shares, which can be between 1 and 9, is needed to recover the secret, whereas without sufficient shares, no information about the secret is recoverable.

===Copyright===

This document is licensed under the 3-clause BSD license.

===Motivation===

BIP-0032 master seed data is the source entropy used to derive all private keys in an HD wallet. Safely storing this secret data is the hardest and most important part of self-custody. However, there is a tension between security, which demands limiting the number of backups, and resilience, which demands widely replicated backups. Encrypting the seed does not change this fundamental tradeoff, since it leaves essentially the same problem of how to back up the encryption key(s).

To allow users freedom to make this tradeoff, we use Shamir's secret sharing, which guarantees that any number of shares less than the threshold leaks no information about the secret. This approach allows increasing safety by widely distributing the generated shares, while also providing security against the compromise of one or more shares (as long as fewer than the threshold have been compromised).

[https://github.com/satoshilabs/slips/blob/master/slip-0039.md SLIP-0039] has essentially the same motivations as this standard. However, unlike SLIP-0039, this standard also aims to be simple enough for hand computation. Users who demand a higher level of security for particular secrets, or have a general distrust in digital electronic devices, have the option of using hand computation to back up and restore secret data in an interoperable manner. Note that hand computation is optional, the particular details of hand computation are outside the scope of this standard, and implementers do not need to be concerned with this possibility.

[https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki BIP-0039] serves the same purpose as this standard: encoding master seeds for storage by users. However, BIP-0039 has no error-correcting ability, cannot sensibly be extended to support secret sharing, has no support for versioning or other metadata, and has many technical design decisions that make implementation and interoperability difficult (for example, the use of SHA-512 to derive seeds, or the use of 11-bit words).

==Specification==

===codex32===

A codex32 string is similar to a Bech32 string defined in [https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP-0173]. It reuses the base32 character set from BIP-0173, and consists of:

* A human-readable part, which is the string "ms" (or "MS").
* A separator, which is always "1".
* A data part which is in turn subdivided into:
** A threshold parameter, which MUST be a single digit between "2" and "9", or the digit "0".
*** If the threshold parameter is "0" then the share index, defined below, MUST have a value of "s" (or "S").
** An identifier consisting of 4 Bech32 characters.
** A share index, which is any Bech32 character. Note that a share index value of "s" (or "S") is special and denotes the unshared secret (see section "Unshared Secret").
** A payload which is a sequence of up to 74 Bech32 characters. (However, see '''Long codex32 Strings''' below for an exception to this limit.)
** A checksum which consists of 13 Bech32 characters as described below.
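The field layout listed above can be illustrated with a small parsing sketch. This is not part of the standard: the names <code>Codex32Parts</code> and <code>split_codex32</code> are hypothetical, the checksum is not verified here, and the length rules for long strings are taken from the '''Long codex32 Strings''' section.

```python
from typing import NamedTuple

# Bech32 alphabet from BIP-0173; a character's value is its position here.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


class Codex32Parts(NamedTuple):  # hypothetical container, not defined by the standard
    hrp: str
    threshold: str
    identifier: str
    share_index: str
    payload: str
    checksum: str


def split_codex32(s: str) -> Codex32Parts:
    """Split a codex32 string into its fields WITHOUT verifying the checksum."""
    if s != s.lower() and s != s.upper():
        raise ValueError("codex32 strings must not mix upper and lower case")
    s = s.lower()
    if s[:3] != "ms1":
        raise ValueError("expected human-readable part 'ms' and separator '1'")
    data = s[3:]
    if any(ch not in CHARSET for ch in data):
        raise ValueError("data part contains a non-Bech32 character")
    # Data parts of up to 93 characters carry a 13-character checksum; long
    # strings (96 characters or more) carry a 15-character checksum.
    if len(data) <= 93:
        checksum_len = 13
    elif len(data) >= 96:
        checksum_len = 15
    else:
        raise ValueError("data parts of 94 or 95 characters are never valid")
    # 6 header characters, then the payload (at most 103), then the checksum.
    if not 6 + checksum_len <= len(data) <= 6 + 103 + 15:
        raise ValueError("data part has an impossible length")
    threshold, identifier, share_index = data[0], data[1:5], data[5]
    if threshold not in "023456789":
        raise ValueError("threshold must be '0' or a digit from '2' to '9'")
    if threshold == "0" and share_index != "s":
        raise ValueError("threshold '0' requires share index 's'")
    return Codex32Parts(
        "ms", threshold, identifier, share_index,
        data[6:-checksum_len], data[-checksum_len:],
    )


parts = split_codex32("ms10tests" + "x" * 26 + "4nzvca9cmczlw")
print(parts.identifier, parts.share_index, len(parts.payload))  # test s 26
```

Splitting by position works because the header always occupies the first six characters of the data part and the checksum length is determined by the total length alone.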
As with Bech32 strings, a codex32 string MUST be entirely uppercase or entirely lowercase. The lowercase form is used when determining a character's value for checksum purposes. For presentation, lowercase is usually preferable, but uppercase SHOULD be used for handwritten codex32 strings.

===Checksum===

The last thirteen characters of the data part form a checksum and contain no information. Valid strings MUST pass the criteria for validity specified by the Python3 code snippet below. The function ms32_verify_checksum must return true when its argument is the data part as a list of integers representing the characters converted using the bech32 character table from BIP-0173.

To construct a valid checksum given the data-part characters (excluding the checksum), the ms32_create_checksum function can be used.

MS32_CONST = 0x10ce0795c2fd1e62a

def ms32_polymod(values):
    GEN = [
        0x19dc500ce73fde210,
        0x1bfae00def77fe529,
        0x1fbd920fffe7bee52,
        0x1739640bdeee3fdad,
        0x07729a039cfc75f5a,
    ]
    residue = 0x23181b3
    for v in values:
        b = (residue >> 60)
        residue = (residue & 0x0fffffffffffffff) << 5 ^ v
        for i in range(5):
            residue ^= GEN[i] if ((b >> i) & 1) else 0
    return residue

def ms32_verify_checksum(data):
    if len(data) >= 96: # See Long codex32 Strings
        return ms32_verify_long_checksum(data)
    if len(data) <= 93:
        return ms32_polymod(data) == MS32_CONST
    return False

def ms32_create_checksum(data):
    if len(data) > 80: # See Long codex32 Strings
        return ms32_create_long_checksum(data)
    values = data
    polymod = ms32_polymod(values + [0] * 13) ^ MS32_CONST
    return [(polymod >> 5 * (12 - i)) & 31 for i in range(13)]

===Error Correction===

A codex32 string without a valid checksum MUST NOT be used. The checksum is designed to be an error correcting code that can correct up to 4 character substitutions, up to 8 unreadable characters (called erasures), or up to 13 consecutive erasures. Implementations SHOULD provide the user with a corrected valid codex32 string if possible. However, implementations SHOULD NOT automatically proceed with a corrected codex32 string without user confirmation of the corrected string, either by prompting the user, or returning a corrected string in an error message and allowing the user to repeat their action.

We do not specify how an implementation should implement error correction. However, we recommend that:

* Implementations make suggestions to substitute non-bech32 characters with bech32 characters in some situations, such as replacing "B" with "8", "O" with "0", "I" with "l", etc.
* Implementations interpret "?" as an erasure.
* Implementations optionally interpret other non-bech32 characters, or characters with incorrect case, as erasures.
* If a string with 8 or fewer erasures can have those erasures filled in to make a valid codex32 string, then the implementation suggests such a string as a correction.
* If a string consisting of valid Bech32 characters in the proper case can be made valid by substituting 4 or fewer characters, then the implementation suggests such a string as a correction.

===Unshared Secret===

When the share index of a valid codex32 string (converted to lowercase) is the letter "s", we call the string a codex32 secret. The subsequent data characters in a codex32 secret, excluding the final checksum of 13 characters, are a direct encoding of a BIP-0032 HD master seed.

The master seed is decoded by converting the data to bytes:

* Translate the characters to 5-bit values using the bech32 character table from BIP-0173, most significant bit first.
* Re-arrange those bits into groups of 8 bits. Any incomplete group at the end MUST be 4 bits or less, and is discarded.

Note that unlike the decoding process in BIP-0173, we do NOT require that the incomplete group be all zeros.
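The two decoding steps above can be sketched as follows. This is an illustration rather than normative code: the function name <code>payload_to_seed</code> is hypothetical, and it expects only the payload characters (header and checksum already removed).

```python
# Bech32 alphabet from BIP-0173; a character's value is its position here.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def payload_to_seed(payload: str) -> bytes:
    """Regroup 5-bit payload characters into bytes, most significant bit first."""
    acc = 0   # bit accumulator holding bits not yet emitted as a byte
    bits = 0  # number of pending bits in acc
    out = bytearray()
    for ch in payload.lower():
        acc = (acc << 5) | CHARSET.index(ch)  # ValueError on non-Bech32 input
        bits += 5
        while bits >= 8:
            bits -= 8
            out.append((acc >> bits) & 0xFF)
        acc &= (1 << bits) - 1  # keep only the bits that are still pending
    if bits > 4:
        raise ValueError("incomplete trailing group is longer than 4 bits")
    # The trailing group is discarded and, unlike BIP-0173, need not be zero.
    return bytes(out)


# Payload of test vector 1: 26 characters, 130 bits, last 2 bits discarded.
print(payload_to_seed("x" * 26).hex())  # 318c6318c6318c6318c6318c6318c631
```

A trailing group of 5 bits or more would mean a whole character carried no seed data, which is why such payload lengths are rejected.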
For an unshared secret, the threshold parameter (the first character of the data part) is ignored (beyond the fact it must be a digit for the codex32 string to be valid). We recommend using the digit "0" for the threshold parameter in this case. The 4 character identifier also has no effect beyond aiding users in distinguishing between multiple different master seeds in cases where they have more than one.

===Recovering Master Seed===

When the share index of a valid codex32 string (converted to lowercase) is not the letter "s", we call the string a codex32 share. The first character of the data part indicates the threshold of the share, and it is required to be a non-"0" digit.

In order to recover a master seed, one needs a set of valid codex32 shares such that:

* All shares have the same threshold value, the same identifier, and the same length.
* All of the share index values are distinct.
* The number of codex32 shares is exactly equal to the (common) threshold value.

If all the above conditions are satisfied, the ms32_recover function will return a codex32 secret when its argument is the list of codex32 shares with each share represented as a list of integers representing the characters converted using the bech32 character table from BIP-0173.

bech32_inv = [
    0, 1, 20, 24, 10, 8, 12, 29, 5, 11, 4, 9, 6, 28, 26, 31,
    22, 18, 17, 23, 2, 25, 16, 19, 3, 21, 14, 30, 13, 7, 27, 15,
]

def bech32_mul(a, b):
    res = 0
    for i in range(5):
        res ^= a if ((b >> i) & 1) else 0
        a *= 2
        a ^= 41 if (32 <= a) else 0
    return res

def bech32_lagrange(l, x):
    n = 1
    c = []
    for i in l:
        n = bech32_mul(n, i ^ x)
        m = 1
        for j in l:
            m = bech32_mul(m, (x if i == j else i) ^ j)
        c.append(m)
    return [bech32_mul(n, bech32_inv[i]) for i in c]

def ms32_interpolate(l, x):
    w = bech32_lagrange([s[5] for s in l], x)
    res = []
    for i in range(len(l[0])):
        n = 0
        for j in range(len(l)):
            n ^= bech32_mul(w[j], l[j][i])
        res.append(n)
    return res

def ms32_recover(l):
    return ms32_interpolate(l, 16)

===Generating Shares===

If we already have ''t'' valid codex32 strings such that:

* All strings have the same threshold value ''t'', the same identifier, and the same length
* All of the share index values are distinct

Then we can derive additional shares with the ms32_interpolate function by passing it a list of exactly ''t'' of these codex32 strings, together with a fresh share index distinct from all of the existing share indexes. The newly derived share will have the provided share index.

Once a user has generated ''n'' codex32 shares, they may discard the codex32 secret (if it exists). The ''n'' shares form a ''t'' of ''n'' Shamir's secret sharing scheme of a codex32 secret.

There are two ways to create an initial set of ''t'' valid codex32 strings, depending on whether the user already has an existing master seed to split.

====For an existing master seed====

Before generating shares for an existing master seed, it first must be converted into a codex32 secret, as described above. The conversion process consists of:

* Choosing a threshold value ''t'' between 2 and 9, inclusive
* Choosing a 4 bech32 character identifier
** We do not define how to choose the identifier, beyond noting that it SHOULD be distinct for every master seed the user may need to disambiguate.
* Setting the share index to "s"
* Setting the payload to a Bech32 encoding of the master seed, padded with arbitrary bits
* Generating a valid checksum in accordance with the Checksum section

Along with the codex32 secret, the user must generate ''t''-1 other codex32 shares, each with the same threshold value, the same identifier, and a distinct share index. The set of share indexes may be chosen arbitrarily. The payload of each of these codex32 shares is chosen uniformly at random such that it has the same length as the payload of the codex32 secret. For each share, a valid checksum must be generated in accordance with the Checksum section.

The codex32 secret and the ''t''-1 codex32 shares form a set of ''t'' valid codex32 strings from which additional shares can be derived as described above.

====For a fresh master seed====

In the case that the user wishes to generate a fresh master seed, the user chooses a threshold value ''t'' and an identifier, then generates ''t'' random codex32 shares, using the generation procedure from the previous section. As before, each share must have the same threshold value ''t'', the same identifier, and a distinct share index.

With this set of ''t'' codex32 shares, new shares can be derived as discussed above. This process generates a fresh master seed, whose value can be retrieved by running the recovery process on any ''t'' of these shares.

===Long codex32 Strings===

The 13 character checksum design only supports up to 80 data characters. Excluding the threshold, identifier and index characters, this limits the payload to 74 characters or 46 bytes. While this is enough to support the 32-byte advised size of BIP-0032 master seeds, BIP-0032 allows seeds to be up to 64 bytes in size. We define a long codex32 string format to support these longer seeds by defining an alternative checksum.

MS32_LONG_CONST = 0x43381e570bf4798ab26

def ms32_long_polymod(values):
    GEN = [
        0x3d59d273535ea62d897,
        0x7a9becb6361c6c51507,
        0x543f9b7e6c38d8a2a0e,
        0x0c577eaeccf1990d13c,
        0x1887f74f8dc71b10651,
    ]
    residue = 0x23181b3
    for v in values:
        b = (residue >> 70)
        residue = (residue & 0x3fffffffffffffffff) << 5 ^ v
        for i in range(5):
            residue ^= GEN[i] if ((b >> i) & 1) else 0
    return residue

def ms32_verify_long_checksum(data):
    return ms32_long_polymod(data) == MS32_LONG_CONST

def ms32_create_long_checksum(data):
    values = data
    polymod = ms32_long_polymod(values + [0] * 15) ^ MS32_LONG_CONST
    return [(polymod >> 5 * (14 - i)) & 31 for i in range(15)]

A long codex32 string follows the same specification as a regular codex32 string with the following changes.

* The payload is a sequence of between 75 and 103 Bech32 characters.
* The checksum consists of 15 Bech32 characters as defined above.

A codex32 string with a data part of 94 or 95 characters is never legal as a regular codex32 string is limited to 93 data characters and a long codex32 string is at least 96 characters.

Generation of long shares and recovery of the master seed from long shares proceeds in exactly the same way as for regular shares with the ms32_interpolate function.

The long checksum is designed to be an error correcting code that can correct up to 4 character substitutions, up to 8 unreadable characters (called erasures), or up to 15 consecutive erasures. As with regular checksums we do not specify how an implementation should implement error correction, and all our recommendations for error correction of regular codex32 strings also apply to long codex32 strings.
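Since regular and long shares are recovered and extended with the same ms32_interpolate function, a short worked run may help. The sketch below repeats the interpolation functions from the specification text so that it is self-contained, and applies them to the two shares of test vector 2. The helpers <code>to_values</code> and <code>to_string</code> are hypothetical names introduced only for this illustration, and no checksum verification is performed.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # Bech32 alphabet from BIP-0173

# Multiplicative inverses in GF(32), as given in the specification text.
bech32_inv = [
    0, 1, 20, 24, 10, 8, 12, 29, 5, 11, 4, 9, 6, 28, 26, 31,
    22, 18, 17, 23, 2, 25, 16, 19, 3, 21, 14, 30, 13, 7, 27, 15,
]


def bech32_mul(a, b):
    """Multiply two GF(32) elements (reduction constant 41, as in the text)."""
    res = 0
    for i in range(5):
        res ^= a if ((b >> i) & 1) else 0
        a *= 2
        a ^= 41 if (32 <= a) else 0
    return res


def bech32_lagrange(l, x):
    """Lagrange weights for evaluating at index x from the share indexes in l."""
    n = 1
    c = []
    for i in l:
        n = bech32_mul(n, i ^ x)
        m = 1
        for j in l:
            m = bech32_mul(m, (x if i == j else i) ^ j)
        c.append(m)
    return [bech32_mul(n, bech32_inv[i]) for i in c]


def ms32_interpolate(l, x):
    """Interpolate every character position of the shares in l at index x."""
    w = bech32_lagrange([s[5] for s in l], x)
    res = []
    for i in range(len(l[0])):
        n = 0
        for j in range(len(l)):
            n ^= bech32_mul(w[j], l[j][i])
        res.append(n)
    return res


def to_values(s):  # hypothetical helper: data part as a list of integers
    return [CHARSET.index(ch) for ch in s.lower()[3:]]


def to_string(values):  # hypothetical helper: back to a lowercase string
    return "ms1" + "".join(CHARSET[v] for v in values)


# The two shares of test vector 2 (threshold 2, identifier NAME).
shares = [
    to_values("MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM"),
    to_values("MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN"),
]

secret = to_string(ms32_interpolate(shares, CHARSET.index("s")))     # recovery
derived_d = to_string(ms32_interpolate(shares, CHARSET.index("d")))  # new share
print(secret)
print(derived_d)
```

According to test vector 2, the two printed strings should be the lowercase forms of the secret share with index S and the derived share with index D. Because the interpolation is applied to every character position, the header and checksum of the result come out consistent without any separate checksum computation.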
==Rationale==

This scheme is based on the observation that the Lagrange interpolation of valid codewords in a BCH code will always be a valid codeword. This means that derived shares will always have a valid checksum, and a sufficient threshold of shares with valid checksums will derive a secret with a valid checksum.

The header system is also compatible with Lagrange interpolation, meaning all derived shares will have the same identifier and will have the appropriate share index. This fact allows the header data to be covered by the checksum.

The checksum size and identifier size have been chosen so that the encoding of 128-bit seeds and shares fits within 48 characters. This is a standard size for many common seed storage formats, which has been popularized by the 12 four-letter word format of the BIP-0039 mnemonic.

The 13 character checksum is adequate to correct 4 errors in up to 93 characters (80 characters of data and 13 characters of the checksum). This is somewhat better quality than the checksum used in SLIP-0039.

For 256-bit seeds and shares our strings are 74 characters, which fits into the 96 character format of the 24 four-letter word format of the BIP-0039 mnemonic, with plenty of room to spare.

A longer checksum is needed to support up to 512-bit seeds, the longest seed length specified in BIP-0032, as the 13 character checksum isn't adequate for more than 80 data characters. While we could use the 15 character checksum for both cases, we prefer to keep the strings as short as possible for the more common cases of 128-bit and 256-bit master seeds. We only guarantee to correct 4 characters no matter how long the string is. Longer strings mean more chances for transcription errors, so shorter strings are better.

The longest data part using the regular 13 character checksum is 93 characters and corresponds to a 400-bit secret. At this length, the prefix MS1 is not covered by the checksum. This is acceptable because the checksum scheme itself requires you to know that the MS1 prefix is being used in the first place. If the prefix is damaged and a user is guessing that the data might be using this scheme, then the user can enter the available data explicitly using the suspected MS1 prefix.

==Backwards Compatibility==

codex32 is an alternative to BIP-0039 and SLIP-0039. It is technically possible to derive the BIP-0032 master seed from seed words encoded in one of these schemes, and then to encode this seed in codex32. For BIP-0039 this process is irreversible, since it involves hashing the original words. Furthermore, the resulting seed will be 512 bits long, which may be too large to be safely and conveniently handled.

SLIP-0039 seed words can be reversibly converted to master seeds, so it is possible to interconvert between SLIP-0039 and codex32. However, SLIP-0039 '''shares''' cannot be converted to codex32 shares because the two schemes use a different underlying field.

The authors of this BIP do not recommend interconversion. Instead, users who wish to switch to codex32 should generate a fresh seed and sweep their coins.

==Reference Implementation==

* [https://secretcodex32.com/docs/2023-02-14--bw.ps Reference PostScript Implementation]
* FIXME add Python implementation
* FIXME add Rust implementation

==Test Vectors==

===Test vector 1===

This example shows the codex32 format, when used without splitting the secret into any shares. The data part contains 26 Bech32 characters, which corresponds to 130 bits. We truncate the last two bits in order to obtain a 128-bit master seed.

codex32 secret (Bech32): ms10testsxxxxxxxxxxxxxxxxxxxxxxxxxx4nzvca9cmczlw

Master secret (hex): 318c6318c6318c6318c6318c6318c631

* human-readable part: ms
* separator: 1
* k value: 0 (no secret splitting)
* identifier: test
* share index: s (the secret)
* data: xxxxxxxxxxxxxxxxxxxxxxxxxx
* checksum: 4nzvca9cmczlw

===Test vector 2===

This example shows generating a new master seed using "random" codex32 shares, as well as deriving an additional codex32 share, using ''k''=2 and an identifier of NAME. Although codex32 strings are canonically all lowercase, it's also valid to use all uppercase.

Share with index A: MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM

Share with index C: MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN

* Derived share with index D: MS12NAMEDLL4F8JLH4E5VDVULDLFXU2JHDNLSM97XVENRXEG
* Secret share with index S: MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW
* Master secret (hex): d1808e096b35b209ca12132b264662a5

Note that per BIP-0173, the lowercase form is used when determining a character's value for checksum purposes. In particular, given an all uppercase codex32 string, we still use lowercase ms as the human-readable part during checksum construction.

===Test vector 3===

This example shows splitting an existing 128-bit master seed into "random" codex32 shares, using ''k''=3 and an identifier of cash. We appended two zero bits in order to obtain 26 Bech32 characters (130 bits of data) from the 128-bit master seed.

Master secret (hex): ffeeddccbbaa99887766554433221100

Secret share with index s: ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln

Share with index a: ms13casha320zyxwvutsrqpnmlkjhgfedca2a8d0zehn8a0t

Share with index c: ms13cashcacdefghjklmnpqrstuvwxyz023949xq35my48dr

* Derived share with index d: ms13cashd0wsedstcdcts64cd7wvy4m90lm28w4ffupqs7rm
* Derived share with index e: ms13casheekgpemxzshcrmqhaydlp6yhms3ws7320xyxsar9
* Derived share with index f: ms13cashf8jh6sdrkpyrsp5ut94pj8ktehhw2hfvyrj48704

Any three of the five shares among acdef can be used to recover the secret.

Note that the choice to append two zero bits was arbitrary, and any of the following four secret shares would have been valid choices. However, each choice would have resulted in a different set of derived shares.

* ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln
* ms13cashsllhdmn9m42vcsamx24zrxgs3qpte35dvzkjpt0r
* ms13cashsllhdmn9m42vcsamx24zrxgs3qzfatvdwq5692k6
* ms13cashsllhdmn9m42vcsamx24zrxgs3qrsx6ydhed97jx2

===Test vector 4===

This example shows converting a 256-bit secret into a codex32 secret, without splitting the secret into any shares. We appended four zero bits in order to obtain 52 Bech32 characters (260 bits of data) from the 256-bit secret.

256-bit secret (hex): ffeeddccbbaa99887766554433221100ffeeddccbbaa99887766554433221100

* codex32 secret: ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma

Note that the choice to append four zero bits was arbitrary, and any of the following sixteen codex32 secrets would have been valid:

* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqpj82dp34u6lqtd
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqzsrs4pnh7jmpj5
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqrfcpap2w8dqezy
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqy5tdvphn6znrf0
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq9dsuypw2ragmel
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqx05xupvgp4v6qx
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq8k0h5p43c2hzsk
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqgum7hplmjtr8ks
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqf9q0lpxzt5clxq
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq28y48pyqfuu7le
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqt7ly0paesr8x0f
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqvrvg7pqydv5uyz
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqd6hekpea5n0y5j
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqwcnrwpmlkmt9dt
* ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq0pgjxpzx0ysaam

===Test vector 5===

This example shows generating a new 512-bit master seed using "random" codex32 characters and appending a checksum. The payload contains 103 Bech32 characters, which corresponds to 515 bits. The last three bits are discarded when converting to a 512-bit master seed.

This is an example of a '''Long codex32 String'''.

* Secret share with index S: MS100C8VSM32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06FHPV80UNDVARHRAK
* Master secret (hex): dc5423251cb87175ff8110c8531d0952d8d73e1194e95b5f19d6f9df7c01111104c9baecdfea8cccc677fb9ddc8aec5553b86e528bcadfdcc201c17c638c47e9

==Appendix==

===Mathematical Companion===

Below we use the Bech32 character set to denote values in GF[32]. In Bech32, the letter Q denotes zero and the letter P denotes one. The digits 0 and 2 through 9 do ''not'' denote their numeric values. They are simply elements of GF[32].

The generating polynomial for our BCH code is as follows.

We extend GF[32] to GF[1024] by adjoining a primitive cube root of unity, ζ, satisfying ζ^2 = ζ + P.

We select β := G ζ which has order 93, and construct the product (x - β^i) for i in {17, 20, 46, 49, 52, 77, 78, 79, 80, 81, 82, 83, 84}. The resulting polynomial is our generating polynomial for our 13 character checksum:

 x^13 + E x^12 + M x^11 + 3 x^10 + G x^9 + Q x^8 + E x^7 + E x^6 + E x^5 + L x^4 + M x^3 + C x^2 + S x + S

For our long checksum, we select γ := E + X ζ, which has order 1023, and construct the product (x - γ^i) for i in {32, 64, 96, 895, 927, 959, 991, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026}. The resulting polynomial is our generating polynomial for our 15 character checksum for long strings:

 x^15 + 0 x^14 + 2 x^13 + E x^12 + 6 x^11 + F x^10 + E x^9 + 4 x^8 + X x^7 + H x^6 + 4 x^5 + X x^4 + 9 x^3 + K x^2 + Y x^1 + H

(Reminder: the character 0 does ''not'' denote the zero of the field.)

-----END BIP-----
I've been asked by Dr. Curr and Professor Snead to forward this message to
this mailing list, as it may be of general interest to Bitcoin users.

Dear Colleague:

In 1967, during excavation for the construction of a new shopping center in
Monroeville, Pennsylvania, workers uncovered a vault containing a cache of
ancient scrolls[1]. Most were severely damaged, but those that could be
recovered confirmed the existence of a secret society long suspected to have
been active in the region around the year 200 BC.

Based on a translation of these documents, we now know that the society, the
Cult of the Bound Variable, was devoted to the careful study of computation,
over two millennia before the invention of the digital computer.

While the Monroeville scrolls make reference to computing machines made of
sandstone, most researchers believed this to be a poetic metaphor and that the
"computers" were in fact the initiates themselves, carrying out the
unimaginably tedious steps of their computations with reed pens on parchment.

Within the vault, a collection of sandstone wheels marked in a language
consisting of 32 glyphs was found. After 15 years of study, we have successfully
completed the translation of what is known as "Codex32," a document that
describes the functions of the wheels. It was discovered that the wheels operate
a system of cryptographic computations that was used by cult members to
safeguard their most valuable secrets.

The Codex32 system allows secrets to be carved into multiple tablets and
scattered to the far corners of the earth. When a sufficient number of tablets are
brought together the stone wheels are manipulated in a manner to recover the
secrets. This finding may be of particular interest to the Bitcoin community.

Below we provide a summary of the cult's secret sharing system, which is
graciously hosted at
We are requesting a record assignment in the Bibliography of Immemorial
Philosophy (BIP) repository.

Thank you for your consideration.

Dr. Leon O. Curr and Professor Pearlwort Snead
Department of Archaeocryptography
Harry Q. Bovik Institute for the Advancement

-----BEGIN BIP-----

<pre>
  BIP: ????
  Layer: Applications
  Title: codex32
  Author: Leon Olsson Curr and Pearlwort Sneed <pearlwort@wpsoftware.net>
  Status: Draft
  Type: ????
  Created: 2023-02-13
  License: BSD-3-Clause
  Post-History: FIXME
</pre>

==Introduction==

===Abstract===

This document describes a standard for backing up and restoring the master seed of a
[https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032] hierarchical deterministic wallet, using Shamir's secret sharing.
It includes an encoding format, a BCH error-correcting checksum, and algorithms for share generation and secret recovery.
Secret data can be split into up to 31 shares.
A minimum threshold of shares, which can be between 1 and 9, is needed to recover the secret, whereas without sufficient shares, no information about the secret is recoverable.

===Copyright===

This document is licensed under the 3-clause BSD license.

===Motivation===

BIP-0032 master seed data is the source entropy used to derive all private keys in an HD wallet.
Safely storing this secret data is the hardest and most important part of self-custody.
However, there is a tension between security, which demands limiting the number of backups, and resilience, which demands widely replicated backups.
Encrypting the seed does not change this fundamental tradeoff, since it leaves essentially the same problem of how to back up the encryption key(s).

To allow users freedom to make this tradeoff, we use Shamir's secret sharing, which guarantees that any number of shares less than the threshold leaks no information about the secret.
This approach allows increasing safety by widely distributing the generated shares, while also providing security against the compromise of one or more shares (as long as fewer than the threshold have been compromised).

[https://github.com/satoshilabs/slips/blob/master/slip-0039.md SLIP-0039] has essentially the same motivations as this standard.
However, unlike SLIP-0039, this standard also aims to be simple enough for hand computation.
Users who demand a higher level of security for particular secrets, or have a general distrust in digital electronic devices, have the option of using hand computation to back up and restore secret data in an interoperable manner.
Note that hand computation is optional, the particular details of hand computation are outside the scope of this standard, and implementers do not need to be concerned with this possibility.


==Specification==

===codex32===

A codex32 string reuses the base32 character set from BIP-0173, and consists of:

* A human-readable part, which is the string "ms" (or "MS").
* A separator, which is always "1".
* A data part which is in turn subdivided into:
** A threshold parameter, which MUST be a single digit between "2" and "9", or the digit "0".
*** If the threshold parameter is "0" then the share index, defined below, MUST have a value of "s" (or "S").
** An identifier consisting of 4 Bech32 characters.
** A share index, which is any Bech32 character. Note that a share index value of "s" (or "S") is special and denotes the unshared secret (see section "Unshared Secret").
** A payload which is a sequence of up to 74 Bech32 characters. (However, see '''Long codex32 Strings''' below for an exception to this limit.)
** A checksum which consists of 13 Bech32 characters as described below.

As with Bech32 strings, a codex32 string MUST be entirely uppercase or entirely lowercase.
The lowercase form is used when determining a character's value for checksum purposes.
For presentation, lowercase is usually preferable, but uppercase SHOULD be used for handwritten codex32 strings.
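
The field layout described in this section can be sketched as a small parser. This is only an illustration and not part of the specification: the name <code>parse_codex32</code> and the dictionary keys are hypothetical, the lower bound on the data-part length (an empty payload) is an assumption, and the checksum is merely separated out here, not verified.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table

def parse_codex32(s):
    """Split a regular (13-character checksum) codex32 string into its fields.

    Hypothetical helper: it checks case, prefix, character set and the
    threshold rules, but does NOT verify the checksum.
    """
    if s != s.lower() and s != s.upper():
        raise ValueError("string must be entirely uppercase or entirely lowercase")
    s = s.lower()
    if not s.startswith("ms1"):
        raise ValueError("expected human-readable part 'ms' and separator '1'")
    data = s[3:]
    if any(ch not in CHARSET for ch in data):
        raise ValueError("non-Bech32 character in data part")
    # 6 header characters + payload of up to 74 characters + 13 checksum characters
    if not 6 + 13 <= len(data) <= 93:
        raise ValueError("unsupported data part length for a regular string")
    threshold, identifier, index = data[0], data[1:5], data[5]
    if threshold not in "023456789":
        raise ValueError("threshold must be '0' or a digit from '2' to '9'")
    if threshold == "0" and index != "s":
        raise ValueError("threshold '0' requires share index 's'")
    return {
        "threshold": threshold,
        "identifier": identifier,
        "share_index": index,
        "payload": data[6:-13],
        "checksum": data[-13:],
    }

fields = parse_codex32("ms10tests" + "x" * 26 + "4nzvca9cmczlw")
print(fields)
```

Applied to the string from Test vector 1, the sketch separates threshold <code>0</code>, identifier <code>test</code>, share index <code>s</code>, a payload of 26 <code>x</code> characters, and the checksum <code>4nzvca9cmczlw</code>.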

===Checksum===

The last thirteen characters of the data part form a checksum and contain no information.
Valid strings MUST pass the criteria for validity specified by the Python3 code snippet below.
The function <code>ms32_verify_checksum</code> must return true when its argument is the data part as a list of integers representing the characters converted using the bech32 character table from BIP-0173.

To construct a valid checksum given the data-part characters (excluding the checksum), the <code>ms32_create_checksum</code> function can be used.

<source lang="python">
MS32_CONST = 0x10ce0795c2fd1e62a

def ms32_polymod(values):
    GEN = [
        0x19dc500ce73fde210,
        0x1bfae00def77fe529,
        0x1fbd920fffe7bee52,
        0x1739640bdeee3fdad,
        0x07729a039cfc75f5a,
    ]
    residue = 0x23181b3
    for v in values:
        b = (residue >> 60)
        residue = (residue & 0x0fffffffffffffff) << 5 ^ v
        for i in range(5):
            residue ^= GEN[i] if ((b >> i) & 1) else 0
    return residue

def ms32_verify_checksum(data):
    if len(data) >= 96:                      # See Long codex32 Strings
        return ms32_verify_long_checksum(data)
    if len(data) <= 93:
        return ms32_polymod(data) == MS32_CONST
    return False

def ms32_create_checksum(data):
    if len(data) > 80:                       # See Long codex32 Strings
        return ms32_create_long_checksum(data)
    values = data
    polymod = ms32_polymod(values + [0] * 13) ^ MS32_CONST
    return [(polymod >> 5 * (12 - i)) & 31 for i in range(13)]
</source>
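
As a usage illustration, the sketch below wraps the regular checksum in string-level helpers and applies it to the data from Test vector 1. It repeats the specification's polymod computation so that it runs on its own; the helper names <code>to_values</code>, <code>append_checksum</code> and <code>is_valid</code> are hypothetical, and long strings are deliberately left out.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table
MS32_CONST = 0x10ce0795c2fd1e62a
GEN = [
    0x19dc500ce73fde210,
    0x1bfae00def77fe529,
    0x1fbd920fffe7bee52,
    0x1739640bdeee3fdad,
    0x07729a039cfc75f5a,
]

def ms32_polymod(values):
    # Same computation as ms32_polymod in the specification (regular strings only).
    residue = 0x23181b3
    for v in values:
        b = residue >> 60
        residue = (residue & 0x0fffffffffffffff) << 5 ^ v
        for i in range(5):
            residue ^= GEN[i] if ((b >> i) & 1) else 0
    return residue

def to_values(chars):
    """Map Bech32 characters to integers (hypothetical helper)."""
    return [CHARSET.index(ch) for ch in chars.lower()]

def append_checksum(data_chars):
    """Append a 13-character checksum to threshold + identifier + index + payload."""
    polymod = ms32_polymod(to_values(data_chars) + [0] * 13) ^ MS32_CONST
    checksum = [(polymod >> 5 * (12 - i)) & 31 for i in range(13)]
    return data_chars.lower() + "".join(CHARSET[c] for c in checksum)

def is_valid(data_part):
    """Check a complete regular data part, checksum included."""
    return len(data_part) <= 93 and ms32_polymod(to_values(data_part)) == MS32_CONST

# Data part of Test vector 1 without its checksum: "0" + "test" + "s" + payload.
data_part = append_checksum("0tests" + "x" * 26)
print("ms1" + data_part, is_valid(data_part))
```

Note that only the data part is fed to the polymod; the human-readable part and separator are never passed in as values.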

===Error Correction===

A codex32 string without a valid checksum MUST NOT be used.
The checksum is designed to be an error correcting code that can correct up to 4 character substitutions, up to 8 unreadable characters (called erasures), or up to 13 consecutive erasures.
Implementations SHOULD provide the user with a corrected valid codex32 string if possible.
However, implementations SHOULD NOT automatically proceed with a corrected codex32 string without user confirmation of the corrected string, either by prompting the user, or returning a corrected string in an error message and allowing the user to repeat their action.
We do not specify how an implementation should implement error correction. However, we recommend that:

* Implementations make suggestions to substitute non-bech32 characters with bech32 characters in some situations, such as replacing "B" with "8", "O" with "0", "I" with "l", etc.
* Implementations interpret "?" as an erasure.
* Implementations optionally interpret other non-bech32 characters, or characters with incorrect case, as erasures.
* If a string with 8 or fewer erasures can have those erasures filled in to make a valid codex32 string, then the implementation suggests such a string as a correction.
* If a string consisting of valid Bech32 characters in the proper case can be made valid by substituting 4 or fewer characters, then the implementation suggests such a string as a correction.
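
The first three recommendations amount to a preprocessing pass before any actual decoding. A minimal sketch of such a pass follows, under stated assumptions: the name <code>normalize_data_part</code> and the <code>LOOKALIKES</code> table are hypothetical, the input is the data part only (after the <code>ms1</code> prefix), case is folded rather than checked, and no checksum-based correction is attempted.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table

# Look-alike substitutions named in the recommendations (lowercased).
LOOKALIKES = {"b": "8", "o": "0", "i": "l"}

def normalize_data_part(data):
    """Return (suggested characters, erasure positions) for a user-typed data part.

    Hypothetical helper: Bech32 characters are kept, known look-alikes are
    replaced by a suggestion, and everything else (including "?") becomes an
    erasure marked with "?".
    """
    suggestion = []
    erasures = []
    for pos, ch in enumerate(data.lower()):
        if ch in CHARSET:
            suggestion.append(ch)
        elif ch in LOOKALIKES:
            suggestion.append(LOOKALIKES[ch])
        else:
            suggestion.append("?")
            erasures.append(pos)
    return "".join(suggestion), erasures

print(normalize_data_part("0TESTSXXB?O1"))
```

The erasure positions would then be handed to whatever erasure decoder the implementation provides, and any resulting string still needs user confirmation as required above.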

===Unshared Secret===

When the share index of a valid codex32 string (converted to lowercase) is the letter "s", we call the string a codex32 secret.
The subsequent data characters in a codex32 secret, excluding the final checksum of 13 characters, are a direct encoding of a BIP-0032 HD master seed.

The master seed is decoded by converting the data to bytes:

* Translate the characters to 5-bit values using the bech32 character table from BIP-0173, most significant bit first.
* Re-arrange those bits into groups of 8 bits. Any incomplete group at the end MUST be 4 bits or less, and is discarded.

Note that unlike the decoding process in BIP-0173, we do NOT require that the incomplete group be all zeros.

For an unshared secret, the threshold parameter (the first character of the data part) is ignored (beyond the fact it must be a digit for the codex32 string to be valid).
We recommend using the digit "0" for the threshold parameter in this case.
The 4 character identifier also has no effect beyond aiding users in distinguishing between multiple different master seeds in cases where they have more than one.
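
The two decoding steps can be sketched as follows. The function name <code>payload_to_bytes</code> is hypothetical; it takes only the payload characters (after the threshold, identifier and share index, and before the checksum) and assumes the checksum has already been verified elsewhere.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table

def payload_to_bytes(payload):
    """Decode payload characters into master seed bytes (hypothetical helper)."""
    acc = 0    # bits collected so far that have not been emitted as a byte
    bits = 0   # number of valid bits in acc
    out = bytearray()
    for ch in payload.lower():
        acc = (acc << 5) | CHARSET.index(ch)   # most significant bit first
        bits += 5
        while bits >= 8:
            bits -= 8
            out.append((acc >> bits) & 0xFF)
        acc &= (1 << bits) - 1                 # keep only the leftover bits
    if bits > 4:
        raise ValueError("incomplete group at the end is longer than 4 bits")
    # The leftover bits are discarded; they are NOT required to be zero.
    return bytes(out)

print(payload_to_bytes("x" * 26).hex())
# -> 318c6318c6318c6318c6318c6318c631
```

The 26-character payload of Test vector 1 carries 130 bits, so two bits are left over and dropped, giving the 128-bit master secret listed there.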

===Recovering Master Seed===

When the share index of a valid codex32 string (converted to lowercase) is not the letter "s", we call the string a codex32 share.
The first character of the data part indicates the threshold of the share, and it is required to be a non-"0" digit.

In order to recover a master seed, one needs a set of valid codex32 shares such that:

* All shares have the same threshold value, the same identifier, and the same length.
* All of the share index values are distinct.
* The number of codex32 shares is exactly equal to the (common) threshold value.

If all the above conditions are satisfied, the <code>ms32_recover</code> function will return a codex32 secret when its argument is the list of codex32 shares with each share represented as a list of integers representing the characters converted using the bech32 character table from BIP-0173.

<source lang="python">
bech32_inv = [
    0, 1, 20, 24, 10, 8, 12, 29, 5, 11, 4, 9, 6, 28, 26, 31,
    22, 18, 17, 23, 2, 25, 16, 19, 3, 21, 14, 30, 13, 7, 27, 15,
]

def bech32_mul(a, b):
    res = 0
    for i in range(5):
        res ^= a if ((b >> i) & 1) else 0
        a *= 2
        a ^= 41 if (32 <= a) else 0
    return res

def bech32_lagrange(l, x):
    n = 1
    c = []
    for i in l:
        n = bech32_mul(n, i ^ x)
        m = 1
        for j in l:
            m = bech32_mul(m, (x if i == j else i) ^ j)
        c.append(m)
    return [bech32_mul(n, bech32_inv[i]) for i in c]

def ms32_interpolate(l, x):
    w = bech32_lagrange([s[5] for s in l], x)
    res = []
    for i in range(len(l[0])):
        n = 0
        for j in range(len(l)):
            n ^= bech32_mul(w[j], l[j][i])
        res.append(n)
    return res

def ms32_recover(l):
    return ms32_interpolate(l, 16)
</source>
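
As a usage illustration, the sketch below feeds the two shares from Test vector 2 through the interpolation routine. It repeats the specification's field arithmetic so that it runs on its own; the string helpers <code>data_values</code> and <code>to_string</code> are hypothetical, and the input shares are assumed to have passed checksum verification already.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table

bech32_inv = [
    0, 1, 20, 24, 10, 8, 12, 29, 5, 11, 4, 9, 6, 28, 26, 31,
    22, 18, 17, 23, 2, 25, 16, 19, 3, 21, 14, 30, 13, 7, 27, 15,
]

def bech32_mul(a, b):
    # Multiplication in GF(32), as in the specification.
    res = 0
    for i in range(5):
        res ^= a if ((b >> i) & 1) else 0
        a *= 2
        a ^= 41 if (32 <= a) else 0
    return res

def bech32_lagrange(l, x):
    # Lagrange weights for evaluating at x, given the share indexes in l.
    n = 1
    c = []
    for i in l:
        n = bech32_mul(n, i ^ x)
        m = 1
        for j in l:
            m = bech32_mul(m, (x if i == j else i) ^ j)
        c.append(m)
    return [bech32_mul(n, bech32_inv[i]) for i in c]

def ms32_interpolate(l, x):
    w = bech32_lagrange([s[5] for s in l], x)
    res = []
    for i in range(len(l[0])):
        n = 0
        for j in range(len(l)):
            n ^= bech32_mul(w[j], l[j][i])
        res.append(n)
    return res

def data_values(s):
    """Integers for the data part, i.e. everything after 'ms1' (hypothetical helper)."""
    return [CHARSET.index(ch) for ch in s.lower()[3:]]

def to_string(values):
    """Render data-part integers back into a lowercase codex32 string (hypothetical helper)."""
    return "ms1" + "".join(CHARSET[v] for v in values)

shares = [
    data_values("MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM"),
    data_values("MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN"),
]

secret = to_string(ms32_interpolate(shares, CHARSET.index("s")))     # same as ms32_recover
derived_d = to_string(ms32_interpolate(shares, CHARSET.index("d")))  # an additional share
print(secret.upper())
print(derived_d.upper())
```

Interpolating at index <code>s</code> is what <code>ms32_recover</code> does; interpolating at any other unused index yields a further share of the same secret. The whole data part, header and checksum included, goes through the interpolation.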

===Generating Shares===

If we already have ''t'' valid codex32 strings such that:

* All strings have the same threshold value ''t'', the same identifier, and the same length
* All of the share index values are distinct

Then we can derive additional shares with the <code>ms32_interpolate</code> function by passing it a list of exactly ''t'' of these codex32 strings, together with a fresh share index distinct from all of the existing share indexes.
The newly derived share will have the provided share index.

Once a user has generated ''n'' codex32 shares, they may discard the codex32 secret (if it exists).
The ''n'' shares form a ''t'' of ''n'' Shamir's secret sharing scheme of a codex32 secret.

There are two ways to create an initial set of ''t'' valid codex32 strings, depending on whether the user already has an existing master seed to split.

====For an existing master seed====

Before generating shares for an existing master seed, it first must be converted into a codex32 secret, as described above.
The conversion process consists of:

* Choosing a threshold value ''t'' between 2 and 9, inclusive
* Choosing a 4 bech32 character identifier
** We do not define how to choose the identifier, beyond noting that it SHOULD be distinct for every master seed the user may need to disambiguate.
* Setting the share index to "s"
* Setting the payload to a Bech32 encoding of the master seed, padded with arbitrary bits
* Generating a valid checksum in accordance with the Checksum section

Along with the codex32 secret, the user must generate ''t''-1 other codex32 shares, each with the same threshold value, the same identifier, and a distinct share index.
The set of share indexes may be chosen arbitrarily.
The payload of each of these codex32 shares is chosen uniformly at random such that it has the same length as the payload of the codex32 secret.
For each share, a valid checksum must be generated in accordance with the Checksum section.

The codex32 secret and the ''t''-1 codex32 shares form a set of ''t'' valid codex32 strings from which additional shares can be derived as described above.
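
The payload step of the conversion ("a Bech32 encoding of the master seed, padded with arbitrary bits") can be sketched as below. The name <code>seed_to_payload</code> and its <code>pad_bits</code> parameter are hypothetical; the sketch covers only the payload and leaves the header and checksum to the other steps.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table

def seed_to_payload(seed, pad_bits=0):
    """Encode seed bytes as Bech32 payload characters (hypothetical helper).

    The seed bits are written most significant bit first; the final partial
    5-bit group is completed with pad_bits, which may be any value that fits.
    """
    nbits = len(seed) * 8
    pad_len = (-nbits) % 5                      # bits needed to reach a multiple of 5
    if not 0 <= pad_bits < (1 << pad_len):
        raise ValueError("pad_bits does not fit in the padding")
    acc = (int.from_bytes(seed, "big") << pad_len) | pad_bits
    nchars = (nbits + pad_len) // 5
    return "".join(CHARSET[(acc >> 5 * (nchars - 1 - i)) & 31] for i in range(nchars))

seed = bytes.fromhex("ffeeddccbbaa99887766554433221100")
for pad in range(4):
    print(pad, seed_to_payload(seed, pad))
```

For the 128-bit seed of Test vector 3, the four possible paddings give payloads that differ only in their last character (<code>q</code>, <code>p</code>, <code>z</code>, <code>r</code>), matching the four secrets listed there.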

====For a fresh master seed====

In the case that the user wishes to generate a fresh master seed, the user chooses a threshold value ''t'' and an identifier, then generates ''t'' random codex32 shares, using the generation procedure from the previous section.
As before, each share must have the same threshold value ''t'', the same identifier, and a distinct share index.

With this set of ''t'' codex32 shares, new shares can be derived as discussed above. This process generates a fresh master seed, whose value can be retrieved by running the recovery process on any ''t'' of these shares.

===Long codex32 Strings===

The 13 character checksum design only supports up to 80 data characters.
Excluding the threshold, identifier and index characters, this limits the payload to 74 characters or 46 bytes.
While this is enough to support the 32-byte advised size of BIP-0032 master seeds, BIP-0032 allows seeds to be up to 64 bytes in size.
We define a long codex32 string format to support these longer seeds by defining an alternative checksum.

<source lang="python">
MS32_LONG_CONST = 0x43381e570bf4798ab26

def ms32_long_polymod(values):
    GEN = [
        0x3d59d273535ea62d897,
        0x7a9becb6361c6c51507,
        0x543f9b7e6c38d8a2a0e,
        0x0c577eaeccf1990d13c,
        0x1887f74f8dc71b10651,
    ]
    residue = 0x23181b3
    for v in values:
        b = (residue >> 70)
        residue = (residue & 0x3fffffffffffffffff) << 5 ^ v
        for i in range(5):
            residue ^= GEN[i] if ((b >> i) & 1) else 0
    return residue

def ms32_verify_long_checksum(data):
    return ms32_long_polymod(data) == MS32_LONG_CONST

def ms32_create_long_checksum(data):
    values = data
    polymod = ms32_long_polymod(values + [0] * 15) ^ MS32_LONG_CONST
    return [(polymod >> 5 * (14 - i)) & 31 for i in range(15)]
</source>

A long codex32 string follows the same specification as a regular codex32 string with the following changes.

* The payload is a sequence of between 75 and 103 Bech32 characters.
* The checksum consists of 15 Bech32 characters as defined above.

A codex32 string with a data part of 94 or 95 characters is never legal, as a regular codex32 string is limited to 93 data characters and a long codex32 string is at least 96 characters.
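
The length rules for the two formats can be summarized in a small sketch. The function name <code>classify_data_length</code> is hypothetical, and the lower bound for regular strings (an empty payload) and the upper bound for long strings (derived from the 103-character payload limit) are assumptions that the checksum code above does not itself enforce.

```python
def classify_data_length(n):
    """Classify a data-part length (characters after 'ms1') as 'regular', 'long' or None.

    Hypothetical helper: 6 header characters (threshold, identifier, share
    index) plus payload plus checksum.
    """
    if 6 + 0 + 13 <= n <= 6 + 74 + 13:       # payload of 0..74, 13-character checksum
        return "regular"
    if 6 + 75 + 15 <= n <= 6 + 103 + 15:     # payload of 75..103, 15-character checksum
        return "long"
    return None

for n in (45, 93, 94, 95, 96, 124):
    print(n, classify_data_length(n))
```

The gap at 94 and 95 falls out of the arithmetic: the longest regular data part is 6 + 74 + 13 = 93 characters and the shortest long one is 6 + 75 + 15 = 96.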

Generation of long shares and recovery of the master seed from long shares proceeds in exactly the same way as for regular shares with the <code>ms32_interpolate</code> function.

The long checksum is designed to be an error correcting code that can correct up to 4 character substitutions, up to 8 unreadable characters (called erasures), or up to 15 consecutive erasures.
As with regular checksums we do not specify how an implementation should implement error correction, and all our recommendations for error correction of regular codex32 strings also apply to long codex32 strings.

==Rationale==

This scheme is based on the observation that the Lagrange interpolation of valid codewords in a BCH code will always be a valid codeword.
This means that derived shares will always have a valid checksum, and a sufficient threshold of shares with valid checksums will derive a secret with a valid checksum.

The header system is also compatible with Lagrange interpolation, meaning all derived shares will have the same identifier and will have the appropriate share index.
This fact allows the header data to be covered by the checksum.

The checksum size and identifier size have been chosen so that the encoding of 128-bit seeds and shares fits within 48 characters.
This is a standard size for many common seed storage formats, which has been popularized by the 12 four-letter word format of the BIP-0039 mnemonic.

The 13 character checksum is adequate to correct 4 errors in up to 93 characters (80 characters of data and 13 characters of the checksum). This is somewhat better quality than the checksum used in SLIP-0039.

For 256-bit seeds and shares our strings are 74 characters, which fits into the 96 character format of the 24 four-letter word format of the BIP-0039 mnemonic, with plenty of room to spare.

A longer checksum is needed to support up to 512-bit seeds, the longest seed length specified in BIP-0032, as the 13 character checksum isn't adequate for more than 80 data characters.
While we could use the 15 character checksum for both cases, we prefer to keep the strings as short as possible for the more common cases of 128-bit and 256-bit master seeds.
We only guarantee to correct 4 characters no matter how long the string is.
Longer strings mean more chances for transcription errors, so shorter strings are better.

The longest data part using the regular 13 character checksum is 93 characters and corresponds to a 400-bit secret.
At this length, the prefix <code>MS1</code> is not covered by the checksum.
This is acceptable because the checksum scheme itself requires you to know that the <code>MS1</code> prefix is being used in the first place.
If the prefix is damaged and a user is guessing that the data might be using this scheme, then the user can enter the available data explicitly using the suspected <code>MS1</code> prefix.

==Backwards Compatibility==

codex32 is an alternative to BIP-0039 and SLIP-0039.
It is technically possible to derive the BIP32 master seed from seed words encoded in one of these schemes, and then to encode this seed in codex32.
For BIP-0039 this process is irreversible, since it involves hashing the original words.
Furthermore, the resulting seed will be 512 bits long, which may be too large to be safely and conveniently handled.

SLIP-0039 seed words can be reversibly converted to master seeds, so it is possible to interconvert between SLIP-0039 and codex32.
However, SLIP-0039 '''shares''' cannot be converted to codex32 shares because the two schemes use a different underlying field.
The authors of this BIP do not recommend interconversion.
Instead, users who wish to switch to codex32 should generate a fresh seed and sweep their coins.

==Reference Implementation==

* [https://secretcodex32.com/docs/2023-02-14--bw.ps Reference PostScript Implementation]
* FIXME add Python implementation
* FIXME add Rust implementation

==Test Vectors==

===Test vector 1===

This example shows the codex32 format, when used without splitting the secret into any shares.
The payload contains 26 Bech32 characters, which corresponds to 130 bits. We truncate the last two bits in order to obtain a 128-bit master seed.

codex32 secret (Bech32): <code>ms10testsxxxxxxxxxxxxxxxxxxxxxxxxxx4nzvca9cmczlw</code>

Master secret (hex): <code>318c6318c6318c6318c6318c6318c631</code>

* human-readable part: <code>ms</code>
* separator: <code>1</code>
* k value: <code>0</code> (no secret splitting)
* identifier: <code>test</code>
* share index: <code>s</code> (the secret)
* payload: <code>xxxxxxxxxxxxxxxxxxxxxxxxxx</code>
* checksum: <code>4nzvca9cmczlw</code>

===Test vector 2===

This example shows generating a new master seed using "random" codex32 shares, as well as deriving an additional codex32 share, using ''k''=2 and an identifier of <code>NAME</code>.
Although codex32 strings are canonically all lowercase, it's also valid to use all uppercase.

Share with index <code>A</code>: <code>MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM</code>

Share with index <code>C</code>: <code>MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN</code>

* Derived share with index <code>D</code>: <code>MS12NAMEDLL4F8JLH4E5VDVULDLFXU2JHDNLSM97XVENRXEG</code>
* Secret share with index <code>S</code>: <code>MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW</code>
* Master secret (hex): <code>d1808e096b35b209ca12132b264662a5</code>

Note that per BIP-0173, the lowercase form is used when determining a character's value for checksum purposes.
In particular, given an all uppercase codex32 string, we still use lowercase <code>ms</code> as the human-readable part during checksum construction.

===Test vector 3===

This example shows splitting an existing 128-bit master seed into "random" codex32 shares, using ''k''=3 and an identifier of <code>cash</code>.
We appended two zero bits in order to obtain 26 Bech32 characters (130 bits of data) from the 128-bit master seed.

Master secret (hex): <code>ffeeddccbbaa99887766554433221100</code>

Secret share with index <code>s</code>: <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln</code>

Share with index <code>a</code>: <code>ms13casha320zyxwvutsrqpnmlkjhgfedca2a8d0zehn8a0t</code>

Share with index <code>c</code>: <code>ms13cashcacdefghjklmnpqrstuvwxyz023949xq35my48dr</code>

* Derived share with index <code>d</code>: <code>ms13cashd0wsedstcdcts64cd7wvy4m90lm28w4ffupqs7rm</code>
* Derived share with index <code>e</code>: <code>ms13casheekgpemxzshcrmqhaydlp6yhms3ws7320xyxsar9</code>
* Derived share with index <code>f</code>: <code>ms13cashf8jh6sdrkpyrsp5ut94pj8ktehhw2hfvyrj48704</code>

Any three of the five shares among <code>acdef</code> can be used to recover the secret.

Note that the choice to append two zero bits was arbitrary, and any of the following four secret shares would have been valid choices.
However, each choice would have resulted in a different set of derived shares.

* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln</code>
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qpte35dvzkjpt0r</code>
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qzfatvdwq5692k6</code>
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qrsx6ydhed97jx2</code>

===Test vector 4===

This example shows converting a 256-bit secret into a codex32 secret, without splitting the secret into any shares.
We appended four zero bits in order to obtain 52 Bech32 characters (260 bits of data) from the 256-bit secret.

256-bit secret (hex): <code>ffeeddccbbaa99887766554433221100ffeeddccbbaa99887766554433221100</code>

* codex32 secret: <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma</code>

Note that the choice to append four zero bits was arbitrary, and any of the following sixteen codex32 secrets would have been valid:

* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqpj82dp34u6lqtd</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqzsrs4pnh7jmpj5</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqrfcpap2w8dqezy</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqy5tdvphn6znrf0</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq9dsuypw2ragmel</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqx05xupvgp4v6qx</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq8k0h5p43c2hzsk</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqgum7hplmjtr8ks</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqf9q0lpxzt5clxq</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq28y48pyqfuu7le</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqt7ly0paesr8x0f</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqvrvg7pqydv5uyz</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqd6hekpea5n0y5j</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqwcnrwpmlkmt9dt</code>
* <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq0pgjxpzx0ysaam</code>

===Test vector 5===

This example shows generating a new 512-bit master seed using "random" codex32 characters and appending a checksum.
The payload contains 103 Bech32 characters, which corresponds to 515 bits. The last three bits are discarded when converting to a 512-bit master seed.

This is an example of a '''Long codex32 String'''.

* Secret share with index <code>S</code>: <code>MS100C8VSM32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06FHPV80UNDVARHRAK</code>
* Master secret (hex): <code>dc5423251cb87175ff8110c8531d0952d8d73e1194e95b5f19d6f9df7c01111104c9baecdfea8cccc677fb9ddc8aec5553b86e528bcadfdcc201c17c638c47e9</code>

==Appendix==

===Mathematical Companion===

Below we use the Bech32 character set to denote values in GF[32].
In Bech32, the letter <code>Q</code> denotes zero and the letter <code>P</code> denotes one.
The digits <code>0</code> and <code>2</code> through <code>9</code> do ''not'' denote their numeric values.
They are simply elements of GF[32].
The generating polynomial for our BCH code is as follows.

We extend GF[32] to GF[1024] by adjoining a primitive cube root of unity, <code>ζ</code>, satisfying <code>ζ^2 = ζ + P</code>.

We select <code>β := G ζ</code> which has order 93, and construct the product <code>(x - β^i)</code> for <code>i</code> in <code>{17, 20, 46, 49, 52, 77, 78, 79, 80, 81, 82, 83, 84}</code>.
The resulting polynomial is our generating polynomial for our 13 character checksum:

    x^13 + E x^12 + M x^11 + 3 x^10 + G x^9 + Q x^8 + E x^7 + E x^6 + E x^5 + L x^4 + M x^3 + C x^2 + S x + S

For our long checksum, we select <code>γ := E + X ζ</code>, which has order 1023, and construct the product <code>(x - γ^i)</code> for <code>i</code> in <code>{32, 64, 96, 895, 927, 959, 991, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026}</code>.
The resulting polynomial is our generating polynomial for our 15 character checksum for long strings:

    x^15 + 0 x^14 + 2 x^13 + E x^12 + 6 x^11 + F x^10 + E x^9 + 4 x^8 + X x^7 + H x^6 + 4 x^5 + X x^4 + 9 x^3 + K x^2 + Y x^1 + H

(Reminder: the character <code>0</code> does ''not'' denote the zero of the field.)
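
The connection between these polynomials and the checksum code can be illustrated with a short sketch. It unpacks the first entry of each <code>GEN</code> table from the checksum snippets into 5-bit field elements and prints them in the Bech32 notation used here; the helper name <code>unpack_coefficients</code> is hypothetical, and reading <code>GEN[0]</code> as the packed non-leading coefficients is an observation about the constants rather than something the text above states.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # BIP-0173 character table

def unpack_coefficients(packed, count):
    """Split an integer into `count` 5-bit field elements, most significant first,
    and render them as Bech32 characters (hypothetical helper)."""
    return "".join(CHARSET[(packed >> 5 * (count - 1 - i)) & 31] for i in range(count))

# First GEN entry of ms32_polymod and of ms32_long_polymod, respectively.
print(unpack_coefficients(0x19dc500ce73fde210, 13).upper())    # EM3GQEEELMCSS
print(unpack_coefficients(0x3d59d273535ea62d897, 15).upper())  # 02E6FE4XH4X9KYH

# Character values: Q is zero, P is one, and the digit 0 is not zero.
print(CHARSET.index("q"), CHARSET.index("p"), CHARSET.index("0"))  # 0 1 15
```

Read left to right, the two strings are the coefficients of x^12 down to x^0 and of x^14 down to x^0 in the polynomials above, with the leading coefficient of one left implicit.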

-----END BIP-----
--000000000000ed832605f4c7c775--