From: Pieter Wuille
Date: Tue, 26 Jun 2018 13:30:04 -0700
To: matejcik, Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] BIP 174 thoughts
References: <21a616f5-7a17-35b9-85ea-f779f20a6a2d@satoshilabs.com> <20180621195654.GC99379@coinkite.com>

On Tue, Jun 26, 2018 at 8:33 AM, matejcik via bitcoin-dev wrote:
> I'm still going to argue against the key-value model though.
>
> It's true that this is not significant in terms of space. But I'm more
> concerned about human readability, i.e., confusing future implementers.
> At this point, the key-value model is there "for historical reasons",
> except these aren't valid even before finalizing the format. The
> original rationale for using key-values seems to be gone (no key-based
> lookups are necessary). As for combining and deduplication, whether key
> data is present or not is now purely a stand-in for a "repeatable" flag.
> We could just as easily say, e.g., that the high bit of "type" specifies
> whether this record can be repeated.

I understand this is a philosophical point, but to me it's the
opposite. The file conveys "the script is X", "the signature for key X
is Y", "the derivation for key X is Y" - all extra metadata added to
inputs of the form "the X is Y".
In a typed record model, you still have Xes, but they are restricted
to a single number (the record type). In cases where that is
insufficient, your solution is adding a repeatable flag to switch from
"the first byte needs to be unique" to "the entire record needs to be
unique". Why just those two? It seems much more natural to have a
length that directly tells you how many of the first bytes need to be
unique (which brings you back to the key-value model).

Since the redundant script hashes were removed by making the scripts
per-input, I think the most compelling reason (size advantages) for a
record based model is gone.

> (Moreover, as I wrote previously, the Combiner seems like a weirdly
> placed role. I still don't see its significance and why is it important
> to correctly combine PSBTs by agents that don't understand them. If you
> have a usecase in mind, please explain.

Forward compatibility with new script types. A transaction may spend
inputs from different outputs, with different script types. Perhaps
some of these are highly specialized things only implemented by some
software (say HTLCs of a particular structure), in non-overlapping
ways where no piece of software can handle all scripts involved in a
single transaction. If Combiners cannot deal with unknown fields, they
won't be able to deal with unknown scripts. That would mean that
combining must be done independently by Combiner implementations for
each script type involved. As this is easily avoided by adding a
slight bit of structure (parts of the fields that need to be unique -
"keys"), this seems the preferable option.

> ISTM a Combiner could just as well combine based on whole-record
> uniqueness, and leave the duplicate detection to the Finalizer. In case
> the incoming PSBTs have incompatible unique fields, the Combiner would
> have to fail anyway, so the Finalizer might as well do it. Perhaps it
> would be good to leave out the Combiner role entirely?)

No, a Combiner can pick any of the values in case different PSBTs have
different values for the same key. That's the point: by having a
key-value structure the choice of fields can be made such that
Combiners don't need to care about the contents. Finalizers do need to
understand the contents, but they only operate once at the end.
Combiners may be involved in any PSBT passing from one entity to
another.

> There's two remaining types where key data is used: BIP32 derivations
> and partial signatures. In case of BIP32 derivation, the key data is
> redundant ( pubkey = derive(value) ), so I'd argue we should leave that
> out and save space. In case of partial signatures, it's simple enough to
> make the pubkey part of the value.

In case of BIP32 derivation, computing the pubkeys is possibly
expensive. A simple signer can choose to just sign with whatever keys
are present, but they're not the only way to implement a signer, and
even less the only software interacting with this format. Others may
want to use a matching approach to find keys that are relevant;
without pubkeys in the format, they're forced to perform derivations
for all keys present.

And yes, it's simple enough to make the key part of the value
everywhere, but in that case it becomes legal for a PSBT to contain
multiple signatures for a key, for example, and all software needs to
deal with that possibility. With a stronger uniqueness constraint,
only Combiners need to deal with repetitions.

> Thing is: BIP174 *is basically protobuf* (v2) as it stands. If I'm
> successful in convincing you to switch to a record set model, it's going
> to be "protobuf with different varint".

If you take the records model, and then additionally drop the
whole-record uniqueness constraint, yes, though that seems pushing it
a bit by moving even more guarantees from the file format to
application level code.

I'd like to hear opinions of other people who have worked on
implementations about changing this.

Cheers,

-- 
Pieter
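
As a rough illustration of the Combiner argument above, the following is a minimal sketch, not taken from BIP 174 or any implementation. It assumes a hypothetical in-memory model in which each PSBT map is a dict from raw key bytes to raw value bytes; the name `combine_maps` and the type bytes in the usage example are made up for illustration. It only shows that merging by key needs no knowledge of what the fields mean.

```python
from typing import Dict, Iterable

# Hypothetical model: one PSBT map as raw key bytes -> raw value bytes.
PsbtMap = Dict[bytes, bytes]


def combine_maps(maps: Iterable[PsbtMap]) -> PsbtMap:
    """Merge key-value maps without interpreting keys or values."""
    combined: PsbtMap = {}
    for psbt_map in maps:
        for key, value in psbt_map.items():
            # On a key collision, keep the first value seen. Any choice
            # would do here, since a Combiner may pick any of the values.
            combined.setdefault(key, value)
    return combined


# Usage: the type bytes (0x02, 0xf0) are illustrative only.
a = {b"\x02" + b"A" * 33: b"sigA", b"\xf0unknown": b"v1"}
b = {b"\x02" + b"B" * 33: b"sigB", b"\xf0unknown": b"v2"}
merged = combine_maps([a, b])
```

Both signature records survive because their keys differ, while the field of unknown type is deduplicated purely by key, which is the forward-compatibility property argued for above.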