Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 362B2D66 for ; Wed, 27 Jun 2018 15:06:54 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-ot0-f176.google.com (mail-ot0-f176.google.com [74.125.82.176]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 0A172734 for ; Wed, 27 Jun 2018 15:06:52 +0000 (UTC) Received: by mail-ot0-f176.google.com with SMTP id a6-v6so2563177otf.2 for ; Wed, 27 Jun 2018 08:06:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=DE/JEvDouhZUyWNJHMEk8jEZ7Ij9nMzQDVa8xC1p/uQ=; b=VSzr5qSTzGM+Firb+/6R55NpNQnHWebHoduOaOEEoNWtPzosLskQ6AamLRoJ/xVqlf jTUuPxxKEWBluAR/gEIB/exZLcCrw7t1L1IdlstLkyRsOMyLk2viIR9+XQfJFWrVzboX BMv+J9b1rHl5lqWl7usTvlMH7F5Uq/yXTVIIJ+FX3eTjlPSoPrlaIp9SbPGD35gHlVYU 2DQzCDMPGdXTfRnWEV4ITk6JcHXvBjDv8j+4wUQPGk0Hp8YuESICl9jZsjlV5nOihshX n6OLjZCxwcZKClg+v/SFaWauHTYOTN8EAM8uqNlCs/P4Dg/4Ejkkq4NdIV9422CSn65C ZdAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=DE/JEvDouhZUyWNJHMEk8jEZ7Ij9nMzQDVa8xC1p/uQ=; b=SQmwxpXfdipiugsQqYYtlcjs1qjrLPzhqFSnYPE0s5ZDJWUVUNrs42DwbDrRC93C5H m/Ib1ZhR/jnd+Xl9iU+ItYsi6aJdMvm95roIeaFVzWiqcyvklUquUXfP9U4LF6YvrBpb MOIRA75ivV/D610zz+k+qiP6qsfFzSAN1lItrZJWUgKGjfgk7lGrODyJEcex5fA63LDN uzgNl1zqCxopkFx8oOV7g+MZ0XdaVgnuheQjH9QsCCIb2E0oWEzzXJxGrSBxzT7WXyR+ aex8OycCeaByX611VddLDXweO4YEY5x7i6el8xmplEuUZENeCIA6DwRUfzRbvsh0I7/Y JYjQ== X-Gm-Message-State: APt69E09QuWev6a1WhYxvzBBTEONjiWtdEBaHeCdaQttHLWAvigcjIJ3 /gQmsMvMgCrvIJBBpfkTJeY9DqhuA2hmIGNaeCEkhQ== X-Google-Smtp-Source: AAOMgpcqMdfcwYRPnnbnWk6FdYYj23nbQJrogmapXxRRjwLez62/FtYFfiQgPxLeD+QiMbHtAOSz3AlbMQHlVzRnIio= X-Received: by 2002:a9d:5d18:: with SMTP id 
b24-v6mr3771186oti.227.1530112012019; Wed, 27 Jun 2018 08:06:52 -0700 (PDT)
MIME-Version: 1.0
References: <21a616f5-7a17-35b9-85ea-f779f20a6a2d@satoshilabs.com>
 <20180621195654.GC99379@coinkite.com>
 <881def14-696c-3207-cf6c-49f337ccf0d1@satoshilabs.com>
In-Reply-To: <881def14-696c-3207-cf6c-49f337ccf0d1@satoshilabs.com>
From: Pieter Wuille
Date: Wed, 27 Jun 2018 08:06:39 -0700
Message-ID:
To: matejcik
Content-Type: multipart/alternative; boundary="000000000000f16fc2056fa0f670"
X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE
 autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 smtp1.linux-foundation.org
Cc: Bitcoin Dev
Subject: Re: [bitcoin-dev] BIP 174 thoughts
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 27 Jun 2018 15:06:54 -0000

--000000000000f16fc2056fa0f670
Content-Type: text/plain; charset="UTF-8"

On Wed, Jun 27, 2018, 07:04 matejcik wrote:

> hello,
>
> On 26.6.2018 22:30, Pieter Wuille wrote:
> >> (Moreover, as I wrote previously, the Combiner seems like a weirdly
> >> placed role. I still don't see its significance and why is it important
> >> to correctly combine PSBTs by agents that don't understand them. If you
> >> have a usecase in mind, please explain.
> >
> > Forward compatibility with new script types. A transaction may spend
> > inputs from different outputs, with different script types. Perhaps
> > some of these are highly specialized things only implemented by some
> > software (say HTLCs of a particular structure), in non-overlapping
> > ways where no piece of software can handle all scripts involved in a
> > single transaction. If Combiners cannot deal with unknown fields, they
> > won't be able to deal with unknown scripts.
>
> Record-based Combiners *can* deal with unknown fields. Either by
> including both versions, or by including one selected at random. This is
> the same in k-v model.
>

Yes, I wasn't claiming otherwise. This was just a response to your question
why it is important that Combiners can process unknown fields. It is not an
argument in favor of one model or the other.

> > combining must be done independently by Combiner implementations for
> > each script type involved. As this is easily avoided by adding a
> > slight bit of structure (parts of the fields that need to be unique -
> > "keys"), this seems the preferable option.
>
> IIUC, you're proposing a "semi-smart Combiner" that understands and
> processes some fields but not others? That doesn't seem to change
> things. Either the "dumb" combiner throws data away before the "smart"
> one sees it, or it needs to include all of it anyway.
>

No, I'm exactly arguing against smartness in the Combiner. It should always
be possible to implement a Combiner without any script specific logic.

> > No, a Combiner can pick any of the values in case different PSBTs have
> > different values for the same key. That's the point: by having a
> > key-value structure the choice of fields can be made such that
> > Combiners don't need to care about the contents. Finalizers do need to
> > understand the contents, but they only operate once at the end.
> > Combiners may be involved in any PSBT passing from one entity to
> > another.
>
> Yes. Combiners don't need to care about the contents.
> So why is it important that a Combiner properly de-duplicates the case
> where keys are the same but values are different? This is a job that,
> AFAICT so far, can be safely left to someone along the chain who
> understands that particular record.
>

That's because PSBTs can be copied, signed, and combined back together.
A Combiner which does not deduplicate (at all) would end up having every
original record present N times, one for each copy, a possibly large blowup.

For all fields I can think of right now, that type of deduplication can be
done through whole-record uniqueness.

The question whether you need whole-record uniqueness or specified-length
uniqueness (=what is offered by a key-value model) is a philosophical one
(as I mentioned before). I have a preference for stronger invariants on the
file format, so that it becomes illegal for a PSBT to contain multiple
signatures for the same key for example, and implementations do not need to
deal with the case where multiple are present.

> It seems that you consider the latter PSBT "invalid". But it is well
> formed and doesn't contain duplicate records. A Finalizer, or a
> different Combiner that understands field F, can as well have the rule
> "throw away all but one" for this case.
>

It's not about considering. We're writing a specification. Either it is made
invalid, or not.

In a key-value model you can have dumb combiners that must pick one of the
keys in case of duplication, and remove the necessity of dealing with
duplication from all other implementations (which I consider to be a good
thing). In a record-based model you cannot guarantee deduplication of records
that permit repetition per type, because a dumb combiner cannot understand
what part is supposed to be unique. As a result, a record-based model forces
you to let all implementations deal with e.g. multiple partial signatures for
a single key. This is a minor issue, but in my view shows how records are a
less than perfect match for the problem at hand.

> To repeat and restate my central question:
> Why is it important, that an agent which doesn't understand a particular
> field structure, can nevertheless make decisions about its inclusion or
> omission from the result (based on a repeated prefix)?
>

Again, because otherwise you may need a separate Combiner for each type of
script involved. That would be unfortunate, and is very easily avoided.

> Actually, I can imagine the opposite: having fields with same "key"
> (identifying data), and wanting to combine their "values" intelligently
> without losing any of the data. Say, two Signers producing separate
> parts of a combined-signature under the same common public key?
>

That can always be avoided by using different identifying information as key
for these fields. In your example, assuming you're talking about some form of
threshold signature scheme, every party has their own "shard" of the key,
which still uniquely identifies the participant. If they have no data that is
unique to the participant, they are clones, and don't need to interact
regardless.

> > In case of BIP32 derivation, computing the pubkeys is possibly
> > expensive. A simple signer can choose to just sign with whatever keys
> > are present, but they're not the only way to implement a signer, and
> > even less the only software interacting with this format. Others may
> > want to use a matching approach to find keys that are relevant;
> > without pubkeys in the format, they're forced to perform derivations
> > for all keys present.
>
> I'm going to search for relevant keys by comparing master fingerprint; I
> would expect HWWs generally don't have index based on leaf pubkeys.
> OTOH, Signers with lots of keys probably aren't resource-constrained and
> can do the derivations in case of collisions.
>

Perhaps you want to avoid signing with keys that are already signed with? If
you need to derive all the keys before even knowing what was already signed
with, you've already performed 80% of the work.

> > If you take the records model, and then additionally drop the
> > whole-record uniqueness constraint, yes, though that seems pushing it
> > a bit by moving even more guarantees from the file format to
> > application level code.
>
> The "file format" makes no guarantees, because the parsing code and
> application code is the same anyway. You could say I'm proposing to
> separate these concerns ;)
>

Of course a file format can make guarantees. If certain combinations of data
in it do not satisfy the specification, the file is illegal, and
implementations do not need to deal with it. Stricter file formats are easier
to deal with, because there are fewer edge cases to consider.

To your point: proto v2 afaik has no way to declare "whole record
uniqueness", so either you drop that (which I think is unacceptable - see the
copy/sign/combine argument above), or you deal with it in your application
code.

Cheers,

--
Pieter

--000000000000f16fc2056fa0f670
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000f16fc2056fa0f670--
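
The copy/sign/combine argument in the message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from BIP 174 or from any implementation: the function names, the `bytes`-based key/value layout, the type bytes, and the sample data are all hypothetical. It contrasts a "dumb" key-value Combiner, which keeps one value per key without looking at contents, with a "dumb" record Combiner, which can only drop byte-for-byte identical records.

```python
from typing import Dict, Iterable, List, Set

Key = bytes
Value = bytes
Record = bytes


def combine_key_value(psbts: Iterable[Dict[Key, Value]]) -> Dict[Key, Value]:
    """Dumb key-value Combiner: one value per key, contents never inspected."""
    merged: Dict[Key, Value] = {}
    for psbt in psbts:
        for key, value in psbt.items():
            # On a conflict any value may be kept; this sketch keeps the first.
            merged.setdefault(key, value)
    return merged


def combine_records(psbts: Iterable[List[Record]]) -> List[Record]:
    """Dumb record Combiner: can only drop byte-for-byte identical records."""
    seen: Set[Record] = set()
    merged: List[Record] = []
    for psbt in psbts:
        for record in psbt:
            if record not in seen:
                seen.add(record)
                merged.append(record)
    return merged


# Hypothetical encoding: key = type byte + identifying data, value = payload.
UTXO_KEY = b"\x00"
PUBKEY = b"\x02" + b"\xaa" * 33  # type byte followed by a made-up pubkey

# Two copies of one PSBT, each signed separately under the same pubkey.
copy_a = {UTXO_KEY: b"utxo", PUBKEY: b"sig-1"}
copy_b = {UTXO_KEY: b"utxo", PUBKEY: b"sig-2"}

kv = combine_key_value([copy_a, copy_b])

# The same data flattened into opaque records (key and value concatenated).
rec = combine_records([[k + v for k, v in c.items()] for c in (copy_a, copy_b)])
```

With these inputs the key-value result holds a single signature for the pubkey, while the record result drops the repeated UTXO record but still carries both signatures, so every later consumer has to handle the duplicate itself.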