Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of gmail.com
	designates 209.85.214.170 as permitted sender)
	client-ip=209.85.214.170; envelope-from=pieter.wuille@gmail.com;
	helo=mail-ob0-f170.google.com; 
MIME-Version: 1.0
In-Reply-To: <CALxbBHWyhTTbRAdWbeGp+fdKRif=ghh7W6eP-vak7jHdot6uJA@mail.gmail.com>
References: <CALxbBHUnt7ToVK9reH6W6uT4HV=7NbxGHyNWWa-OEHg+Z1+qOg@mail.gmail.com>
	<CAPg+sBggj382me1ATDx4SS9KHVfvX5KH7ZhLHN6B+2_a+Emw1Q@mail.gmail.com>
	<CALxbBHU-0huAs_y3cZCfmKKAAq3LHut8DwdSGm+1Rym3pb9j2A@mail.gmail.com>
	<CAPg+sBjX=u4Osbzr+25w-5QzzhWGKryzW2K-0Xu3gS0eJXUUDw@mail.gmail.com>
	<CALxbBHWyhTTbRAdWbeGp+fdKRif=ghh7W6eP-vak7jHdot6uJA@mail.gmail.com>
Date: Wed, 13 May 2015 12:40:54 -0700
Message-ID: <CAPg+sBh=KGLMyRmLfPujNuPnfmwpcsC2F8McypyTkCAcj=EaXw@mail.gmail.com>
From: Pieter Wuille <pieter.wuille@gmail.com>
To: Christian Decker <decker.christian@gmail.com>
Content-Type: multipart/alternative; boundary=089e011618b2170ece0515fbca51
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] [BIP] Normalized Transaction IDs
Precedence: list

--089e011618b2170ece0515fbca51
Content-Type: text/plain; charset=ISO-8859-1

On Wed, May 13, 2015 at 12:14 PM, Christian Decker <
decker.christian@gmail.com> wrote:

>
> On Wed, May 13, 2015 at 8:40 PM Pieter Wuille <pieter.wuille@gmail.com>
> wrote:
>
>> On Wed, May 13, 2015 at 11:04 AM, Christian Decker <
>> decker.christian@gmail.com> wrote:
>>
>>> If the inputs to my transaction have been long confirmed I can be
>>> reasonably safe in assuming that the transaction hash does not change
>>> anymore. It's true that I have to be careful not to build on top of
>>> transactions that use legacy references to transactions that are
>>> unconfirmed or have few confirmations, however that does not invalidate the
>>> utility of the normalized transaction IDs.
>>>
>>
>> Sufficient confirmations help of course, but make systems like this less
>> useful for more complex interactions where you have multiple unconfirmed
>> transactions waiting on each other. I think being able to rely on this
>> problem being solved unconditionally is what makes the proposal attractive.
>> For the simple cases, see BIP62.
>>
>
> If we are building a long running contract using a complex chain of
> transactions, or multiple transactions that depend on each other, there is
> no point in ever using any malleable legacy transaction IDs and I would
> simply stop cooperating if you tried. I don't think your argument applies.
> If we build our contract using only normalized transaction IDs there is no
> way of suffering any losses due to malleability.
>

That's correct as long as you stay within your contract, but you likely
want compatibility with other software, without waiting an age before and
after your contract settles on the chain. It's a weaker argument, though, I
agree.

>>> I remember reading about the SIGHASH proposal somewhere. It feels really
>>> hackish to me: It is a substantial change to the way signatures are
>>> verified, I cannot really see how this is a softfork if clients that did
>>> not update are unable to verify transactions using that SIGHASH Flag and it
>>> is adding more data (the normalized hash) to the script, which has to be
>>> stored as part of the transaction. It may be true that a node observing
>>> changes in the input transactions of a transaction using this flag could
>>> fix the problem, however it requires the node's intervention.
>>>
>>
>> I think you misunderstand the idea. This is related, but orthogonal to
>> the ideas about extending the sighash flags that have been discussed here
>> before.
>>
>> All it's doing is adding a new CHECKSIG operator to script, which, in its
>> internally used signature hash, 1) removes the scriptSigs from transactions
>> before hashing 2) replaces the txids in txins by their ntxid. It does not
>> add any data to transactions, and it is a softfork, because it only impacts
>> scripts which actually use the new CHECKSIG operator. Wallets that don't
>> support signing with this new operator would not give out addresses that
>> use it.
>>
>
> In that case I don't think I heard this proposal before, and I might be
> missing out :-)
> So if transaction B spends an output from A, then the input from B
> contains the CHECKSIG operator telling the validating client to do what
> exactly? It appears that it wants us to go and fetch A, normalize it, put
> the normalized hash in the txIn of B and then continue the validation?
> Wouldn't that also need a mapping from the normalized transaction ID to the
> legacy transaction ID that was confirmed?
>

There would just be an OP_CHECKAWESOMESIG, which can do anything. It can be
identical to how OP_CHECKSIG works now, but with a changed signature hash
algorithm. Optionally (and likely in practice, I think), it can do various
other proposed improvements, like using Schnorr signatures, having a
smaller signature encoding, supporting batch validation, having extended
sighash flags, ...

It wouldn't fetch A and normalize it; that's impossible as you would need
to go fetch all of A's dependencies too and recurse until you hit the
coinbases that produced them. Instead, your UTXO set contains the
normalized txid for every normal txid (which adds around 26% to the UTXO
set size now), but lookups in it remain only by txid.

You don't need a ntxid->txid mapping, as transactions and blocks keep
referring to transactions by txid. Only the OP_CHECKAWESOMESIG operator
would do the conversion, and at most once.
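
To make that concrete, here is a minimal sketch in Python. It is a toy
model, not a spec: the serialization is made up, the names (TxIn, Tx,
connect, awesome_sighash) are hypothetical, and I assume the ntxid is the
hash of the transaction with scriptSigs stripped and input txids replaced
by the ntxids stored in the UTXO set. Real sighashes also cover flags and
the script being spent, which are left out here.

```python
import hashlib
from dataclasses import dataclass


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass(frozen=True)
class TxIn:
    prev_txid: bytes   # transactions keep referring to each other by txid
    prev_index: int
    script_sig: bytes


@dataclass(frozen=True)
class Tx:
    inputs: tuple      # tuple of TxIn
    outputs: tuple     # tuple of opaque byte strings


def _ser(tx, refs, with_sigs):
    # Toy serialization (NOT the real wire format).
    out = [len(tx.inputs).to_bytes(2, "little")]
    for txin, ref in zip(tx.inputs, refs):
        sig = txin.script_sig if with_sigs else b""
        out += [ref, txin.prev_index.to_bytes(4, "little"),
                len(sig).to_bytes(2, "little"), sig]
    out.append(len(tx.outputs).to_bytes(2, "little"))
    for o in tx.outputs:
        out += [len(o).to_bytes(2, "little"), o]
    return b"".join(out)


def txid(tx):
    return dsha256(_ser(tx, [i.prev_txid for i in tx.inputs], True))


def ntxid(tx, utxo_ntxid):
    # Look up each input by txid and substitute the stored ntxid: one
    # lookup per input, no recursion into ancestors.
    refs = [utxo_ntxid[i.prev_txid] for i in tx.inputs]
    return dsha256(_ser(tx, refs, False))


def awesome_sighash(tx, utxo_ntxid):
    # Same normalization, tagged so it differs from the ntxid itself.
    refs = [utxo_ntxid[i.prev_txid] for i in tx.inputs]
    return dsha256(b"sighash" + _ser(tx, refs, False))


def connect(tx, utxo_ntxid):
    # The UTXO set stays keyed by txid and stores the ntxid alongside.
    tid = txid(tx)
    utxo_ntxid[tid] = ntxid(tx, utxo_ntxid)
    return tid


base = {}
a_id = connect(Tx((), (b"out-a",)), base)         # input-less stand-in for a coinbase
utxo1, utxo2 = dict(base), dict(base)             # two possible histories
b1_id = connect(Tx((TxIn(a_id, 0, b"sig"),), (b"out-b",)), utxo1)
b2_id = connect(Tx((TxIn(a_id, 0, b"sig\x00"),), (b"out-b",)), utxo2)  # malleated
c1 = Tx((TxIn(b1_id, 0, b""),), (b"out-c",))
c2 = Tx((TxIn(b2_id, 0, b""),), (b"out-c",))
same_sighash = awesome_sighash(c1, utxo1) == awesome_sighash(c2, utxo2)
```

In this toy, malleating B changes its txid but not its ntxid, so a
signature made for C stays valid whichever version of B confirms.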

> A client that did not update still would have no clue on how to handle
> these transactions, since it simply does not understand the CHECKSIG
> operator. If such a transaction ends up in a block I cannot even catch up
> with the network since the transaction does not validate for me.
>

As for every softfork, it works by redefining an OP_NOP operator, so old
nodes simply consider these checksigs unconditionally valid. That does mean
you don't want to use them before the consensus rule is forked in
(=enforced by a majority of the hashrate), and that you suffer from the
temporary security reduction that an old full node is unknowingly reduced
to SPV security for these opcodes. However, as a full node wallet, this
problem does not affect you, as your wallet would simply not give out
addresses using the new opcode (and thus, wouldn't receive coins using it),
unless it was upgraded to support it.
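
A rough sketch of the mechanism in Python, with a deliberately tiny script
model. OP_NOP_X is a hypothetical name for whichever spare no-op slot would
be redefined, and new_check stands in for the new signature check; neither
is part of any actual proposal.

```python
from typing import Callable, Sequence

OP_NOP_X = "OP_NOP_X"  # hypothetical name for a spare no-op slot


def run_script(ops: Sequence[str], upgraded: bool,
               new_check: Callable[[], bool]) -> bool:
    """Return True if the script is accepted by this node."""
    for op in ops:
        if op != OP_NOP_X:
            raise ValueError(f"opcode not modelled in this toy: {op}")
        # Old nodes: the opcode does nothing, so the script passes.
        # Upgraded nodes: the opcode may additionally fail the script.
        if upgraded and not new_check():
            return False
    return True


script = [OP_NOP_X]
old_accepts_bad = run_script(script, upgraded=False, new_check=lambda: False)
new_accepts_bad = run_script(script, upgraded=True, new_check=lambda: False)
old_accepts_good = run_script(script, upgraded=False, new_check=lambda: True)
new_accepts_good = run_script(script, upgraded=True, new_check=lambda: True)
```

Upgraded nodes only ever reject more than old nodes do, never less, which
is what makes this a softfork: everything the new rules accept is still
valid under the old rules.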

> Could you provide an example of how this works?
>
>
>>
>>> Compare that to the simple and clean solution in the proposal, which
>>> does not add extra data to be stored, keeps the OP_*SIG* semantics as they
>>> are and where once you sign a transaction it does not have to be monitored
>>> or changed in order to be valid.
>>>
>>
>> OP_*SIG* semantics don't change here either, we're just adding a superior
>> opcode (which in most ways behaves the same as the existing operators). I
>> agree with the advantage of not needing to monitor transactions afterwards
>> for malleated inputs, but I think you underestimate the deployment costs.
>> If you want to upgrade the world (eventually, after the old index is
>> dropped, which is IMHO the only point where this proposal becomes superior
>> to the alternatives) to this, you're changing *every single piece of
>> Bitcoin software on the planet*. This is not just changing some validation
>> rules that are opt-in to use, you're fundamentally changing how
>> transactions refer to each other.
>>
>
> As I mentioned before, this is a really long term strategy, hoping to get
> the cleanest and easiest solution, so that we do not further complicate the
> inner workings of Bitcoin. I don't think that it is completely out of
> question to eventually upgrade to use normalized transactions, after all
> the average lifespan of hardware is a few years tops.
>

Fair enough, I definitely agree the end result is superior in this case.

>> Also, what do blocks commit to? Do you keep using the old transaction ids
>> for this? Because if you don't, any relayer on the network can invalidate a
>> block (and have the receiver mark it as invalid) by changing the txids. You
>> need to somehow commit to the scriptSig data in blocks still so the POW of
>> a block is invalidated by changing a scriptSig.
>>
>
> How could I change the transaction IDs if I am a relayer? The miner
> decides which flavor of IDs it is adding into its merkle tree, the block
> hash locks in the choice. If we saw a transaction having a valid sigScript,
> it does not matter how we reference it in the block.
>

If the merkle tree of a block only commits to a transaction's normalized
hash, that means that the block hash does not change when a scriptSig is
altered. So, anyone on the network can take a random valid block, and
modify one of its scriptSigs, and the block will become invalid _without_
invalidating the block header. This means that nodes on the network will
now classify that block header as having invalid transactions, and reject
it, even when they later see the untampered block. The only way around that
is to stop marking blocks as invalid, and not having that ability anymore
opens significant DoS risks.

So yes, seeing a block with valid scriptSigs is indeed a proof the
transaction was legitimately authored. But the opposite is no longer true,
and we need that. The correct solution is to either keep using the old full
transaction ids in blocks, but ntxids everywhere else, or have some
alternative means to commit to the scriptSigs inside the block (for example
in the coinbase or using one of the more efficient block commitment
proposals), and have that enforced as a consensus rule.
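
A small Python illustration of the problem, under simplifying assumptions:
a transaction is modelled as just a (body, script_sig) pair, the normalized
id hashes only the body, and the helper names are made up for this sketch.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Bitcoin-style merkle root over a non-empty list of 32-byte hashes."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def full_id(tx):
    body, script_sig = tx
    return dsha256(body + script_sig)  # commits to the scriptSig


def norm_id(tx):
    body, _script_sig = tx
    return dsha256(body)               # scriptSig stripped


block = [(b"tx-a", b"sig-a"), (b"tx-b", b"sig-b"), (b"tx-c", b"sig-c")]
# A relayer replaces one scriptSig with garbage.
tampered = [block[0], (b"tx-b", b"garbage"), block[2]]

norm_root_same = (merkle_root([norm_id(t) for t in block])
                  == merkle_root([norm_id(t) for t in tampered]))
full_root_same = (merkle_root([full_id(t) for t in block])
                  == merkle_root([full_id(t) for t in tampered]))
```

With the normalized commitment the tampered block still matches the
header (norm_root_same is True), so an invalid block and the valid one are
indistinguishable by header. With full transaction ids the tampering
changes the root (full_root_same is False), and the proof of work no
longer covers the tampered block.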

-- 
Pieter

--089e011618b2170ece0515fbca51--