From: Suhas Daftuar
Date: Wed, 8 Jun 2022 11:59:03 -0400
To: Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] Package Relay Proposal
References: <20220518003531.GA4402@erisian.com.au> <20220523213416.GA6151@erisian.com.au> <2B3D1901-901C-4000-A2B9-F6857FCE2847@erisian.com.au> <8FFE048D-854F-4D34-85DA-CE523C16EEB0@erisian.com.au> <017501d87079$4c08f9c0$e41aed40$@voskuil.org> <001201d870ac$8d7a06a0$a86e13e0$@voskuil.org>
List-Id: Bitcoin Protocol Discussion
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000632bb205e0f1c9fa"

--000000000000632bb205e0f1c9fa
Content-Type: text/plain; charset="UTF-8"

Hi,

Thanks again for your work on this!

One question I have is about potential bandwidth waste in the case of nodes running with different policy rules.
Here's my understanding of a scenario I think could happen:

1) Transaction A is both low-fee and non-standard to some nodes on the network.
2) Whenever a transaction T that spends A is relayed, new nodes will send INV(PKGINFO1, T) to all package-relay peers.
3) Nodes on the network that have implemented package relay, but do not accept A, will send getdata(PKGINFO1, T) and learn all of T's unconfirmed parents (~32 bytes * number of parents(T)).
4) Such nodes will reject T. But because of transaction malleability, and to avoid being blinded to a transaction unnecessarily, these nodes will likely still send getdata(PKGINFO1, T) to every node that announces T, in case someone has a transaction that includes an alternate set of parent transactions that would pass policy checks.

Is that understanding correct? I think a good design goal would be to not waste bandwidth in non-adversarial situations. In this case, there would be bandwidth waste from downloading duplicate data from all your peers, just because the announcement doesn't commit to the set of parent wtxids that we'd get from the peer (and so we are unable to determine that all our peers would be telling us the same thing, just based on the announcement).

Some ways to mitigate this might be to: (a) include a hash (maybe even just a 20-byte hash -- is that enough security?) of the package wtxids (in some canonical ordering) along with the wtxid of the child in the initial announcement; (b) limit the use of v1 packages to transactions with very few parents (I don't know if this is reasonable for the use cases we have in mind).

Another point I wanted to bring up is about the rules around v1 package validation generally, and the use of a blockhash in transaction relay specifically.
My first observation is that it won't always be the case that a v1 package relay node will be able to validate that a set of package transactions is fully sorted topologically, because there may be (non-parent) ancestors that are missing from the package and the best a peer can validate is topology within the package -- this means that a peer can validly (under this BIP) relay transaction packages out of the true topological sort (if all ancestors were included).

This makes me wonder how useful this topological rule is. I suppose there is some value in preventing completely broken implementations from staying connected and so there is no harm in having the rule, but perhaps it would be helpful to add that nodes SHOULD order transactions based on topological sort in the complete transaction graph, so that if missing-from-package ancestors are already known by a peer (which is the expected case when using v1 package relay on transactions that have more than one generation of unconfirmed ancestor) then the remaining transactions are already properly ordered, and this is helpful even if unenforceable in general.

The other observation I wanted to make was that having transaction relay gated on whether two nodes agree on chain tip seems like an overly restrictive criterion. I think an important design principle is that we want to minimize disruption from network splits -- if there are competing blocks found in a small window of time, it's likely that the utxo set is not materially different on the two chains (assuming miners are selecting from roughly the same sets of transactions when this happens, which is typical). Having transaction relay bifurcate on the two network halves would seem to exacerbate the difference between the two sides of the split -- users ought to be agnostic about how benign splits are resolved and would likely want their transactions to relay across the whole network.
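
As an aside on the topology rule discussed above, here is a rough Python sketch of the gap I have in mind. It is not taken from the BIP: the function names, the string txids, and the toy graph are all made up for illustration. It contrasts the ordering check a receiver can perform using only links between package members with a check against the complete unconfirmed graph. In the toy graph, X is a non-parent ancestor of the child C and so is left out of the package, which lets an ordering pass the in-package check without being a true topological sort.

```python
from typing import Dict, Sequence, Set

# Hypothetical representation: txid -> set of direct unconfirmed parents.
Graph = Dict[str, Set[str]]


def ancestors(txid: str, parents: Graph) -> Set[str]:
    """All unconfirmed ancestors of txid (parents, grandparents, ...)."""
    found: Set[str] = set()
    stack = list(parents.get(txid, ()))
    while stack:
        tx = stack.pop()
        if tx not in found:
            found.add(tx)
            stack.extend(parents.get(tx, ()))
    return found


def sorted_within_package(package: Sequence[str], parents: Graph) -> bool:
    """Check ordering using only direct parent links between package members."""
    members, seen = set(package), set()
    for txid in package:
        if any(p in members and p not in seen for p in parents.get(txid, ())):
            return False
        seen.add(txid)
    return True


def sorted_in_full_graph(package: Sequence[str], parents: Graph) -> bool:
    """Check ordering using every ancestor link, including txs not in the package."""
    members, seen = set(package), set()
    for txid in package:
        if any(a in members and a not in seen for a in ancestors(txid, parents)):
            return False
        seen.add(txid)
    return True


# Toy graph: C spends A and B; B spends X; X spends A.
# X is an ancestor of C but not a parent, so it is not part of the package.
graph: Graph = {"A": set(), "X": {"A"}, "B": {"X"}, "C": {"A", "B"}}

print(sorted_within_package(["B", "A", "C"], graph))  # True
print(sorted_in_full_graph(["B", "A", "C"], graph))   # False
print(sorted_in_full_graph(["A", "B", "C"], graph))   # True
```

A sender following the suggested SHOULD would emit [A, B, C] here, which remains correctly ordered for a receiver that already knows X.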

Additionally, use of a chain tip might impose a larger burden than is necessary on software that would seek to participate in transaction relay without implementing headers sync/validation. I don't know what software exists on the network, but I imagine there are a lot of scripts out there for transaction submission to the public p2p network, and in thinking about modifying such a script to utilize package relay it seems like an unnecessary added burden to first learn a node's tip before trying to relay a transaction.

Could you explain again what the benefit of including the blockhash is? It seems like it is just so that a node could prioritize transaction relay from peers with the same chain tip to maximize the likelihood of transaction acceptance, but in the common case this seems like a pretty negligible concern, and in the case of a chain fork that persists for many minutes it seems better to me that we not partition the network into package-relay regimes and just risk a little extra bandwidth in one direction or the other. If we solve the problem I brought up at the beginning (of de-duplicating package data across peers with a package-wtxid-commitment in the announcement), I think this is just some wasted pkginfo bandwidth on a single-link, and not across links (as we could cache validation failure for a package-hash to avoid re-requesting duplicate pkginfo1 messages).

Best,
Suhas

On Tue, Jun 7, 2022 at 1:57 PM Gloria Zhao via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi Eric, aj, all,
>
> Sorry for the delayed response. @aj I'm including some paraphrased points from our offline discussion (thanks).
>
> > Other idea: what if you encode the parent txs as a short hash of the wtxid (something like bip152 short ids? perhaps seeded per peer so collisions will be different per peer?) and include that in the inv announcement?
> > Would that work to avoid a round trip almost all of the time, while still giving you enough info to save bw by deduping parents?
>
> > As I suggested earlier, a package is fundamentally a compact block (or block) announcement without the header. Compact block (BIP152) announcement is already well-defined and widely implemented...
>
> > Let us not reinvent the wheel and/or introduce accidental complexity. I see no reason why packaging is not simply BIP152 without the 'header' field, an updated protocol version, and the following sort of changes to names
>
> Interestingly, "why not use BIP 152 shortids to save bandwidth?" is by far the most common suggestion I hear (including offline feedback). Here's a full explanation:
>
> BIP 152 shortens transaction hashes (32 bytes) to shortids (6 bytes) to save a significant amount of network bandwidth, which is extremely important in block relay. However, this comes at the expense of computational complexity. There is no way to directly calculate a transaction hash from a shortid; upon receipt of a compact block, a node is expected to calculate the shortids of every unconfirmed transaction it knows about to find the matches (BIP 152: [1], Bitcoin Core: [2]). This is expensive but appropriate for block relay, since the block must have a valid Proof of Work and new blocks only come every ~10 minutes. On the other hand, if we require nodes to calculate shortids for every transaction in their mempools every time they receive a package, we are creating a DoS vector. Unconfirmed transactions don't need PoW and, to have a live transaction relay network, we should expect nodes to handle transactions at a high-ish rate (i.e. at least 1000s of times more transactions than blocks). We can't pre-calculate or cache shortids for mempool transactions, since the SipHash key depends on the block hash and a per-connection salt.
>
> Additionally, shortid calculation is not designed to prevent intentional individual collisions. If we were to use these shortids to deduplicate transactions we've supposedly already seen, we may have a censorship vector. Again, these tradeoffs make sense for compact block relay (see shortid section in BIP 152 [3]), but not package relay.
>
> TLDR: DoSy if we calculate shortids on every package and censorship vector if we use shortids for deduplication.
>
> > Given this message there is no reason to send a (potentially bogus) fee rate with every package. It can only be validated by obtaining the full set of txs, and the only recourse is dropping (etc.) the peer, as is the case with single txs.
>
> Yeah, I agree with this. Combined with the previous discussion with aj (i.e. we can't accurately communicate the incentive compatibility of a package without sending the full graph, and this whole dance is to avoid downloading a few low-fee transactions in uncommon edge cases), I've realized I should remove the fee + weight information from pkginfo. Yay for less complexity!
>
> Also, this might be pedantic, but I said something incorrect earlier and would like to correct myself:
>
> >> In theory, yes, but maybe it was announced earlier (while our node was down?) or had dropped from our mempool or similar, either way we don't have those txs yet.
>
> I said "It's fine if they have Erlay, since a sender would know in advance that B is missing and announce it as a package." But this isn't true since we're only using reconciliation in place of flooding to announce transactions as they arrive, not for rebroadcast, and we're not doing full mempool set reconciliation. In any case, making sure a node receives the transactions announced when it was offline is not something we guarantee, not an intended use case for package relay, and not worsened by this.
>
> Thanks for your feedback!
>
> Best,
> Gloria
>
> [1]: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#cmpctblock
> [2]: https://github.com/bitcoin/bitcoin/blob/master/src/blockencodings.cpp#L49
> [3]: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#short-transaction-id-calculation
>
> On Thu, May 26, 2022 at 3:59 AM <eric@voskuil.org> wrote:
>
>> Given that packages have no header, the package requires identity in a BIP152 scheme. For example 'header' and 'blockhash' fields can be replaced with a Merkle root (e.g. "identity" field) for the package, uniquely identifying the partially-ordered set of txs. And use of 'getdata' (to obtain a package by hash) can be eliminated (not a use case).
>>
>> e
>>
>> > -----Original Message-----
>> > From: eric@voskuil.org <eric@voskuil.org>
>> > Sent: Wednesday, May 25, 2022 1:52 PM
>> > To: 'Anthony Towns' <aj@erisian.com.au>; 'Bitcoin Protocol Discussion' <bitcoin-dev@lists.linuxfoundation.org>; 'Gloria Zhao' <gloriajzhao@gmail.com>
>> > Subject: RE: [bitcoin-dev] Package Relay Proposal
>> >
>> > > From: bitcoin-dev <bitcoin-dev-bounces@lists.linuxfoundation.org> On Behalf Of Anthony Towns via bitcoin-dev
>> > > Sent: Wednesday, May 25, 2022 11:56 AM
>> >
>> > > So the other thing is what happens if the peer announcing packages to us is dishonest?
>> > >
>> > > They announce pkg X, say X has parents A B C and the fee rate is garbage. But actually X has parent D and the fee rate is excellent. Do we request the package from another peer, or every peer, to double check? Otherwise we're allowing the first peer we ask about a package to censor that tx from us?
>> > >
>> > > I think the fix for that is just to provide the fee and weight when announcing the package rather than only being asked for its info? Then if one peer makes it sound like a good deal you ask for the parent txids from them, dedupe, request, and verify they were honest about the parents.
>> >
>> > Single tx broadcasts do not carry an advertised fee rate, however the 'feefilter' message (BIP133) provides this distinction. This should be interpreted as applicable to packages. Given this message there is no reason to send a (potentially bogus) fee rate with every package. It can only be validated by obtaining the full set of txs, and the only recourse is dropping (etc.) the peer, as is the case with single txs. Relying on the existing message is simpler, more consistent, and more efficient.
>> >
>> > > >> Is it plausible to add the graph in?
>> > >
>> > > Likewise, I think you'd have to have the graph info from many nodes if you're going to make decisions based on it and don't want hostile peers to be able to trick you into ignoring txs.
>> > >
>> > > Other idea: what if you encode the parent txs as a short hash of the wtxid (something like bip152 short ids? perhaps seeded per peer so collisions will be different per peer?) and include that in the inv announcement? Would that work to avoid a round trip almost all of the time, while still giving you enough info to save bw by deduping parents?
>> >
>> > As I suggested earlier, a package is fundamentally a compact block (or block) announcement without the header. Compact block (BIP152) announcement is already well-defined and widely implemented. A node should never be required to retain an orphan, and BIP152 ensures this is not required.
>> >
>> > Once a validated set of txs within the package has been obtained with sufficient fee, a fee-optimal node would accept the largest subgraph of the package that conforms to fee constraints and drop any peer that provides a package for which the full graph does not.
>> >
>> > Let us not reinvent the wheel and/or introduce accidental complexity. I see no reason why packaging is not simply BIP152 without the 'header' field, an updated protocol version, and the following sort of changes to names:
>> >
>> > sendpkg
>> > MSG_CMPCT_PKG
>> > cmpctpkg
>> > getpkgtxn
>> > pkgtxn
>> >
>> > > > For a maximum 25 transactions,
>> > > >23*24/2 = 276, seems like 36 bytes for a child-with-parents package.
>> > >
>> > > If you're doing short ids that's maybe 25*4B=100B already, then the above is up to 36% overhead, I guess. Might be worth thinking more about, but maybe more interesting with ancestors than just parents.
>> > >
>> > > >Also side note, since there are no size/count params,
>> >
>> > Size is restricted in the same manner as block and transaction broadcasts, by consensus. If the fee rate is sufficient there would be no reason to preclude any valid size up to what can be mined in one block (packaging across blocks is not economically rational under the assumption that one miner cannot expect to mine multiple blocks in a row). Count is incorporated into BIP152 as 'shortids_length'.
>> >
>> > > > wondering if we
>> > > >should just have "version" in "sendpackages" be a bit field instead of
>> > > >sending a message for each version. 32 versions should be enough right?
>> >
>> > Adding versioning to individual protocols is just a reflection of the insufficiency of the initial protocol versioning design, and that of the various ad-hoc changes to it (including yet another approach in this proposal) that have been introduced to compensate for it, though I'll address this in an independent post at some point.
>> >
>> > Best,
>> > e
>> >
>> > > Maybe but a couple of messages per connection doesn't really seem worth arguing about?
>> > >
>> > > Cheers,
>> > > aj
>> > >
>> > > --
>> > > Sent from my phone.
>> > > _______________________________________________
>> > > bitcoin-dev mailing list
>> > > bitcoin-dev@lists.linuxfoundation.org
>> > > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

--000000000000632bb205e0f1c9fa--