summaryrefslogtreecommitdiff
path: root/20/f245a12cddf99045922e076b7f21c782970718
blob: 782856458c5182b79157e8affac3ca3c89205eeb (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
Delivery-date: Thu, 05 Dec 2024 13:35:57 -0800
Received: from mail-yw1-f189.google.com ([209.85.128.189])
	by mail.fairlystable.org with esmtps  (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
	(Exim 4.94.2)
	(envelope-from <bitcoindev+bncBC3PT7FYWAMRBMFZZC5AMGQECVKR4SA@googlegroups.com>)
	id 1tJJVp-0005EP-UY
	for bitcoindev@gnusha.org; Thu, 05 Dec 2024 13:35:57 -0800
Received: by mail-yw1-f189.google.com with SMTP id 00721157ae682-6ef7895405fsf15079087b3.1
        for <bitcoindev@gnusha.org>; Thu, 05 Dec 2024 13:35:53 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=googlegroups.com; s=20230601; t=1733434548; x=1734039348; darn=gnusha.org;
        h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
         :list-id:mailing-list:precedence:x-original-sender:mime-version
         :subject:references:in-reply-to:message-id:to:from:date:sender:from
         :to:cc:subject:date:message-id:reply-to;
        bh=tTNB1GXkbB8JzhyBHVlEmZYRKwqiyL+AuJEQxcXifkM=;
        b=iIDZCoE37GyinCzBR7XDkwvGXpTkRXWQ9uecFpcbqXNr85b7I+SCXRYDzJsmIuy1fi
         wd1klEgWObu+V7J+2pyXpjm0rot8IzChAJkj0ck82uUQSAdLV2h8NbOSjhXIAq/ihDcf
         U0iHMLiodBeG4fnOKJEGo0wnoqY3vN69ApKGejuA6a/BQX8ZrPxvuX3CHn9l4OMakP1m
         f/UnWE/18EVERTAsSX7jw7iVoWch3iHAfRQ45vsGIrCsxdhQVEXRIjmVcmo/s3OVGPDt
         u1Uhx829tMs9IBRF7noPBmxyqidyIujZGpKOSHrPYGIM6fw3+iuKcoTmO5EfQsVb1lu3
         x5YA==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20230601; t=1733434548; x=1734039348; darn=gnusha.org;
        h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
         :list-id:mailing-list:precedence:x-original-sender:mime-version
         :subject:references:in-reply-to:message-id:to:from:date:from:to:cc
         :subject:date:message-id:reply-to;
        bh=tTNB1GXkbB8JzhyBHVlEmZYRKwqiyL+AuJEQxcXifkM=;
        b=l5sYGW0X/EgI6BWUg/olS9x1ie3AJSl0c5RZeuTt1Q5v0woq66G0CcKxJwxfteXsJS
         PYgoBdetZdVYxTedlVj3Z151aFM6ZW2HpWc3A/pmTcQ6lH1KH0LX4/syf2l7Z4Icdlc+
         U1V8DMoSvj6jnab0ZQmPMInxOftQp5+UT/YIOzvBtG85F6DYpobA9t4NCsKddsyAjkfI
         2EhpDPbUt7MfC0LBDu0bMqOZPjkHYap3YaqeEHaMPqjAPC8Tl3WPEqW7iBOP9Vmh/WBy
         Thn66QZEyU5mtbEzig+9owWbqJbsc3rv9XCx/HQU9so8blXn9TTwRIO9nkZr+C+Tq6HP
         xuyw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1733434548; x=1734039348;
        h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
         :list-id:mailing-list:precedence:x-original-sender:mime-version
         :subject:references:in-reply-to:message-id:to:from:date:x-beenthere
         :x-gm-message-state:sender:from:to:cc:subject:date:message-id
         :reply-to;
        bh=tTNB1GXkbB8JzhyBHVlEmZYRKwqiyL+AuJEQxcXifkM=;
        b=tcm8YLvnr6WRtvpJYLu7wmw8lVik5O4MIlTTI8ngxQb0szod1krtLdsPuj142Vf1+X
         zPJqEuaGodHBMLJ2Q2kfs7KPmd0uw+4SYr+fwbre86eQdz4cnBqoExCFbSkTu9Iq7RYA
         qWiFiZZsX1XIGcGMc+I3YvyBKzluisNgJnJ0QUfm0m2tF92aCCk7yUKjfdFJtGZrt7eY
         Yl0HzOv4Q3HE6d0rM/gZzTl8ViJf/ST0rj3plRBJIxlhdT7mcFVjLymyLVaYwDQ5yrKR
         BAri/IJtcOUcedsXHfhwIuBFFu9cfyOi5obIt6uSY/s55aYxFRdeJ15MfSBwWqj/VIxF
         isEQ==
Sender: bitcoindev@googlegroups.com
X-Forwarded-Encrypted: i=1; AJvYcCXlhTCx+27LenXWojizLcxKbT4tArelUjh/tFnhKZWETWw+MFab/zW4OMTZQacuBl7r3D5ANlR0vjtT@gnusha.org
X-Gm-Message-State: AOJu0YyypnH8ur32NEtaeAl4Od4EVuLlJoJEfqF0E4lPLCuAQZ3ttEu8
	0PdvhqsLNM4sNqIhnPdidvzpkQPcT2Qk/+x2TQ5rprDKVAJ3IyJq
X-Google-Smtp-Source: AGHT+IG2i+WmH8ZcwlfCvHmT5YdEjh/2OsPTAt3wKoiZ09kgeQL6mjOpOplU6F3TuX734g76exvjZg==
X-Received: by 2002:a05:6902:1207:b0:e39:9a8f:5220 with SMTP id 3f1490d57ef6-e3a0b082013mr760796276.11.1733434547398;
        Thu, 05 Dec 2024 13:35:47 -0800 (PST)
X-BeenThere: bitcoindev@googlegroups.com
Received: by 2002:a25:d30e:0:b0:e35:df28:2ec4 with SMTP id 3f1490d57ef6-e39f21ff2a8ls712535276.2.-pod-prod-09-us;
 Thu, 05 Dec 2024 13:35:44 -0800 (PST)
X-Received: by 2002:a05:690c:6a10:b0:6dc:7877:1ea3 with SMTP id 00721157ae682-6efe3bfb0dbmr10902007b3.17.1733434544437;
        Thu, 05 Dec 2024 13:35:44 -0800 (PST)
Received: by 2002:a05:690c:2c10:b0:6e2:1e5e:a1e1 with SMTP id 00721157ae682-6ef36930774ms7b3;
        Wed, 27 Nov 2024 21:18:38 -0800 (PST)
X-Received: by 2002:a05:690c:6ac5:b0:6ee:61ea:a429 with SMTP id 00721157ae682-6ef372893ffmr61645887b3.36.1732771117364;
        Wed, 27 Nov 2024 21:18:37 -0800 (PST)
Date: Wed, 27 Nov 2024 21:18:37 -0800 (PST)
From: Antoine Riard <antoine.riard@gmail.com>
To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
Message-Id: <78e8248d-bc77-452f-ac7e-19c28cbc3280n@googlegroups.com>
In-Reply-To: <926fdd12-4e50-433d-bd62-9cc41c7b22a0n@googlegroups.com>
References: <gnM89sIQ7MhDgI62JciQEGy63DassEv7YZAMhj0IEuIo0EdnafykF6RH4OqjTTHIHsIoZvC2MnTUzJI7EfET4o-UQoD-XAQRDcct994VarE=@protonmail.com>
 <72e83c31-408f-4c13-bff5-bf0789302e23n@googlegroups.com>
 <heKH68GFJr4Zuf6lBozPJrb-StyBJPMNvmZL0xvKFBnBGVA3fVSgTLdWc-_8igYWX8z3zCGvzflH-CsRv0QCJQcfwizNyYXlBJa_Kteb2zg=@protonmail.com>
 <5b0331a5-4e94-465d-a51d-02166e2c1937n@googlegroups.com>
 <yt1O1F7NiVj-WkmnYeta1fSqCYNFx8h6OiJaTBmwhmJ2MWAZkmmjPlUST6FM7t6_-2NwWKdglWh77vcnEKA8swiAnQCZJY2SSCAh4DOKt2I=@protonmail.com>
 <be78e733-6e9f-4f4e-8dc2-67b79ddbf677n@googlegroups.com>
 <jJLDrYTXvTgoslhl1n7Fk9-pL1mMC-0k6gtoniQINmioJpzgtqrJ_WqyFZkLltsCUusnQ4jZ6HbvRC-mGuaUlDi3kcqcFHALd10-JQl-FMY=@protonmail.com>
 <9a4c4151-36ed-425a-a535-aa2837919a04n@googlegroups.com>
 <3f0064f9-54bd-46a7-9d9a-c54b99aca7b2n@googlegroups.com>
 <26b7321b-cc64-44b9-bc95-a4d8feb701e5n@googlegroups.com>
 <CALZpt+EwVyaz1=A6hOOycqFGJs+zxyYYocZixTJgVmzZezUs9Q@mail.gmail.com>
 <607a2233-ac12-4a80-ae4a-08341b3549b3n@googlegroups.com>
 <3dceca4d-03a8-44f3-be64-396702247fadn@googlegroups.com>
 <301c64c7-0f0f-476a-90c4-913659477276n@googlegroups.com>
 <e2c61ee5-68c4-461e-a132-bb86a4c3e2ccn@googlegroups.com>
 <33dfd007-ac28-44a5-acee-cec4b381e854n@googlegroups.com>
 <CALZpt+Fs1U5f3S6_tR7AFfEMEkgBPSp3OaNEq+eqYoCSSYXD7g@mail.gmail.com>
 <a76b8dc5-d37f-4059-882b-207004874887n@googlegroups.com>
 <ac6cc3b8-43e5-4cd6-aabe-f5ffc4672812n@googlegroups.com>
 <926fdd12-4e50-433d-bd62-9cc41c7b22a0n@googlegroups.com>
Subject: Re: [bitcoindev] Re: Great Consensus Cleanup Revival
MIME-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_150666_732945913.1732771117097"
X-Original-Sender: antoine.riard@gmail.com
Precedence: list
Mailing-list: list bitcoindev@googlegroups.com; contact bitcoindev+owners@googlegroups.com
List-ID: <bitcoindev.googlegroups.com>
X-Google-Group-Id: 786775582512
List-Post: <https://groups.google.com/group/bitcoindev/post>, <mailto:bitcoindev@googlegroups.com>
List-Help: <https://groups.google.com/support/>, <mailto:bitcoindev+help@googlegroups.com>
List-Archive: <https://groups.google.com/group/bitcoindev
List-Subscribe: <https://groups.google.com/group/bitcoindev/subscribe>, <mailto:bitcoindev+subscribe@googlegroups.com>
List-Unsubscribe: <mailto:googlegroups-manage+786775582512+unsubscribe@googlegroups.com>,
 <https://groups.google.com/group/bitcoindev/subscribe>
X-Spam-Score: -0.5 (/)

------=_Part_150666_732945913.1732771117097
Content-Type: multipart/alternative; 
	boundary="----=_Part_150667_1323905878.1732771117097"

------=_Part_150667_1323905878.1732771117097
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Eric,

Going back to this thread with a bit of delay...

tl;dr: See specifically comment on the lack of proof that invalidating=20
64-byte transactions are actually solving merkle root weaknesses that could=
=20
lead to a fork or unravel SPV clients.

> I'm not sure what you mean by stating that a new consensus rule, "could=
=20
be a low memory overhead". Checking all tx sizes is far more overhead than=
=20
validating the coinbase for a null point. As AntoineP agreed, it cannot be=
=20
done earlier, and I have shown that it is *significantly* more=20
computationally intensive. It makes the determination much more costly and=
=20
in all other cases by adding an additional check that serves no purpose.

I think for any (new) consensus rule, we shall be able to evalute its=20
implication in term of at least 2 dimensions a) memory overhead (e.g do a=
=20
full-node needs more memory to validate post-segwit blocks now that witness=
=20
fields are discounted ?) and b) computational overhead (e.g do a full-node=
=20
needs more CPU cycles to validate confidential transactions perdersen=20
commitments ?). A same consensus rule can achieve the same effect e.g=20
reducing headers merkle tree ambiguities, with completely different memory=
=20
or computational cost. For the checking all tx sizes vs validating the=20
coinbase for a null point, indeed I agree with you that the latter is=20
intuitively better on the 2 dimensions.

> I think you misunderstood me. Of course the witness commitment must be=20
validated (as I said, "Yet it remains necessary to validate the witness=20
commitment..."), as otherwise the witnesses within a block can be anything=
=20
without affecting the block hash. And of course the witness commitment is=
=20
computed in the same manner as the tx commitment and is therefore subject=
=20
to the same malleations. However, because the coinbase tx is committed to=
=20
the block hash, there is no need to guard the witness commitment for=20
malleation. And to my knowledge nobody has proposed doing so.

Yes, we misunderstood each other here.

> It cannot, that was my point: "(1) have distinct identity due to another=
=20
header property deviation, or (2) are the same block..."

Ok.

> This was already the presumption.

Ok.

> I'm not seeing the connection here. Are you suggesting that tx and block=
=20
hashes may collide with each other? Or that that a block message may be=20
confused with a transaction message?

This was about how to deal with types of invalid block messages in bitcoin=
=20
core that could sources of denial-of-service. E.g invalid bitcoin block=20
hash for a message with unparsable data (#1 and #3 in your typology.
My point was bitcoin core is making some assumption at block download to=20
fetch them from outbound peers rather than inbound. Outboud peers are=20
assumed to be more reliable, as the connection is attempted from the=20
outside.
The point in my point was that outbound-block-relay connections where=20
initially introduce to alleviate those types of concerns i.e tx probe to=20
infer the topology, see https://arxiv.org/pdf/1812.00942

> This does not mitigate the issue. It's essentially dead code. It's=20
exactly like saying, "there's an arbitrary number of holes in the bucket,=
=20
but we can plug a subset of those holes." Infinite minus any number is=20
still infinite.

I disagree with you here - If the fundamental problem is efficiently=20
caching identity in the case of block invalidity, one cannot ignore a=20
robust peeering policy, i.e you pick up peers allowed a scarce connection=
=20
slot.
This is indeed useless if you don't have first an efficient verification=20
algorithm to determine the block invalidity, though it's part of the=20
overall equation.
While infinite minus any number is of course still infinite, thinking=20
security by layers it's the base you can have a secure signature=20
verification algorithm still on a hot computing host.

> I don't follow this statement. The term "usable" was specifically=20
addressing the proposal - that a header hash must uniquely identify a block=
=20
(a header and committed set of txs) as valid or otherwise. As I have=20
pointed out, this will still not be the case if 64 byte blocks are=20
invalidated. It is also not the case that detection of type64 malleated=20
blocks can be made more performant if 64 byte txs are globally invalid. In=
=20
fact the opposite is true, it becomes more costly (and complex) and is=20
therefore just dead code.

Okay, in my statement, the term "usable" was to be understood as any=20
meaningful bit of information that can lead to=20
computationally-hard-to-forge progress in the determination problem you=20
laid out here:
https://groups.google.com/g/bitcoindev/c/CAfm7D5ppjo/m/T1-HKqSLAAAJ

> Headers first only defers malleation checks. The same checks are=20
necessary whether you perform blocks first or headers first sync (we=20
support both protocol levels). The only difference is that for headers=20
first, a stored header might later become invalidated. However, this is the=
=20
case with and without the possibility of malleation.

Yes, I agree with you here, a stored header might become invalidated, e.g a=
=20
reorg-out tx committed in the header's merkle tree after the header=20
reception.

> I have not suggested that anything is waived or ignored here. I'm stating=
=20
that there is no "mempool" performance benefit whatsoever to invalidating=
=20
64 byte txs. Mempool caching could only rely on tx identifiers, not block=
=20
identifiers. Tx identifiers are not at issue.

Once again, if the goal is an efficient algorithm making progress to=20
determinate a block invalidity, and as such reducing the denial-of-service=
=20
surface, caching signatures which are committed in the wtixd tree or in the=
=20
txid tree is a plus.
Though yes, I agree there is no "mempool" performance benefit to invalidate=
=20
the 64 byte tx.

> I don't know how to add any more detail than I already have. There are=20
three relevant considerations:
>=20
> (1) block hashes will not become unique identifiers for block messages.
> (2) the earliest point at which type64 malleation can be detected will=20
not be reduced.
> (3) the necessary cost of type64 malleated determination will not be=20
reduced.
> (4) the additional consensus rule will increase validation cost and code=
=20
complexity.
> (5) invalid blocks can still be produced at no cost that require full=20
double tx hashing/Merkle root computations.
>=20
> Which of these statements are not evident at this point?

That's five statements, not three. Minding implementation-dependent=20
considerations, I'm leaning to agree with up to (4) included.
About (5), I don't see how it makes sense that invalid blocks can be still=
=20
produced at not cost, at least pow should be the first thing first to be=20
verified.
Like this statement could be clarified what you mean by this.

> No, no part of this thread has any bearing on p2p transaction messages -=
=20
nor are coinbase transactions relayed as transaction messages. You could=20
restate it as:
>=20
> - receive block p2p messages
> - if the first tx's first input does not have a null point, reject the=20
block

I don't believe we can fully dissociate bearing on p2p blocks /=20
transactions messages, from the overall goal of reducing denial-of-service=
=20
arising from invalid blocks. How can you be sure the block is invalid until=
=20
you validate all txn ? Though let's waiwe this observation for the present=
=20
point.

The idea of exploiting block malleability is to grind one transaction T0=20
for a block B such that H(T0) =3D=3D H(H(T1) || H(T2)) =3D=3D B's Root. I.e=
 to have=20
T0 =3D=3D H(T1) || H(T2). T0 can be consensus valid or invalid to provoke a=
=20
consensus fork (it's the collision in the deserialization which is the=20
source of the merkle tree root weakness). The first transaction in the=20
block is necessarily the coinbase per current consensus rules. Checking=20
that T0 is a valid coinbase transaction is sufficient to reject the block.=
=20
Grinding 64-byte transactions that all deserialize as valid transactions,=
=20
including the null point requirement is computationally infeasible.

I'm not sure that even if we get ride of 64-byte transactions, we would=20
remove the merkle root weaknesses. Back to the previous example, one could=
=20
find T3 and T4 such that H(H(T3) || H(T4)) is equivalent to H(H(T1) ||=20
H(T2)). Of course, it would consist in breaking SHA256, which is deemed=20
computationally infeasible. I'm not even sure the header verifcation=20
algorithms gains second-preimage resistance from forbidding 64-byte=20
transaction.

So I think more that the minimal sufficient check to reject a block should=
=20
be more carefully analyzed, rather than advocating that forbidding some=20
magic value obviously fix an issue, in the present the bitcoin's merkle=20
root weaknesses.

> The above approach makes this malleation computationally infeasible.

I'm intuitively leaning so, though see comments above that it should be=20
more carefully thought about.

> It has nothing to do with internal cache layout and nothing to do with=20
mining resources. Not having a cache is clearly more efficient than having=
=20
a cache that provides no advantage, regardless of how the cache is laid=20
out. There is no cost to forcing a node to perform far more block=20
validation computations than can be precluded by invalidity caching. The=20
caching simply increases the overall computational cost (as would another=
=20
redundant rule to try and make it more efficient). Discarding invalid=20
blocks after the minimal amount of work is the most efficient resolution.=
=20
What one does with the peer at that point is orthogonal (e.g. drop, ban).

I disagree here - If the goal is an efficient algorithm making progress to=
=20
determinate a block invalidity, and then being able to re-use a run of this=
=20
algorithm when a blocks occurs again, having a cache widens the range of=20
algorithmsone can design. Same with the mining ressources, if it's to=20
consider denial-of-services and an attacker could be able to fully forge=20
blocks. If such invalidity caching strategy was efficient it would actually=
=20
minimize or erase the cost for a node to perform more block validation=20
computations. Where yes I share you opinion is that an ill-designed caching=
=20
could increase the overall computational cost, and that discarding invalid=
=20
blocks after the minimal amount of work is the most efficient resolution=20
for the 1st seen, though it doesn't say on the next N seen. Having the=20
signatures already validated could be obviously a win, even with a blind,=
=20
decaying cache, it's all about the memory space of an
average full-node.

> An attacker can throw a nearly infinite number of distinct invalid blocks=
=20
at your node (and all will connect to the chain and show proper PoW). As=20
such you will encounter zero cache hits and therefore nothing but overhead=
=20
from the cache. Please explain to me in detail how "cache layout" is going=
=20
to make any difference at all.

Going back to your typology from (1) to (9), e.g for the step 9 to=20
determine if a block message with valid header but unmalleated committed=20
valid tx data.
If you have seen the block message a first-time, though it wasn't yet on=20
the longest pow-chain and you disregarded its validation.

> I don't see this as a related/relevant topic. There are zero mining=20
resources required to overflow the invalidity cache. Just as Core recently=
=20
published regarding overflowing to its "ban" store, resulting in process=20
termination, this then introduces another attack vector that must be=20
mitigated.

Depends if your invalidity cache is safeguarded by a minimal valid=20
proof-of-work. I'm certaintly not going to defend that all bitcoin core=20
internal caches and stores are well-designed for adversarials environments.

> pseudo-code , not from libbitcoin...
>=20
> ```
> bool malleated64(block)
> {
>     segregated =3D ((block[80 + 4] =3D=3D 0) and (block[80 + 4 + 1] =3D=
=3D 1))
>     return block[segregated ? 86 : 85] !=3D=20
0xffffffff0000000000000000000000000000000000000000000000000000000000000000
> }
> ```
>=20
> Obviously there is no error handling (e.g. block too small, too many=20
inputs, etc.) but that is not relevant to the particular question. The=20
block.header is fixed size, always 80 bytes. The tx.version is also fixed,=
=20
always 4 bytes. A following 0 implies a segregated witness (otherwise it's=
=20
the input count), assuming there is a following 1. The first and only input=
=20
for the coinbase tx, which must be the first block tx, follows. If it does=
=20
not match=20
0xffffffff0000000000000000000000000000000000000000000000000000000000000000=
=20
then the block is invalid. If it does match, it is computationally=20
infeasible that the merkle root is type64 malleated. That's it, absolutely=
=20
trivial and with no prerequisites. The only thing that even makes it=20
interesting is the segwit bifurcation.

Thanks for the example with the segwit bifurcation for the marker. By the=
=20
way, the segwit marker is documented in BIP144, which is incorrectly=20
labeled as "Peer Services", though obviously misimplementing the=20
transaction ser / deser algorithm for segwit blocks would lead to consensus=
=20
divergence (what if you expect the "flag" to be 0xff and not 0x01 ?).=20
Personally, I think it's a good example of how tedious consensus changes=20
can be, when even documents for inter-compatibility about consensus changes=
=20
are not drawing a clear line between what is consensus and what are p2p=20
rules...

> Sure, but no language difference that I'm aware of could have any bearing=
=20
on this particular question.

Same, I don't see language difference that could have bearing on this=20
question, at that level of granularity.

Best,
Antoine R
ots hash: 3d5ed1718683ce1e864751a2eccf21908ed3b11079f183cdf863729d71ae3f36
Le samedi 20 juillet 2024 =C3=A0 21:51:27 UTC+1, Eric Voskuil a =C3=A9crit =
:

> Hi Antoine R,
>
> >> While at some level the block message buffer would generally be=20
> referenced by one or more C pointers, the difference between a valid=20
> coinbase input (i.e. with a "null point") and any other input, is not=20
> nullptr vs. !nullptr. A "null point" is a 36 byte value, 32 0x00 byes=20
> followed by 4 0xff bytes. In his infinite wisdom Satoshi decided it was=
=20
> better (or easier) to serialize a first block tx (coinbase) with an input=
=20
> containing an unusable script and pointing to an invalid [tx:index] tuple=
=20
> (input point) as opposed to just not having any input. That invalid input=
=20
> point is called a "null point", and of course cannot be pointed to by a=
=20
> "null pointer". The coinbase must be identified by comparing those 36 byt=
es=20
> to the well-known null point value (and if this does not match the Merkle=
=20
> hash cannot have been type64 malleated).
>
> > Good for the clarification here, I had in mind the core's `CheckBlock`=
=20
> path where the first block transaction pointer is dereferenced to verify =
if=20
> the transaction is a coinbase (i.e a "null point" where the prevout is=20
> null). Zooming out and back to my remark, I think this is correct that=20
> adding a new 64 byte size check on all block transactions to detect block=
=20
> hash invalidity could be a low memory overhead (implementation dependant)=
,=20
> rather than making that 64 byte check alone on the coinbase transaction a=
s=20
> in my understanding you're proposing.
>
> I'm not sure what you mean by stating that a new consensus rule, "could b=
e=20
> a low memory overhead". Checking all tx sizes is far more overhead than=
=20
> validating the coinbase for a null point. As AntoineP agreed, it cannot b=
e=20
> done earlier, and I have shown that it is *significantly* more=20
> computationally intensive. It makes the determination much more costly an=
d=20
> in all other cases by adding an additional check that serves no purpose.
>
> >>> The second one is the bip141 wtxid commitment in one of the coinbase=
=20
> transaction `scriptpubkey` output, which is itself covered by a txid in t=
he=20
> merkle tree.
>
> >> While symmetry seems to imply that the witness commitment would be=20
> malleable, just as the txs commitment, this is not the case. If the tx=20
> commitment is correct it is computationally infeasible for the witness=20
> commitment to be malleated, as the witness commitment incorporates each=
=20
> full tx (with witness, sentinel, and marker). As such the block identifie=
r,=20
> which relies only on the header and tx commitment, is a sufficient=20
> identifier. Yet it remains necessary to validate the witness commitment t=
o=20
> ensure that the correct witness data has been provided in the block messa=
ge.
> >>
> >> The second type of malleability, in addition to type64, is what we cal=
l=20
> type32. This is the consequence of duplicated trailing sets of txs (and=
=20
> therefore tx hashes) in a block message. This is applicable to some but n=
ot=20
> all blocks, as a function of the number of txs contained.
>
> > To precise more your statement in describing source of malleability. Th=
e=20
> witness stack can be malleated altering the wtxid and yet still valid. I=
=20
> think you can still have the case where you're feeded a block header with=
 a=20
> merkle root commitment deserializing to a valid coinbase transaction with=
=20
> an invalid witness commitment. This is the case of a "block message with=
=20
> valid header but malleatead committed valid tx data". Validation of the=
=20
> witness commitment to ensure the correct witness data has been provided i=
n=20
> the block message is indeed necessary.
>
> I think you misunderstood me. Of course the witness commitment must be=20
> validated (as I said, "Yet it remains necessary to validate the witness=
=20
> commitment..."), as otherwise the witnesses within a block can be anythin=
g=20
> without affecting the block hash. And of course the witness commitment is=
=20
> computed in the same manner as the tx commitment and is therefore subject=
=20
> to the same malleations. However, because the coinbase tx is committed to=
=20
> the block hash, there is no need to guard the witness commitment for=20
> malleation. And to my knowledge nobody has proposed doing so.
>
> >>> I think I mostly agree with the identity issue as laid out so far,=20
> there is one caveat to add if you're considering identity caching as the=
=20
> problem solved. A validation node might have to consider differently bloc=
k=20
> messages processed if they connect on the longest most PoW valid chain fo=
r=20
> which all blocks have been validated. Or alternatively if they have to be=
=20
> added on a candidate longest most PoW valid chain.
>
> >> Certainly an important consideration. We store both types. Once there=
=20
> is a stronger candidate header chain we store the headers and proceed to=
=20
> obtaining the blocks (if we don't already have them). The blocks are stor=
ed=20
> in the same table; the confirmed vs. candidate indexes simply point to th=
em=20
> as applicable. It is feasible (and has happened twice) for two blocks to=
=20
> share the very same coinbase tx, even with either/all bip30/34/90 active=
=20
> (and setting aside future issues here for the sake of simplicity). This=
=20
> remains only because two competing branches can have blocks at the same=
=20
> height, and bip34 requires only height in the coinbase input script. This=
=20
> therefore implies the same transaction but distinct blocks. It is however=
=20
> infeasible for one block to exist in multiple distinct chains. In order f=
or=20
> this to happen two blocks at the same height must have the same coinbase=
=20
> (ok), and also the same parent (ok). But this then means that they either=
=20
> (1) have distinct identity due to another header property deviation, or (=
2)=20
> are the same block with the same parent and are therefore in just one=20
> chain. So I don't see an actual caveat. I'm not certain if this is the=20
> ambiguity that you were referring to. If not please feel free to clarify.
>
> > If you assume no network partition and the no blocks more than 2h in th=
e=20
> future consensus rule, I cannot see how one block with no header property=
=20
> deviation can exist in multiple distinct chains.
>
> It cannot, that was my point: "(1) have distinct identity due to another=
=20
> header property deviation, or (2) are the same block..."
>
> > The ambiguity I was referring was about a different angle, if the desig=
n=20
> goal of introducing a 64 byte size check is to "it was about being able t=
o=20
> cache the hash of a (non-malleated) invalid block as permanently invalid =
to=20
> avoid re-downloading and re-validating it", in my thinking we shall=20
> consider the whole block headers caching strategy and be sure we don't ge=
t=20
> situations where an attacker can attach a chain of low-pow block headers=
=20
> with malleated committed valid tx data yielding a block invalidity at the=
=20
> end, provoking as a side-effect a network-wide data download blowup. So I=
=20
> think any implementation of the validation of a block validity, of which=
=20
> identity is a sub-problem, should be strictly ordered by adequate=20
> proof-of-work checks.
>
> This was already the presumption.
>
> >> We don't do this and I don't see how it would be relevant. If a peer=
=20
> provides any invalid message or otherwise violates the protocol it is=20
> simply dropped.
> >>
> >> The "problematic" that I'm referring to is the reliance on the block=
=20
> hash as a message identifier, because it does not identify the message an=
d=20
> cannot be useful in an effectively unlimited number of zero-cost cases.
>
> > Historically, it was to isolate transaction-relay from block-relay to=
=20
> optimistically harden in face of network partition, as this is easy to=20
> infer transaction-relay topology with a lot of heuristics.
>
> I'm not seeing the connection here. Are you suggesting that tx and block=
=20
> hashes may collide with each other? Or that that a block message may be=
=20
> confused with a transaction message?
>
> > I think this is correct that block hash message cannot be relied on as=
=20
> it cannot be useful in an unlimited number of zero-cost cases, as I was=
=20
> pointing that bitcoin core partially mitigate that with discouraging=20
> connections to block-relay peers servicing block messages=20
> (`MaybePunishNodeForBlocks`).
>
> This does not mitigate the issue. It's essentially dead code. It's exactl=
y=20
> like saying, "there's an arbitrary number of holes in the bucket, but we=
=20
> can plug a subset of those holes." Infinite minus any number is still=20
> infinite.
>
> > I believe somehow the bottleneck we're circling around is=20
> computationally definining what are the "usable" identifiers for block=20
> messages. The most straightforward answer to this question is the full=20
> block in one single peer message, at least in my perspective.
>
> I don't follow this statement. The term "usable" was specifically=20
> addressing the proposal - that a header hash must uniquely identify a blo=
ck=20
> (a header and committed set of txs) as valid or otherwise. As I have=20
> pointed out, this will still not be the case if 64 byte blocks are=20
> invalidated. It is also not the case that detection of type64 malleated=
=20
> blocks can be made more performant if 64 byte txs are globally invalid. I=
n=20
> fact the opposite is true, it becomes more costly (and complex) and is=20
> therefore just dead code.
>
> > Reality since headers first synchronization (`getheaders`), block=20
> validation has been dissociated in steps for performance reasons, among=
=20
> others.
>
> Headers first only defers malleation checks. The same checks are necessar=
y=20
> whether you perform blocks first or headers first sync (we support both=
=20
> protocol levels). The only difference is that for headers first, a stored=
=20
> header might later become invalidated. However, this is the case with and=
=20
> without the possibility of malleation.
>
> >> Again, this has no relation to tx hashes/identifiers. Libbitcoin has a=
=20
> tx pool, we just don't store them in RAM (memory).
> >>
> >> I don't follow this. An invalid 64 byte tx consensus rule would=20
> definitely not make it harder to exploit block message invalidity. In fac=
t=20
> it would just slow down validation by adding a redundant rule. Furthermor=
e,=20
> as I have detailed in a previous message, caching invalidity does=20
> absolutely nothing to increase protection. In fact it makes the situation=
=20
> materially worse.
>
> > Just to recall, in my understanding the proposal we're discussing is=20
> about outlawing 64 bytes size transactions at the consensus-level to=20
> minimize denial-of-service vectors during block validation. I think we're=
=20
> talking about each other because the mempool already introduce a layer of=
=20
> caching in bitcoin core, of which the result are re-used at block=20
> validation, such as signature verification results. I'm not sure we can=
=20
> fully waive apart performance considerations, though I agree implementati=
on=20
> architecture subsystems like mempool should only be a sideline=20
> considerations.
>
> I have not suggested that anything is waived or ignored here. I'm stating=
=20
> that there is no "mempool" performance benefit whatsoever to invalidating=
=20
> 64 byte txs. Mempool caching could only rely on tx identifiers, not block=
=20
> identifiers. Tx identifiers are not at issue.
>
> >> No, this is not the case. As I detailed in my previous message, there=
=20
> is no possible scenario where invalidation caching does anything but make=
=20
> the situation materially worse.
>
> > I think this can be correct that invalidation caching make the situatio=
n=20
> materially worse, or is denial-of-service neutral, as I believe a full no=
de=20
> is only trading space for time resources in matters of block messages=20
> validation. I still believe such analysis, as detailed in your previous=
=20
> message, would benefit to be more detailed.
>
> I don't know how to add any more detail than I already have. There are=20
> three relevant considerations:
>
> (1) block hashes will not become unique identifiers for block messages.
> (2) the earliest point at which type64 malleation can be detected will no=
t=20
> be reduced.
> (3) the necessary cost of type64 malleated determination will not be=20
> reduced.
> (4) the additional consensus rule will increase validation cost and code=
=20
> complexity.
> (5) invalid blocks can still be produced at no cost that require full=20
> double tx hashing/Merkle root computations.
>
> Which of these statements are not evident at this point?
>
> >> On the other hand, just dealing with parse failure on the spot by=20
> introducing a leading pattern in the stream just inflates the size of p2p=
=20
> messages, and the transaction-relay bandwidth cost.
> >>
> >> I think you misunderstood me. I am suggesting no change to=20
> serialization. I can see how it might be unclear, but I said, "nothing=20
> precludes incorporating a requirement for a necessary leading pattern in=
=20
> the stream." I meant that the parser can simply incorporate the=20
> *requirement* that the byte stream starts with a null input point. That=
=20
> identifies the malleation or invalidity without a single hash operation a=
nd=20
> while only reading a handful of bytes. No change to any messages.
>
> > Indeed, this is clearer with the re-explanation above about what you=20
> meant by the "null point".
>
> Ok
>
> > In my understanding, you're suggesting the following algorithm:
> > - receive transaction p2p messages
> > - deserialize transaction p2p messages
> > - if the transaction is a coinbase candidate, verify null input point
> > - if null input point pattern invalid, reject the transaction
>
> No, no part of this thread has any bearing on p2p transaction messages -=
=20
> nor are coinbase transactions relayed as transaction messages. You could=
=20
> restate it as:
>
> - receive block p2p messages
> - if the first tx's first input does not have a null point, reject the=20
> block
>
> > If I'm understanding correctly, the last rule has for effect to=20
> constraint the transaction space that can be used to brute-force and moun=
t=20
> a Merkle root forgery with a 64-byte coinbase transaction.
> >
> > As described in the 3.1.1 of the paper:=20
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190=
225/a27d8837/attachment-0001.pdf
>
> The above approach makes this malleation computationally infeasible.
>
> >> I'm referring to DoS mitigation (the only relevant security=20
> consideration here). I'm pointing out that invalidity caching is pointles=
s=20
> in all cases, and in this case is the most pointless as type64 malleation=
=20
> is the cheapest of all invalidity to detect. I would prefer that all bogu=
s=20
> blocks sent to my node are of this type. The worst types of invalidity=20
> detection have no mitigation and from a security standpoint are=20
> counterproductive to cache. I'm describing what overall is actually not a=
=20
> tradeoff. It's all negative and no positive.
>
> > I think we're both discussing the same issue about DoS mitigation for=
=20
> sure. Again, I think that saying the "invalidity caching" is pointless in=
=20
> all cases cannot be fully grounded as a statement without precising (a)=
=20
> what is the internal cache(s) layout of the full node processing block=20
> messages and (b) the sha256 mining resources available during N difficult=
y=20
> period and if any miner engage in self-fish mining like strategy.
>
> It has nothing to do with internal cache layout and nothing to do with=20
> mining resources. Not having a cache is clearly more efficient than havin=
g=20
> a cache that provides no advantage, regardless of how the cache is laid=
=20
> out. There is no cost to forcing a node to perform far more block=20
> validation computations than can be precluded by invalidity caching. The=
=20
> caching simply increases the overall computational cost (as would another=
=20
> redundant rule to try and make it more efficient). Discarding invalid=20
> blocks after the minimal amount of work is the most efficient resolution.=
=20
> What one does with the peer at that point is orthogonal (e.g. drop, ban).
>
> > About (a), I'll maintain my point I think it's a classic time-space=20
> trade-off to ponder in function of the internal cache layouts.
>
> An attacker can throw a nearly infinite number of distinct invalid blocks=
=20
> at your node (and all will connect to the chain and show proper PoW). As=
=20
> such you will encounter zero cache hits and therefore nothing but overhea=
d=20
> from the cache. Please explain to me in detail how "cache layout" is goin=
g=20
> to make any difference at all.
>
> > About (b) I think we''ll be back to the headers synchronization strateg=
y=20
> as implemented by a full node to discuss if they're exploitable asymmetri=
es=20
> for self-fish mining like strategies.
>
> I don't see this as a related/relevant topic. There are zero mining=20
> resources required to overflow the invalidity cache. Just as Core recentl=
y=20
> published regarding overflowing to its "ban" store, resulting in process=
=20
> termination, this then introduces another attack vector that must be=20
> mitigated.
>
> > If you can give a pseudo-code example of the "null point" validation=20
> implementation in libbitcoin code (?) I think this can make the=20
> conversation more concrete on the caching aspect.
>
> pseudo-code , not from libbitcoin...
>
> ```
> bool malleated64(block)
> {
>     segregated =3D ((block[80 + 4] =3D=3D 0) and (block[80 + 4 + 1] =3D=
=3D 1))
>     return block[segregated ? 86 : 85] !=3D=20
> 0xffffffff000000000000000000000000000000000000000000000000000000000000000=
0
> }
> ```
>
> Obviously there is no error handling (e.g. block too small, too many=20
> inputs, etc.) but that is not relevant to the particular question. The=20
> block.header is fixed size, always 80 bytes. The tx.version is also fixed=
,=20
> always 4 bytes. A following 0 implies a segregated witness (otherwise it'=
s=20
> the input count), assuming there is a following 1. The first and only inp=
ut=20
> for the coinbase tx, which must be the first block tx, follows. If it doe=
s=20
> not match=20
> 0xffffffff000000000000000000000000000000000000000000000000000000000000000=
0=20
> then the block is invalid. If it does match, it is computationally=20
> infeasible that the merkle root is type64 malleated. That's it, absolutel=
y=20
> trivial and with no prerequisites. The only thing that even makes it=20
> interesting is the segwit bifurcation.
>
> >> Rust has its own set of problems. No need to get into a language Jihad=
=20
> here. My point was to clarify that the particular question was not about =
a=20
> C (or C++) null pointer value, either on the surface or underneath an=20
> abstraction.
>
> > Thanks for the additional comments on libbitcoin usage of dependencies,=
=20
> yes I don't think there is a need to get into a language jihad here. It's=
=20
> just like all languages have their memory model (stack, dynamic alloc,=20
> smart pointers, etc) and when you're talking about performance it's usefu=
l=20
> to have their minds, imho.
>
> Sure, but no language difference that I'm aware of could have any bearing=
=20
> on this particular question.
>
> Best,
> Eric
>

--=20
You received this message because you are subscribed to the Google Groups "=
Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to bitcoindev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/=
78e8248d-bc77-452f-ac7e-19c28cbc3280n%40googlegroups.com.

------=_Part_150667_1323905878.1732771117097
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Eric,<br /><br />Going back to this thread with a bit of delay...<br /><=
br />tl;dr: See specifically comment on the lack of proof that invalidating=
 64-byte transactions are actually solving merkle root weaknesses that coul=
d lead to a fork or unravel SPV clients.<br /><br />&gt; I'm not sure what =
you mean by stating that a new consensus rule, "could be a low memory overh=
ead". Checking all tx sizes is far more overhead than validating the coinba=
se for a null point. As AntoineP agreed, it cannot be done earlier, and I h=
ave shown that it is *significantly* more computationally intensive. It mak=
es the determination much more costly and in all other cases by adding an a=
dditional check that serves no purpose.<br /><br />I think for any (new) co=
nsensus rule, we shall be able to evalute its implication in term of at lea=
st 2 dimensions a) memory overhead (e.g do a full-node needs more memory to=
 validate post-segwit blocks now that witness fields are discounted ?) and =
b) computational overhead (e.g do a full-node needs more CPU cycles to vali=
date confidential transactions perdersen commitments ?). A same consensus r=
ule can achieve the same effect e.g reducing headers merkle tree ambiguitie=
s, with completely different memory or computational cost. For the checking=
 all tx sizes vs validating the coinbase for a null point, indeed I agree w=
ith you that the latter is intuitively better on the 2 dimensions.<br /><br=
 />&gt; I think you misunderstood me. Of course the witness commitment must=
 be validated (as I said, "Yet it remains necessary to validate the witness=
 commitment..."), as otherwise the witnesses within a block can be anything=
 without affecting the block hash. And of course the witness commitment is =
computed in the same manner as the tx commitment and is therefore subject t=
o the same malleations. However, because the coinbase tx is committed to th=
e block hash, there is no need to guard the witness commitment for malleati=
on. And to my knowledge nobody has proposed doing so.<br /><br />Yes, we mi=
sunderstood each other here.<br /><br />&gt; It cannot, that was my point: =
"(1) have distinct identity due to another header property deviation, or (2=
) are the same block..."<br /><br />Ok.<br /><br />&gt; This was already th=
e presumption.<br /><br />Ok.<br /><br />&gt; I'm not seeing the connection=
 here. Are you suggesting that tx and block hashes may collide with each ot=
her? Or that that a block message may be confused with a transaction messag=
e?<br /><br />This was about how to deal with types of invalid block messag=
es in bitcoin core that could sources of denial-of-service. E.g invalid bit=
coin block hash for a message with unparsable data (#1 and #3 in your typol=
ogy.<br />My point was bitcoin core is making some assumption at block down=
load to fetch them from outbound peers rather than inbound. Outboud peers a=
re assumed to be more reliable, as the connection is attempted from the out=
side.<br />The point in my point was that outbound-block-relay connections =
where initially introduce to alleviate those types of concerns i.e tx probe=
 to infer the topology, see https://arxiv.org/pdf/1812.00942<br /><br />&gt=
; This does not mitigate the issue. It's essentially dead code. It's exactl=
y like saying, "there's an arbitrary number of holes in the bucket, but we =
can plug a subset of those holes." Infinite minus any number is still infin=
ite.<br /><br />I disagree with you here - If the fundamental problem is ef=
ficiently caching identity in the case of block invalidity, one cannot igno=
re a robust peeering policy, i.e you pick up peers allowed a scarce connect=
ion slot.<br />This is indeed useless if you don't have first an efficient =
verification algorithm to determine the block invalidity, though it's part =
of the overall equation.<br />While infinite minus any number is of course =
still infinite, thinking security by layers it's the base you can have a se=
cure signature verification algorithm still on a hot computing host.<br /><=
br />&gt; I don't follow this statement. The term "usable" was specifically=
 addressing the proposal - that a header hash must uniquely identify a bloc=
k (a header and committed set of txs) as valid or otherwise. As I have poin=
ted out, this will still not be the case if 64 byte blocks are invalidated.=
 It is also not the case that detection of type64 malleated blocks can be m=
ade more performant if 64 byte txs are globally invalid. In fact the opposi=
te is true, it becomes more costly (and complex) and is therefore just dead=
 code.<br /><br />Okay, in my statement, the term "usable" was to be unders=
tood as any meaningful bit of information that can lead to computationally-=
hard-to-forge progress in the determination problem you laid out here:<br /=
>https://groups.google.com/g/bitcoindev/c/CAfm7D5ppjo/m/T1-HKqSLAAAJ<br /><=
br />&gt; Headers first only defers malleation checks. The same checks are =
necessary whether you perform blocks first or headers first sync (we suppor=
t both protocol levels). The only difference is that for headers first, a s=
tored header might later become invalidated. However, this is the case with=
 and without the possibility of malleation.<br /><br />Yes, I agree with yo=
u here, a stored header might become invalidated, e.g a reorg-out tx commit=
ted in the header's merkle tree after the header reception.<br /><br />&gt;=
 I have not suggested that anything is waived or ignored here. I'm stating =
that there is no "mempool" performance benefit whatsoever to invalidating 6=
4 byte txs. Mempool caching could only rely on tx identifiers, not block id=
entifiers. Tx identifiers are not at issue.<br /><br />Once again, if the g=
oal is an efficient algorithm making progress to determinate a block invali=
dity, and as such reducing the denial-of-service surface, caching signature=
s which are committed in the wtixd tree or in the txid tree is a plus.<br /=
>Though yes, I agree there is no "mempool" performance benefit to invalidat=
e the 64 byte tx.<br /><br />&gt; I don't know how to add any more detail t=
han I already have. There are three relevant considerations:<br />&gt; <br =
/>&gt; (1) block hashes will not become unique identifiers for block messag=
es.<br />&gt; (2) the earliest point at which type64 malleation can be dete=
cted will not be reduced.<br />&gt; (3) the necessary cost of type64 mallea=
ted determination will not be reduced.<br />&gt; (4) the additional consens=
us rule will increase validation cost and code complexity.<br />&gt; (5) in=
valid blocks can still be produced at no cost that require full double tx h=
ashing/Merkle root computations.<br />&gt; <br />&gt; Which of these statem=
ents are not evident at this point?<br /><br />That's five statements, not =
three. Minding implementation-dependent considerations, I'm leaning to agre=
e with up to (4) included.<br />About (5), I don't see how it makes sense t=
hat invalid blocks can be still produced at not cost, at least pow should b=
e the first thing first to be verified.<br />Like this statement could be c=
larified what you mean by this.<br /><br />&gt; No, no part of this thread =
has any bearing on p2p transaction messages - nor are coinbase transactions=
 relayed as transaction messages. You could restate it as:<br />&gt; <br />=
&gt; - receive block p2p messages<br />&gt; - if the first tx's first input=
 does not have a null point, reject the block<br /><br />I don't believe we=
 can fully dissociate bearing on p2p blocks / transactions messages, from t=
he overall goal of reducing denial-of-service arising from invalid blocks. =
How can you be sure the block is invalid until you validate all txn ? Thoug=
h let's waiwe this observation for the present point.<br /><br />The idea o=
f exploiting block malleability is to grind one transaction T0 for a block =
B such that H(T0) =3D=3D H(H(T1) || H(T2)) =3D=3D B's Root. I.e to have T0 =
=3D=3D H(T1) || H(T2). T0 can be consensus valid or invalid to provoke a co=
nsensus fork (it's the collision in the deserialization which is the source=
 of the merkle tree root weakness). The first transaction in the block is n=
ecessarily the coinbase per current consensus rules. Checking that T0 is a =
valid coinbase transaction is sufficient to reject the block. Grinding 64-b=
yte transactions that all deserialize as valid transactions, including the =
null point requirement is computationally infeasible.<br /><br />I'm not su=
re that even if we get ride of 64-byte transactions, we would remove the me=
rkle root weaknesses. Back to the previous example, one could find T3 and T=
4 such that H(H(T3) || H(T4)) is equivalent to H(H(T1) || H(T2)). Of course=
, it would consist in breaking SHA256, which is deemed computationally infe=
asible. I'm not even sure the header verifcation algorithms gains second-pr=
eimage resistance from forbidding 64-byte transaction.<br /><br />So I thin=
k more that the minimal sufficient check to reject a block should be more c=
arefully analyzed, rather than advocating that forbidding some magic value =
obviously fix an issue, in the present the bitcoin's merkle root weaknesses=
.<br /><br />&gt; The above approach makes this malleation computationally =
infeasible.<br /><br />I'm intuitively leaning so, though see comments abov=
e that it should be more carefully thought about.<br /><br />&gt; It has no=
thing to do with internal cache layout and nothing to do with mining resour=
ces. Not having a cache is clearly more efficient than having a cache that =
provides no advantage, regardless of how the cache is laid out. There is no=
 cost to forcing a node to perform far more block validation computations t=
han can be precluded by invalidity caching. The caching simply increases th=
e overall computational cost (as would another redundant rule to try and ma=
ke it more efficient). Discarding invalid blocks after the minimal amount o=
f work is the most efficient resolution. What one does with the peer at tha=
t point is orthogonal (e.g. drop, ban).<br /><br />I disagree here - If the=
 goal is an efficient algorithm making progress to determinate a block inva=
lidity, and then being able to re-use a run of this algorithm when a blocks=
 occurs again, having a cache widens the range of algorithmsone can design.=
 Same with the mining ressources, if it's to consider denial-of-services an=
d an attacker could be able to fully forge blocks. If such invalidity cachi=
ng strategy was efficient it would actually minimize or erase the cost for =
a node to perform more block validation computations. Where yes I share you=
 opinion is that an ill-designed caching could increase the overall computa=
tional cost, and that discarding invalid blocks after the minimal amount of=
 work is the most efficient resolution for the 1st seen, though it doesn't =
say on the next N seen. Having the signatures already validated could be ob=
viously a win, even with a blind, decaying cache, it's all about the memory=
 space of an<br />average full-node.<br /><br />&gt; An attacker can throw =
a nearly infinite number of distinct invalid blocks at your node (and all w=
ill connect to the chain and show proper PoW). As such you will encounter z=
ero cache hits and therefore nothing but overhead from the cache. Please ex=
plain to me in detail how "cache layout" is going to make any difference at=
 all.<br /><br />Going back to your typology from (1) to (9), e.g for the s=
tep 9 to determine if a block message with valid header but unmalleated com=
mitted valid tx data.<br />If you have seen the block message a first-time,=
 though it wasn't yet on the longest pow-chain and you disregarded its vali=
dation.<br /><br />&gt; I don't see this as a related/relevant topic. There=
 are zero mining resources required to overflow the invalidity cache. Just =
as Core recently published regarding overflowing to its "ban" store, result=
ing in process termination, this then introduces another attack vector that=
 must be mitigated.<br /><br />Depends if your invalidity cache is safeguar=
ded by a minimal valid proof-of-work. I'm certaintly not going to defend th=
at all bitcoin core internal caches and stores are well-designed for advers=
arials environments.<br /><br />&gt; pseudo-code , not from libbitcoin...<b=
r />&gt; <br />&gt; ```<br />&gt; bool malleated64(block)<br />&gt; {<br />=
&gt; =C2=A0 =C2=A0 segregated =3D ((block[80 + 4] =3D=3D 0) and (block[80 +=
 4 + 1] =3D=3D 1))<br />&gt; =C2=A0 =C2=A0 return block[segregated ? 86 : 8=
5] !=3D 0xffffffff000000000000000000000000000000000000000000000000000000000=
0000000<br />&gt; }<br />&gt; ```<br />&gt; <br />&gt; Obviously there is n=
o error handling (e.g. block too small, too many inputs, etc.) but that is =
not relevant to the particular question. The block.header is fixed size, al=
ways 80 bytes. The tx.version is also fixed, always 4 bytes. A following 0 =
implies a segregated witness (otherwise it's the input count), assuming the=
re is a following 1. The first and only input for the coinbase tx, which mu=
st be the first block tx, follows. If it does not match 0xffffffff000000000=
0000000000000000000000000000000000000000000000000000000 then the block is i=
nvalid. If it does match, it is computationally infeasible that the merkle =
root is type64 malleated. That's it, absolutely trivial and with no prerequ=
isites. The only thing that even makes it interesting is the segwit bifurca=
tion.<br /><br />Thanks for the example with the segwit bifurcation for the=
 marker. By the way, the segwit marker is documented in BIP144, which is in=
correctly labeled as "Peer Services", though obviously misimplementing the =
transaction ser / deser algorithm for segwit blocks would lead to consensus=
 divergence (what if you expect the "flag" to be 0xff and not 0x01 ?). Pers=
onally, I think it's a good example of how tedious consensus changes can be=
, when even documents for inter-compatibility about consensus changes are n=
ot drawing a clear line between what is consensus and what are p2p rules...=
<br /><br />&gt; Sure, but no language difference that I'm aware of could h=
ave any bearing on this particular question.<br /><br />Same, I don't see l=
anguage difference that could have bearing on this question, at that level =
of granularity.<br /><br />Best,<br />Antoine R<br />ots hash: 3d5ed1718683=
ce1e864751a2eccf21908ed3b11079f183cdf863729d71ae3f36<br /><div class=3D"gma=
il_quote"><div dir=3D"auto" class=3D"gmail_attr">Le samedi 20 juillet 2024 =
=C3=A0 21:51:27 UTC+1, Eric Voskuil a =C3=A9crit=C2=A0:<br/></div><blockquo=
te class=3D"gmail_quote" style=3D"margin: 0 0 0 0.8ex; border-left: 1px sol=
id rgb(204, 204, 204); padding-left: 1ex;">Hi Antoine R,<br><br>&gt;&gt; Wh=
ile at some level the block message buffer would generally be referenced by=
 one or more C pointers, the difference between a valid coinbase input (i.e=
. with a &quot;null point&quot;) and any other input, is not nullptr vs. !n=
ullptr. A &quot;null point&quot; is a 36 byte value, 32 0x00 byes followed =
by 4 0xff bytes. In his infinite wisdom Satoshi decided it was better (or e=
asier) to serialize a first block tx (coinbase) with an input containing an=
 unusable script and pointing to an invalid [tx:index] tuple (input point) =
as opposed to just not having any input. That invalid input point is called=
 a &quot;null point&quot;, and of course cannot be pointed to by a &quot;nu=
ll pointer&quot;. The coinbase must be identified by comparing those 36 byt=
es to the well-known null point value (and if this does not match the Merkl=
e hash cannot have been type64 malleated).<br><br>&gt; Good for the clarifi=
cation here, I had in mind the core&#39;s `CheckBlock` path where the first=
 block transaction pointer is dereferenced to verify if the transaction is =
a coinbase (i.e a &quot;null point&quot; where the prevout is null). Zoomin=
g out and back to my remark, I think this is correct that adding a new 64 b=
yte size check on all block transactions to detect block hash invalidity co=
uld be a low memory overhead (implementation dependant), rather than making=
 that 64 byte check alone on the coinbase transaction as in my understandin=
g you&#39;re proposing.<br><br>I&#39;m not sure what you mean by stating th=
at a new consensus rule, &quot;could be a low memory overhead&quot;. Checki=
ng all tx sizes is far more overhead than validating the coinbase for a nul=
l point. As AntoineP agreed, it cannot be done earlier, and I have shown th=
at it is *significantly* more computationally intensive. It makes the deter=
mination much more costly and in all other cases by adding an additional ch=
eck that serves no purpose.<br><br>&gt;&gt;&gt; The second one is the bip14=
1 wtxid commitment in one of the coinbase transaction `scriptpubkey` output=
, which is itself covered by a txid in the merkle tree.<br><br>&gt;&gt; Whi=
le symmetry seems to imply that the witness commitment would be malleable, =
just as the txs commitment, this is not the case. If the tx commitment is c=
orrect it is computationally infeasible for the witness commitment to be ma=
lleated, as the witness commitment incorporates each full tx (with witness,=
 sentinel, and marker). As such the block identifier, which relies only on =
the header and tx commitment, is a sufficient identifier. Yet it remains ne=
cessary to validate the witness commitment to ensure that the correct witne=
ss data has been provided in the block message.<br>&gt;&gt;<br>&gt;&gt; The=
 second type of malleability, in addition to type64, is what we call type32=
. This is the consequence of duplicated trailing sets of txs (and therefore=
 tx hashes) in a block message. This is applicable to some but not all bloc=
ks, as a function of the number of txs contained.<br><br>&gt; To precise mo=
re your statement in describing source of malleability. The witness stack c=
an be malleated altering the wtxid and yet still valid. I think you can sti=
ll have the case where you&#39;re feeded a block header with a merkle root =
commitment deserializing to a valid coinbase transaction with an invalid wi=
tness commitment. This is the case of a &quot;block message with valid head=
er but malleatead committed valid tx data&quot;. Validation of the witness =
commitment to ensure the correct witness data has been provided in the bloc=
k message is indeed necessary.<br><br>I think you misunderstood me. Of cour=
se the witness commitment must be validated (as I said, &quot;Yet it remain=
s necessary to validate the witness commitment...&quot;), as otherwise the =
witnesses within a block can be anything without affecting the block hash. =
And of course the witness commitment is computed in the same manner as the =
tx commitment and is therefore subject to the same malleations. However, be=
cause the coinbase tx is committed to the block hash, there is no need to g=
uard the witness commitment for malleation. And to my knowledge nobody has =
proposed doing so.<br><br>&gt;&gt;&gt; I think I mostly agree with the iden=
tity issue as laid out so far, there is one caveat to add if you&#39;re con=
sidering identity caching as the problem solved. A validation node might ha=
ve to consider differently block messages processed if they connect on the =
longest most PoW valid chain for which all blocks have been validated. Or a=
lternatively if they have to be added on a candidate longest most PoW valid=
 chain.<br><br>&gt;&gt; Certainly an important consideration. We store both=
 types. Once there is a stronger candidate header chain we store the header=
s and proceed to obtaining the blocks (if we don&#39;t already have them). =
The blocks are stored in the same table; the confirmed vs. candidate indexe=
s simply point to them as applicable. It is feasible (and has happened twic=
e) for two blocks to share the very same coinbase tx, even with either/all =
bip30/34/90 active (and setting aside future issues here for the sake of si=
mplicity). This remains only because two competing branches can have blocks=
 at the same height, and bip34 requires only height in the coinbase input s=
cript. This therefore implies the same transaction but distinct blocks. It =
is however infeasible for one block to exist in multiple distinct chains. I=
n order for this to happen two blocks at the same height must have the same=
 coinbase (ok), and also the same parent (ok). But this then means that the=
y either (1) have distinct identity due to another header property deviatio=
n, or (2) are the same block with the same parent and are therefore in just=
 one chain. So I don&#39;t see an actual caveat. I&#39;m not certain if thi=
s is the ambiguity that you were referring to. If not please feel free to c=
larify.<br><br>&gt; If you assume no network partition and the no blocks mo=
re than 2h in the future consensus rule, I cannot see how one block with no=
 header property deviation can exist in multiple distinct chains.<br><br>It=
 cannot, that was my point: &quot;(1) have distinct identity due to another=
 header property deviation, or (2) are the same block...&quot;<br><br>&gt; =
The ambiguity I was referring was about a different angle, if the design go=
al of introducing a 64 byte size check is to &quot;it was about being able =
to cache the hash of a (non-malleated) invalid block as permanently invalid=
 to avoid re-downloading and re-validating it&quot;, in my thinking we shal=
l consider the whole block headers caching strategy and be sure we don&#39;=
t get situations where an attacker can attach a chain of low-pow block head=
ers with malleated committed valid tx data yielding a block invalidity at t=
he end, provoking as a side-effect a network-wide data download blowup. So =
I think any implementation of the validation of a block validity, of which =
identity is a sub-problem, should be strictly ordered by adequate proof-of-=
work checks.<br><br>This was already the presumption.<br><br>&gt;&gt; We do=
n&#39;t do this and I don&#39;t see how it would be relevant. If a peer pro=
vides any invalid message or otherwise violates the protocol it is simply d=
ropped.<br>&gt;&gt;<br>&gt;&gt; The &quot;problematic&quot; that I&#39;m re=
ferring to is the reliance on the block hash as a message identifier, becau=
se it does not identify the message and cannot be useful in an effectively =
unlimited number of zero-cost cases.<br><br>&gt; Historically, it was to is=
olate transaction-relay from block-relay to optimistically harden in face o=
f network partition, as this is easy to infer transaction-relay topology wi=
th a lot of heuristics.<br><br>I&#39;m not seeing the connection here. Are =
you suggesting that tx and block hashes may collide with each other? Or tha=
t that a block message may be confused with a transaction message?<br><br>&=
gt; I think this is correct that block hash message cannot be relied on as =
it cannot be useful in an unlimited number of zero-cost cases, as I was poi=
nting that bitcoin core partially mitigate that with discouraging connectio=
ns to block-relay peers servicing block messages (`MaybePunishNodeForBlocks=
`).<br><br>This does not mitigate the issue. It&#39;s essentially dead code=
. It&#39;s exactly like saying, &quot;there&#39;s an arbitrary number of ho=
les in the bucket, but we can plug a subset of those holes.&quot; Infinite =
minus any number is still infinite.<br><br>&gt; I believe somehow the bottl=
eneck we&#39;re circling around is computationally definining what are the =
&quot;usable&quot; identifiers for block messages. The most straightforward=
 answer to this question is the full block in one single peer message, at l=
east in my perspective.<br><br>I don&#39;t follow this statement. The term =
&quot;usable&quot; was specifically addressing the proposal - that a header=
 hash must uniquely identify a block (a header and committed set of txs) as=
 valid or otherwise. As I have pointed out, this will still not be the case=
 if 64 byte blocks are invalidated. It is also not the case that detection =
of type64 malleated blocks can be made more performant if 64 byte txs are g=
lobally invalid. In fact the opposite is true, it becomes more costly (and =
complex) and is therefore just dead code.<br><br>&gt; Reality since headers=
 first synchronization (`getheaders`), block validation has been dissociate=
d in steps for performance reasons, among others.<br><br>Headers first only=
 defers malleation checks. The same checks are necessary whether you perfor=
m blocks first or headers first sync (we support both protocol levels). The=
 only difference is that for headers first, a stored header might later bec=
ome invalidated. However, this is the case with and without the possibility=
 of malleation.<br><br>&gt;&gt; Again, this has no relation to tx hashes/id=
entifiers. Libbitcoin has a tx pool, we just don&#39;t store them in RAM (m=
emory).<br>&gt;&gt;<br>&gt;&gt; I don&#39;t follow this. An invalid 64 byte=
 tx consensus rule would definitely not make it harder to exploit block mes=
sage invalidity. In fact it would just slow down validation by adding a red=
undant rule. Furthermore, as I have detailed in a previous message, caching=
 invalidity does absolutely nothing to increase protection. In fact it make=
s the situation materially worse.<br><br>&gt; Just to recall, in my underst=
anding the proposal we&#39;re discussing is about outlawing 64 bytes size t=
ransactions at the consensus-level to minimize denial-of-service vectors du=
ring block validation. I think we&#39;re talking about each other because t=
he mempool already introduce a layer of caching in bitcoin core, of which t=
he result are re-used at block validation, such as signature verification r=
esults. I&#39;m not sure we can fully waive apart performance consideration=
s, though I agree implementation architecture subsystems like mempool shoul=
d only be a sideline considerations.<br><br>I have not suggested that anyth=
ing is waived or ignored here. I&#39;m stating that there is no &quot;mempo=
ol&quot; performance benefit whatsoever to invalidating 64 byte txs. Mempoo=
l caching could only rely on tx identifiers, not block identifiers. Tx iden=
tifiers are not at issue.<br><br>&gt;&gt; No, this is not the case. As I de=
tailed in my previous message, there is no possible scenario where invalida=
tion caching does anything but make the situation materially worse.<br><br>=
&gt; I think this can be correct that invalidation caching make the situati=
on materially worse, or is denial-of-service neutral, as I believe a full n=
ode is only trading space for time resources in matters of block messages v=
alidation. I still believe such analysis, as detailed in your previous mess=
age, would benefit to be more detailed.<br><br>I don&#39;t know how to add =
any more detail than I already have. There are three relevant consideration=
s:<br><br>(1) block hashes will not become unique identifiers for block mes=
sages.<br>(2) the earliest point at which type64 malleation can be detected=
 will not be reduced.<br>(3) the necessary cost of type64=C2=A0malleated=C2=
=A0determination will not be reduced.<br>(4) the additional consensus rule =
will increase validation cost and code complexity.<br>(5) invalid blocks ca=
n still be produced at no cost that require full double tx hashing/Merkle r=
oot computations.<br><br>Which of these statements are not evident at this =
point?<br><br>&gt;&gt; On the other hand, just dealing with parse failure o=
n the spot by introducing a leading pattern in the stream just inflates the=
 size of p2p messages, and the transaction-relay bandwidth cost.<br>&gt;&gt=
;<br>&gt;&gt; I think you misunderstood me. I am suggesting no change to se=
rialization. I can see how it might be unclear, but I said, &quot;nothing p=
recludes incorporating a requirement for a necessary leading pattern in the=
 stream.&quot; I meant that the parser can simply incorporate the *requirem=
ent* that the byte stream starts with a null input point. That identifies t=
he malleation or invalidity without a single hash operation and while only =
reading a handful of bytes. No change to any messages.<br><br>&gt; Indeed, =
this is clearer with the re-explanation above about what you meant by the &=
quot;null point&quot;.<br><br>Ok<br><br>&gt; In my understanding, you&#39;r=
e suggesting the following algorithm:<br>&gt; - receive transaction p2p mes=
sages<br>&gt; - deserialize transaction p2p messages<br>&gt; - if the trans=
action is a coinbase candidate, verify null input point<br>&gt; - if null i=
nput point pattern invalid, reject the transaction<br><br>No, no part of th=
is thread has any bearing on p2p transaction messages - nor are coinbase tr=
ansactions relayed as transaction messages. You could restate it as:<br><br=
>- receive block p2p messages<br>- if the first tx&#39;s first input does n=
ot have a null point, reject the block<br><br>&gt; If I&#39;m understanding=
 correctly, the last rule has for effect to constraint the transaction spac=
e that can be used to brute-force and mount a Merkle root forgery with a 64=
-byte coinbase transaction.<br>&gt;<br>&gt; As described in the 3.1.1 of th=
e paper: <a href=3D"https://lists.linuxfoundation.org/pipermail/bitcoin-dev=
/attachments/20190225/a27d8837/attachment-0001.pdf" target=3D"_blank" rel=
=3D"nofollow" data-saferedirecturl=3D"https://www.google.com/url?hl=3Dfr&am=
p;q=3Dhttps://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/2=
0190225/a27d8837/attachment-0001.pdf&amp;source=3Dgmail&amp;ust=3D173285743=
2040000&amp;usg=3DAOvVaw04td23bqKDQYDkWTx_hMhq">https://lists.linuxfoundati=
on.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.=
pdf</a><br><br>The above approach makes this malleation computationally inf=
easible.<br><br>&gt;&gt; I&#39;m referring to DoS mitigation (the only rele=
vant security consideration here). I&#39;m pointing out that invalidity cac=
hing is pointless in all cases, and in this case is the most pointless as t=
ype64 malleation is the cheapest of all invalidity to detect. I would prefe=
r that all bogus blocks sent to my node are of this type. The worst types o=
f invalidity detection have no mitigation and from a security standpoint ar=
e counterproductive to cache. I&#39;m describing what overall is actually n=
ot a tradeoff. It&#39;s all negative and no positive.<br><br>&gt; I think w=
e&#39;re both discussing the same issue about DoS mitigation for sure. Agai=
n, I think that saying the &quot;invalidity caching&quot; is pointless in a=
ll cases cannot be fully grounded as a statement without precising (a) what=
 is the internal cache(s) layout of the full node processing block messages=
 and (b) the sha256 mining resources available during N difficulty period a=
nd if any miner engage in self-fish mining like strategy.<br><br>It has not=
hing to do with internal cache layout and nothing to do with mining resourc=
es. Not having a cache is clearly more efficient than having a cache that p=
rovides no advantage, regardless of how the cache is laid out. There is no =
cost to forcing a node to perform far more block validation computations th=
an can be precluded by invalidity caching. The caching simply increases the=
 overall computational cost (as would another redundant rule to try and mak=
e it more efficient). Discarding invalid blocks after the minimal amount of=
 work is the most efficient resolution. What one does with the peer at that=
 point is orthogonal (e.g. drop, ban).<br><br>&gt; About (a), I&#39;ll main=
tain my point I think it&#39;s a classic time-space trade-off to ponder in =
function of the internal cache layouts.<br><br>An attacker can throw a near=
ly infinite number of distinct invalid blocks at your node (and all will co=
nnect to the chain and show proper PoW). As such you will encounter zero ca=
che hits and therefore nothing but overhead from the cache. Please explain =
to me in detail how &quot;cache layout&quot; is going to make any differenc=
e at all.<br><br>&gt; About (b) I think we&#39;&#39;ll be back to the heade=
rs synchronization strategy as implemented by a full node to discuss if the=
y&#39;re exploitable asymmetries for self-fish mining like strategies.<br><=
br>I don&#39;t see this as a related/relevant topic. There are zero mining =
resources required to overflow the invalidity cache. Just as Core recently =
published regarding overflowing to its &quot;ban&quot; store, resulting in =
process termination, this then introduces another attack vector that must b=
e mitigated.<br><br>&gt; If you can give a pseudo-code example of the &quot=
;null point&quot; validation implementation in libbitcoin code (?) I think =
this can make the conversation more concrete on the caching aspect.<br><br>=
pseudo-code=C2=A0, not from libbitcoin...<br><br>```<br>bool malleated64(bl=
ock)<br>{<br>=C2=A0 =C2=A0 segregated =3D ((block[80 + 4] =3D=3D 0) and (bl=
ock[80 + 4 + 1] =3D=3D 1))<br>=C2=A0 =C2=A0 return block[segregated ? 86 : =
85] !=3D 0xffffffff00000000000000000000000000000000000000000000000000000000=
00000000<br>}<br>```<br><br>Obviously there is no error handling (e.g. bloc=
k too small, too many inputs, etc.) but that is not relevant to the particu=
lar question. The block.header is fixed size, always 80 bytes. The tx.versi=
on is also fixed, always 4 bytes. A following 0 implies a segregated witnes=
s (otherwise it&#39;s the input count), assuming there is a following 1. Th=
e first and only input for the coinbase tx, which must be the first block t=
x, follows. If it does not match 0xffffffff00000000000000000000000000000000=
00000000000000000000000000000000 then the block is invalid. If it does matc=
h, it is computationally infeasible that the merkle root is type64 malleate=
d. That&#39;s it, absolutely trivial and with no prerequisites. The only th=
ing that even makes it interesting is the segwit bifurcation.<br><br>&gt;&g=
t; Rust has its own set of problems. No need to get into a language Jihad h=
ere. My point was to clarify that the particular question was not about a C=
 (or C++) null pointer value, either on the surface or underneath an abstra=
ction.<br><br>&gt; Thanks for the additional comments on libbitcoin usage o=
f dependencies, yes I don&#39;t think there is a need to get into a languag=
e jihad here. It&#39;s just like all languages have their memory model (sta=
ck, dynamic alloc, smart pointers, etc) and when you&#39;re talking about p=
erformance it&#39;s useful to have their minds, imho.<br><br>Sure, but no l=
anguage difference that I&#39;m aware of could have any bearing on this par=
ticular question.<br><br>Best,<br>Eric<br></blockquote></div>

<p></p>

-- <br />
You received this message because you are subscribed to the Google Groups &=
quot;Bitcoin Development Mailing List&quot; group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:bitcoindev+unsubscribe@googlegroups.com">bitcoind=
ev+unsubscribe@googlegroups.com</a>.<br />
To view this discussion visit <a href=3D"https://groups.google.com/d/msgid/=
bitcoindev/78e8248d-bc77-452f-ac7e-19c28cbc3280n%40googlegroups.com?utm_med=
ium=3Demail&utm_source=3Dfooter">https://groups.google.com/d/msgid/bitcoind=
ev/78e8248d-bc77-452f-ac7e-19c28cbc3280n%40googlegroups.com</a>.<br />

------=_Part_150667_1323905878.1732771117097--

------=_Part_150666_732945913.1732771117097--