From: Peter Tschipper <peter.tschipper@gmail.com>
To: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
Date: Tue, 10 Nov 2015 08:17:40 -0800
Message-ID: <564218A4.8070102@gmail.com>
In-Reply-To: <5642172C.701@gmail.com>
References: <5640F172.3010004@gmail.com> <20151109210449.GE5886@mcelrath.org>
 <CAL7-sS0Apm4O_Qi0FmY7=H580rEVD6DYjk2y+ACpZmKqUJTQwA@mail.gmail.com>
 <CALOxbZtTUrZwDfy_jTbs60n=K8RKDGg5X0gkLsh-OX3ikLf1FQ@mail.gmail.com>
 <CAE-z3OUB-se_HUvW2NLjWt=0d5sgMiPEciu0hLzr_HQN0m9fqQ@mail.gmail.com>
 <5642172C.701@gmail.com>
List-Id: Bitcoin Development Discussion <bitcoin-dev.lists.linuxfoundation.org>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------030702010507070103060804"

This is a multi-part message in MIME format.
--------------030702010507070103060804
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

On 10/11/2015 8:11 AM, Peter Tschipper wrote:
> On 10/11/2015 1:44 AM, Tier Nolan via bitcoin-dev wrote:
>> The network protocol is not quite consensus critical, but it is
>> important.
>>
>> Two implementations of the decompressor might not be bug for bug
>> compatible. This (potentially) means that a block could be designed
>> that won't decode properly for some version of the client but would
>> work for another. This would fork the network.
>>
>> A "raw" network library is unlikely to have the same problem.
>>
>> Rather than just compress the stream, you could compress only block
>> messages. A new "cblock" message could be created that is a
>> compressed block. This shouldn't reduce efficiency by much.
>>
> I chose the more generic datastream compression so that we could
> possibly apply it to transactions in the future, but currently all
> that is planned is to compress blocks. That was really my only
> original intent until I saw that there might be some bandwidth
> savings for transactions as well.
>
> The compression could, however, be applied to any datastream, but it
> is not *forced*. Basically it would just be a method call in
> CDataStream, so we could do ss.compress and ss.decompress and apply
> that to blocks, and possibly to transactions if worthwhile, and only
> IF compression is turned on. But there is no intent to apply this to
> every type of message, since most would be too small to benefit from
> compression.
>
> Here are some results from using the code in the PR to
> compress/decompress blocks with zlib compression level 6. The data
> was taken from the first 275K blocks of the mainnet blockchain.
> Clearly, once we get past 10KB we get pretty decent compression, but
> even below that there is some benefit. I'm still collecting data and
> will get the same for the whole blockchain.
>
> range      = block size range
> ubytes     = average size of uncompressed blocks
> cbytes     = average size of compressed blocks
> ctime      = average time to compress
> dtime      = average time to decompress
> cmp_ratio% = compression ratio (size reduction in percent)
> datapoints = number of datapoints taken
>
> range        ubytes  cbytes  ctime  dtime  cmp_ratio%  datapoints
> 0-250b          215     189  0.001  0.000       12.41       79498
> 250-500b        440     405  0.001  0.000        7.82       11903
> 500-1KB         762     702  0.001  0.000        7.83       10448
> 1KB-10KB       4166    3561  0.001  0.000       14.51       50572
> 10KB-100KB    40820   31597  0.005  0.001       22.59       75555
> 100KB-200KB  146238  106320  0.015  0.001       27.30       25024
> 200KB-300KB  242913  175482  0.025  0.002       27.76       20450
> 300KB-400KB  343430  251760  0.034  0.003       26.69        2069
> 400KB-500KB  457448  343495  0.045  0.004       24.91        1889
> 500KB-600KB  540736  424255  0.056  0.007       21.54          90
> 600KB-700KB  647851  506888  0.063  0.007       21.76          59
> 700KB-800KB  749513  586551  0.073  0.007       21.74          48
> 800KB-900KB  859439  652166  0.086  0.008       24.12          39
> 900KB-1MB    952333  725191  0.089  0.009       23.85          78
>
>> If a client fails to decode a cblock, then it can ask for the block
>> to be re-sent as a standard "block" message.
> Interesting idea.
>>
>> This means that it is a pure performance improvement. If problems
>> occur, then the client can just switch back to uncompressed mode for
>> that block.
>>
>> You should look into the block relay system. This gives a larger
>> improvement than simply compressing the stream. The main benefit is
>> latency, but it means that actual blocks don't have to be sent, so
>> it gives a potential 50% compression ratio. Normally, a node receives
>> all the transactions and then those transactions are included later
>> in the block.
>>
> There are better ways of sending new blocks, that's certainly true,
> but for sending historical blocks and sending transactions I don't
> think so.
> This PR is really designed to save bandwidth and not intended to
> be a huge performance improvement in terms of time spent sending.
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>>     On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>>     <bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>>         I think 25% bandwidth savings is certainly considerable,
>>         especially for people running full nodes in countries like
>>         Australia where internet bandwidth is lower and there are
>>         data caps.
>>
>>     This reinforces the idea that such trade-off decisions should be
>>     local and negotiated between peers, not a required feature of
>>     the network P2P.
>>
>>     --
>>     Johnathan Corgan
>>     Corgan Labs - SDR Training and Development Services
>>     http://corganlabs.com
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

--------------030702010507070103060804--
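
The compress/decompress calls and the "cblock" fallback discussed in the thread can be sketched roughly as follows. This is only an illustration using Python's standard zlib module, not the C++ code from the PR: the function names, the `request_raw_block` callback, and the stand-in data are all hypothetical. The `percent_reduction` helper assumes that cmp_ratio% in the table is a size reduction in percent; the table's values are per-range averages, so they need not match a figure computed from the ubytes and cbytes averages exactly.

```python
import zlib

ZLIB_LEVEL = 6  # compression level quoted in the thread


def compress_block(raw: bytes, level: int = ZLIB_LEVEL) -> bytes:
    """Compress a serialized block with zlib (hypothetical helper)."""
    return zlib.compress(raw, level)


def percent_reduction(ubytes: int, cbytes: int) -> float:
    """Size reduction in percent, assumed to correspond to cmp_ratio%."""
    return 100.0 * (ubytes - cbytes) / ubytes


def decode_cblock(payload: bytes, request_raw_block) -> bytes:
    """Decompress a hypothetical "cblock" payload.

    request_raw_block is a hypothetical callback standing in for asking
    the peer to re-send the block as a standard "block" message.
    """
    try:
        return zlib.decompress(payload)
    except zlib.error:
        # Decoding failed: fall back to the uncompressed message.
        return request_raw_block()


if __name__ == "__main__":
    raw = bytes(range(256)) * 64  # stand-in data, not a real block
    packed = compress_block(raw)

    # Normal path: the payload round-trips through zlib.
    assert decode_cblock(packed, lambda: b"") == raw

    # Fallback path: a payload that is not valid zlib data triggers the
    # request for the raw block instead of failing outright.
    assert decode_cblock(b"not zlib data", lambda: raw) == raw

    # Reduction computed from the 10KB-100KB row averages in the table.
    print(f"{percent_reduction(40820, 31597):.2f}%")
```

Keeping the fallback at the message level, as suggested in the thread, means a decoding failure costs one extra round trip for that block rather than affecting the connection as a whole.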