From: Eric Voskuil
To: Bram Cohen, Bitcoin Protocol Discussion, Gregory Maxwell
Date: Fri, 7 Apr 2017 12:55:58 -0700
Subject: Re: [bitcoin-dev] Using a storage engine without UTXO-index

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 04/07/2017 11:39 AM, Bram Cohen via bitcoin-dev wrote:
> Expanding on this question a bit, it's optimized for parallel
> access, but hard drive access isn't parallel and memory accesses
> are very fast, so shouldn't the target of optimization be about
> cramming as much as possible in memory and minimizing disk
> accesses?

While this may seem to be the case it is not generally optimal.
The question is overly broad as one may or may not be optimizing for
any combination of:

startup time (first usability)
warm-up time (priming)
shutdown time (flush)
fault tolerance (hard shutdown survivability)
top block validation (read speed)
full chain validation (read/write speed)
RAM consumption
Disk consumption
Query response
Servers (big RAM)
Desktops (small RAM)
Mining (fast validation)
Wallets (background performance)
SSD vs. HDD

But even limiting the question to input validation, all of these
considerations (at least) are present. Ideally one wants the simplest
implementation that is optimal under all considerations. While this
may be a unicorn, it is possible to achieve a simple implementation
(relative to alternatives) that allows for the trade-offs necessary to
be managed through configuration (by the user and/or implementation).

Shoving the entire data set into RAM has the obvious problem of
limited RAM. Eventually the OS will be paging more of the data back to
disk (as virtual RAM). In other words this does not scale, as a change
in hardware disproportionately impacts performance. Ideally one wants
the trade between "disk" and "memory" to be made by the underlying
platform, as that is its purpose. Creating one data structure for disk
and another for memory not only increases complexity, but denies the
platform visibility into this trade-off. As such the platform
eventually ends up working directly against the optimization.

An on-disk structure that is not mapped into memory by the application
allows the operating system to maintain as much or as little state in
memory as it considers optimal, given the other tasks that the user
has given it. In the case of memory mapped files (which are optimized
by all operating systems as central to their virtual memory systems)
it is possible for everything from zero to the full store to be memory
resident. Optimization for lower memory platforms then becomes a
process of reducing the need for paging.
This is the purpose of a cache. The seam between disk and memory can
be filled quite nicely by a small amount of cache. On high-RAM systems
any cache is actually a de-optimization, but on low-RAM systems it can
prevent excessive paging. This is directly analogous to a CPU cache.
There are clear optimal points in terms of cache size, and the
implementation and management of such a cache can and should be
internal to a store. Of course a cache cannot provide perfect scale
all the way to zero RAM, but it scales quite well for actual systems.

While a particular drive may not support parallel operations, one
should not assume that a disk-based store does not benefit from
parallelism. Simply refer to the model described above and you will
see that with enough memory the entire blockchain can be
memory-resident, and for high performance operations a fraction of
that is sufficient for a high degree of parallelism.

In practice a cache of about 10k transactions worth of outputs is
optimal for 8GB RAM. This requires just a few blocks for warm-up,
which can be primed in inconsequential time at startup.

Fault tolerance can be managed by flushing after all writes, which
also reduces shutdown time to zero. For higher performance systems,
flushing can be disabled entirely, increasing shutdown time but also
dramatically increasing write performance. Given that the blockchain
is a cache, this is a very reasonable trade-off in some scenarios. The
model works just as well with HDD as with SSD, although certainly SSD
performs better overall.
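
The model above can be sketched roughly as follows. This is only a
minimal illustration under stated assumptions, not the actual store:
the class name MappedStore, the fixed 64-byte record size, the
index-based addressing, the LRU eviction policy and the flush_on_write
switch are all hypothetical choices made for the example. It shows a
memory-mapped file whose residency is left to the operating system, an
optional small cache internal to the store, and flushing as a
configuration option.

```python
import mmap
import os
import tempfile
from collections import OrderedDict

RECORD_SIZE = 64  # hypothetical fixed record width, in bytes


class MappedStore:
    """Fixed-size records in a memory-mapped file with an optional LRU cache."""

    def __init__(self, path, capacity, cache_size=0, flush_on_write=True):
        self._capacity = capacity
        self._cache_size = cache_size          # 0 disables the cache (high RAM)
        self._flush_on_write = flush_on_write  # False trades safety for speed
        self._cache = OrderedDict()
        self.hits = 0
        size = capacity * RECORD_SIZE
        mode = "r+b" if os.path.exists(path) else "w+b"
        self._file = open(path, mode)
        # Size the backing file; a sketch only, a real store would never shrink.
        self._file.truncate(size)
        # The OS decides how much of this mapping is resident in RAM.
        self._map = mmap.mmap(self._file.fileno(), size)

    def _offset(self, index):
        if not 0 <= index < self._capacity:
            raise IndexError("record index out of range")
        return index * RECORD_SIZE

    def _remember(self, index, record):
        if self._cache_size == 0:
            return
        self._cache[index] = record
        self._cache.move_to_end(index)
        while len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict least recently used

    def put(self, index, record):
        if len(record) != RECORD_SIZE:
            raise ValueError("record must be exactly RECORD_SIZE bytes")
        start = self._offset(index)
        self._map[start:start + RECORD_SIZE] = record
        if self._flush_on_write:
            self._map.flush()  # push dirty pages to disk after every write
        self._remember(index, record)

    def get(self, index):
        if index in self._cache:
            self._cache.move_to_end(index)
            self.hits += 1
            return self._cache[index]
        start = self._offset(index)
        record = self._map[start:start + RECORD_SIZE]
        self._remember(index, record)
        return record

    def close(self):
        self._map.flush()  # with flush_on_write=False all the cost lands here
        self._map.close()
        self._file.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "outputs.dat")
    store = MappedStore(path, capacity=1000, cache_size=2)
    store.put(7, b"x" * RECORD_SIZE)
    record = store.get(7)
    store.close()
    return record
```

Setting cache_size=0 corresponds to the high RAM case, where the
mapping alone is sufficient, and flush_on_write=False corresponds to
deferring the flush to shutdown. The sketch is single-threaded; it
makes no attempt to show the parallel access discussed above.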

e
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQEcBAEBCAAGBQJY5+7GAAoJEDzYwH8LXOFOsAsH/3QK55aWH6sAi6OsTwV1FLZV
Y/2SSjwn1vUh55MDkPpCxDwV99JqVwpk0vGM8mGg5s4ZS8sxOPqwGiBz/SZWbF9v
oStJS0DjUPnbYtI/mrC30GuAYVcKnc5DFDHvjX6f0xrLIzViFR7eiW0npUH6Xipt
RI9Mockaf1CqqGExtbIqWal0YDEQGH0ekXRp7uEjh8nPUoKqTVvxDCgqVooQfvfx
EeKX9ruSv/r91EM1JQuH8HBBF7+R24tmMtwbpGx0zrDg5ytpIyrRzVH/ze1Mj2a3
ZxThvofGzhKcDiTPWiJI11DBYUvhSH4Kx0uWLzFUA0gxPfWkZQKJWNDl2CEwljk=
=C7rD
-----END PGP SIGNATURE-----