Received: from sog-mx-3.v43.ch3.sourceforge.com ([172.29.43.193] helo=mx.sourceforge.net) by sfs-ml-1.v29.ch3.sourceforge.com with esmtp (Exim 4.76) (envelope-from ) id 1V1nIC-0003jL-3O for bitcoin-development@lists.sourceforge.net; Wed, 24 Jul 2013 00:51:00 +0000 Received-SPF: pass (sog-mx-3.v43.ch3.sourceforge.com: domain of gmail.com designates 209.85.217.179 as permitted sender) client-ip=209.85.217.179; envelope-from=gmaxwell@gmail.com; helo=mail-lb0-f179.google.com; Received: from mail-lb0-f179.google.com ([209.85.217.179]) by sog-mx-3.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128) (Exim 4.76) id 1V1nI7-00034l-J3 for bitcoin-development@lists.sourceforge.net; Wed, 24 Jul 2013 00:50:59 +0000 Received: by mail-lb0-f179.google.com with SMTP id w20so8547lbh.24 for ; Tue, 23 Jul 2013 17:50:48 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.152.45.65 with SMTP id k1mr15817495lam.78.1374627048864; Tue, 23 Jul 2013 17:50:48 -0700 (PDT) Received: by 10.112.160.104 with HTTP; Tue, 23 Jul 2013 17:50:48 -0700 (PDT) In-Reply-To: References: Date: Tue, 23 Jul 2013 17:50:48 -0700 Message-ID: From: Gregory Maxwell To: Greg Troxel Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -1.6 (-) X-Spam-Report: Spam Filtering performed by mx.sourceforge.net. See http://spamassassin.org/tag/ for more details. -1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for sender-domain 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (gmaxwell[at]gmail.com) -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature X-Headers-End: 1V1nI7-00034l-J3 Cc: bitcoin-development@lists.sourceforge.net Subject: Re: [Bitcoin-development] Linux packaging letter X-BeenThere: bitcoin-development@lists.sourceforge.net X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Jul 2013 00:51:00 -0000 On Tue, Jul 23, 2013 at 4:23 PM, Greg Troxel wrote: > Is the repeatable build infrastructure portable (to any reasonable > mostly-POSIX-compliant system, with gcc or clang)? I have the vague It's "portable" to anything that can run the relevant VMs. Uh provided you don't mind cross compiling everything from an unbuntu VM. It certainly would be nice if the trusted-computing-base for gitian were a bit smaller, thats an area for long term improvement for sure. It may need some massaging. The tor project is beginning to use the same infrastructure, so this could be usefully conserved work. Likewise expanding the supported output targets would be great=E2=80=94 tho= ugh in the case of Bitcoin this is bounded by resources to adequately QA builds on alternative targets. > Requiring precise library depdendencies is quite awkward. Certainly > requiring new enough to avoid known bugs is understandable, but that > should be caught at configure time and fail. In some cases packages solving bugs is problematic for Bitcoin. This is something that it seems to take a whiteboard to explain, so I apologize for the opacity of simple email here. From a technical perspective Bitcoin's whole purpose is getting a whole bunch of computers world wide to reach a bit identical agreement on the content of a database, subject to a whole pile of rules, in the face of potentially malicious input, without any trusted parties at all (even the guy you got the software from, assuming you have the resources to audit it). I'll walk through a simple example: Say Bitcoin used a backing database which had an unknown a bug where any item with a key that begins with 0xDEADBEEF returns not found when queried, even if its in the DB. Once discovered, any database library would want to fix that quickly and they'd fix it in a point release without reservation. They might not even release note that particular fix it if went along with some others, it could even be fixed accidentally. Now say that we have a state where half the Bitcoin network is running the old buggy version, and half is running the fixed version. Someone creates a transaction with ID 0xDEADBEEF... and then subsequently spends the output of that transaction. This could be by pure chance or it could be a malicious act. To half the network that spending transaction looks like someone spending coin from nowhere, a violation of the rules. The consensus would then fork, effectively partitioning the network. On each fork any coin could be spent twice, and the fork will only be resolvable by one side or the other abandoning their state (generally the more permissive side would need to be abandoned because the permissive one is tolerant of the restrictive one's behavior) by manually downgrading or patching software. As a result of this parties who believed some of their transactions were safely settled would find them reversed by people who exploited the inconsistent consensus. To deploy such a fix in Bitcoin without creating a risk for participants we need to make a staged revision of the network protocol rules: There would be a protocol update that fixed the database bug _and_ explicitly rejected 0xDEADBEEF transactions until either some far out future date or until triggered by quorum sensing (or both). The users of Bitcoin would all be advised that they had to apply fixes/workaround by the switchover point or be left out of service or vulnerable. At designated time / quorum nodes would simultaneously switch to the new behavior. (Or in some cases, we'd just move the 'bug' into the Bitcoin code so that it can be fixed in the database, and we'd then just keep it forever, depending on how harmful it was to Bitcoin, a one if 4 billion chance of having to rewrite a transaction wouldn't be a big deal) We've done these organized updates to solve problems (as various flaws in Bitcoin itself can present similar consensus risks) several times with great success, typical time horizons spanning for months to years. But it cannot work if the behavior is changed out from under the software. Fortunately, if the number of users running with an uncontrolled consensus relevant inconsistent behavior is small the danger is only to themselves (and, perhaps, their customers). I'm not happy to see anyone get harmed, but it's better if its few people than many. This is part of the reason that it's a "linux packaging letter", since for Bitcoin the combination of uncoordinated patching and non-trivial userbases appears to be currently unique to GNU/Linux systems. Though indeed, the concerns do apply more broadly. > multiple packages is difficult, and runs into A wants only n of C, while > B wants only m. My understanding is that gentoo is actually able to handle this (and does, for Bitcoin)=E2=80=94 and really I presume just about everything else could with enough effort. I certainly wouldn't ask anyone else to do that. If you're really getting into the rathole of building separate libraries just for Bitcoin the value of packaging it goes away. > So if you are talking about running regression tests > with the set of versions of a dependency that are considered reasonable, > and there's therefore a solution to the multiple-package constraint > problem, that seems ok. Running a complete set of tests is a start=E2=80=94 though the unit tests a= re not and cannot be adequate. There is a full systems testing harnesses which should be used on new platforms. Even that though isn't really adequate, as it is currently infeasible to even achieve complete test coverage in things like cryptographic libraries and database environments. This is an area where both the Bitcoin software ecosystem and the greater art of large scale software validation need to mature. You won't hear anyone applauding the fact that harmless looking bugfixes in leveldb, boost, or openssl could be major doom event makers We're not crazy folks who insist on using formally undefined behavior and argue that it should never be changed out from under us. When there is a known risk we will boil the oceans to close it even if we think that the world would be more 'proper' some other way, but for known-unknowns and unknown-unknowns we can only adopt a conservative approach and try to do our best. One of the middle term things we did was internally integrated our validation database library (leveldb). Since we _know_ that its a consistency critical component, and a part of our system that is especially difficult to validate, integrating it meant removing a lot of risk and allowed it to be upgraded with an eye on the Bitcoin-specific consequences. Unfortunately distributions have been patching Bitcoin to unbundle it. Checking versions isn't adequate because, at least in other packages, some distributors frequently backport fixes or apply novel fixes which may not even be shared with upstream. Other considerations may drive us out of external dependencies for many of the consensus parts of Bitcoin. E.g. Pieter has writen an ECDSA library for our specific ECC curve which does signature validation >6x faster than OpenSSL (but it isn't obviously upstreamable due to some differing requirements for constant time operations), at some point we may need to adopt a backing database that is able to produce authentication proofs, etc. Certainly additional clever tests will make undiscovered surprising behavior less likely, though figuring out how to get the tests actually run if they take two hours and use 20GB of disk space is a challenge. ... but today we need to work with what we have, which is fragile in some atypical ways. Part of that is making an effort to make sure that anyone who might create a big footgun event has some idea of the concern space. > It seems like a bug that the package will build on BE systems and then > fail tests. If it's known not to be ok, it seems that absent some > configure-time flag the build should fail. Configure time? At the moment Bitcoin is built with a straight forward makefile (though there is a switch to autotools in the pipeline). BE isn't currently supported (and I believe this is well documented in the package). Fixing this would be nice, patches accepted. There was an amusing incident a while back where a distributor was refusing to take an update that added unit tests because they revealed failures on BE, nevermind that the application itself instantly failed on BE and never worked there. I believe that has since been resolved. > Asking people not to patch should mean willingnesss to make accomodation > in the master sources for build issues for multiple packaging systems. > I haven't gotten around to packaging this for pkgsrc - so far I only > have the energy to lurk (due to too many things on the todo list). But > I often find that some changes are needed. If you're willing (in > theory) to add in configure flags to control build behavior (in a way > that you can audit and decide is safe), that's great, and of course we > can discuss an actual situation when one gets figured out. I _believe_ (and hope) we've been very accommodating system specific fixes, even for systems we don't formally support. I personally believe that portable software is better software. Portability forces you to dust out nasty cobwebs, reveals dependency on dangerous undefined behavior, encourages intelligent abstractions and appropriate testing, and it invites contributions from more hands and eyes=E2=80=94 I don't care if you use a weird OS: I just want you for y= our code and your bug-reports. So even if we don't consider a platform worth supporting in any rigorous way, we should still be open to fixes and build support. ... although we're typical very much resource bound on testing. Our upstreaming pipeline is often somewhat slow. But it's slow because we are serious about review, even of trivial changes. Being slow is no reason to not submit, even if you make a decision to not block on it (though, if you're doing that you should make the decision in full knowledge of the potential implications). Like all things stepping up and being willing to do the work goes a long way to getting things done.