Beyond bitcoin: Decentralized collaboration
Yurii Rashkovskii (yrashk)
https://twitter.com/kanzure/status/1043432684591230976
Hi everybody. Today I am going to talk about something different from my usual. I am going to talk more about how we compute and how we collaborate. I'll start with an introduction to the history of how things happened and why things are the way they are today. Many of you probably use cloud SaaS applications. It's often touted as a new thing, something happening in the past 10-15 years. If you look at the history, this whole idea of clouds isn't new at all. It started in the 60s with mainframes. Computers were expensive to maintain and very few organizations were able to afford that. It was very convenient for them to pay someone to host the expensive computers and give them access to applications, and to use cheap devices to do that. Back in the day, they had just monitors and keyboards connected to the mainframe. This model has existed for the entire time of computing in the business setting. It existed throughout the 60s, 70s, and 80s, and it slowly morphed into something else as computers got cheaper. Once organizations were able to afford to buy lots of computers and servers for themselves, the advantage was software, and vendors started selling software that works on intranets.
The problem with intranet software was that the complexity of managing such applications became really high. A lot of organizations ended up having their own IT staff. They needed support centers internally just to operate this and train people and manage the devices and have customer support. This helped to create a modern thing called a cost center, which was moved away from the consumers of those applications, and the applications were put on the internet so that organizations wouldn't have to do that on their own.
Still, to this day, from the 60s, our applications look like this- racks of servers that run applications, and we access them with devices that are much more intelligent but they are still acting as terminals to applications running elsewhere. So what cloud delivers today is obviously low cost, where you don't have to pay for the software upfront, and it lowers the opex of running servers; it's convenient, and it's the convention and you're following the herd. It's hard to get fired for something that everyone is doing.
But the problem is that cloud is like fiat software. It's out of control of its own consumers. How many people here have used the Gmail Inbox? Anybody? Good. It is shutting down. It was an experimental platform by Google. You have no choice in whether you can or can't use it. Maybe Google Wave users remember that application? Gmail Inbox is not the only example. There are tons of examples where applications were just shut down. The functionality of those applications is not up to you. If the vendor decides that certain functionality is no longer profitable or not interesting to them, you're out of luck and you either take it or leave it and that's how it's going to be for you.
The other aspect is that, by putting that data elsewhere outside of your own domain, we have that kind of institutional relationship between us, the consumers who are users of those applications, and the vendors. It's feudalism. They have the properties and the facilities to manage your data, but you don't have the programs, you don't own the programs that allow you to use that data. The raw data is not very actionable; what makes it information is code that can process that data.
With the setup that we have so far, we relinquished our ability to control data and use our data and bear the fruits of that data. That's a complicated matter. Even if we believe that the vendor we're using is benevolent and everything will be great, you're still depending on a lot of factors. One of the factors is the fact that we're using the internet.
"Online" is an optical illusion. It really depends on distance, so speed of light is obviously an issue. If Elon Musk eventually puts a colony on Mars, the average time of the signal traveling just from Earth to Mars will be about 14-40 minutes. That definitely prevents us from having any sort of online relationship between us and that colony. But you don't even need to go as far as Mars- distance is not always the factor. The factor can be cost, or complexity of getting connectivity to a certain place.
Imagine a scenario like a freight ship in the industry. You have ships on the sea, and it's prohibitively expensive to be online all the time. They have to rely on tools that allow them to collaborate in batch mode. To this day, most things are done over plaintext email, batched on email: send out what you have when you get a connection. And this way, they can control connectivity cost. However, this is not particularly efficient, and email is not the most efficient tool for managing structured information. So sometimes people would send documents and Excel spreadsheets and they need to work with all the versions and so on.
Another aspect is the proposition that the term "cloud" means to us. We think of "cloud" as fluffy and nice, but in reality it's more like a dark storm. We're competing for different resources with other consumers of those products, like bandwidth and computing resources that the particular vendor offers to us. So we're actually in this traffic jam situation. For us, we don't often notice that. But the reality is that the vendor, the provider of that service, has to bear that cost. They have to be able to scale their applications so they can serve all their customers. If you can imagine an application like Slack, if you need something like that for your own organization, 10 to 10,000 people, that's not a very complex application to scale- you might need 1-2 servers and that will be fine. But if you're Slack, it's a challenging problem that requires engineers working on this and maintaining it and having infrastructure. And you will still have downtime because it's hard to achieve anything close to 100% uptime. This makes the cost much higher than it could have been.
To summarize, those are three cloud concerns- control, connectivity, and complexity. I like to think about "desert computing", where the resources are minimal and you have to survive on your own. Has anyone read Cory Doctorow's book? I often have a problem where my fiance lives on the other side of the planet, so I have a lot of long haul flights of 10+ hours. I can't even program or triage issues in my project while on the flight. I have to download some of them, think about it, and then copy everything back after the flight is done. I talk with a lot of people and some of them just copy-paste their work on the flight. But why are we doing it this way? Why do we have github or jira issues and we can't use them offline? Why is it happening that way? Is it an actual requirement to have a third party to the conversations that we're having?
So I set off to solve this challenge. I came up with a very simple approach. What if we just recorded every intent and every change that we make in those collaborative processes? What if we create a new issue in this context, and place it in a git repository like BugsEverywhere and other distributed issue trackers have done? Can we record what we wanted to do? Can we then replay it when we have connectivity? Here's the SIT Issue Tracker. We can record these actions as files. We can capture the intent and the data associated with an intent, and write it down. The issue tracker you can see here is the web interface and the way it looks these days. It allowed me to work on my issues in flight or at any time. What was interesting about it is that since it worked on files, you don't have to have network connectivity ever. You can write all the new files to a flash drive and pass it on to another person that wants to work on it, and then return to it later and reintegrate those files with what you're working with. This means this could work on Earth, with ships on the sea, and it can work with Mars, as long as we can transfer files, like by radio or flash drives or network connectivity.
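As a rough illustration of the record-and-replay idea described above, here is a minimal Python sketch. It is not SIT's actual file format or code: the function names (`record`, `merge`, `replay`), the JSON layout, and the intent kinds (`create_issue`, `comment`, `close_issue`) are all assumptions made up for this example. Each intent is written as its own new file, so bringing in someone else's work is just copying files that are not there yet.

```python
import json
import shutil
import tempfile
import uuid
from pathlib import Path


def record(store: Path, kind: str, data: dict, timestamp: float) -> Path:
    """Write one intent as its own new file; existing files are never modified."""
    store.mkdir(parents=True, exist_ok=True)
    record_id = uuid.uuid4().hex  # random name, so independent writers do not collide
    path = store / f"{record_id}.json"
    path.write_text(json.dumps(
        {"id": record_id, "kind": kind, "data": data, "timestamp": timestamp}
    ))
    return path


def merge(source: Path, target: Path) -> int:
    """Copy records the target has not seen yet (e.g. from a flash drive)."""
    target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in source.glob("*.json"):
        if not (target / path.name).exists():
            shutil.copy(path, target / path.name)
            copied += 1
    return copied


def replay(store: Path) -> dict:
    """Rebuild the state of every issue by replaying all records in order."""
    records = sorted(
        (json.loads(p.read_text()) for p in store.glob("*.json")),
        key=lambda r: (r["timestamp"], r["id"]),
    )
    issues = {}
    for r in records:
        d = r["data"]
        if r["kind"] == "create_issue":
            issues[d["issue"]] = {"title": d["title"], "state": "open", "comments": []}
        elif r["kind"] == "comment" and d["issue"] in issues:
            issues[d["issue"]]["comments"].append(d["text"])
        elif r["kind"] == "close_issue" and d["issue"] in issues:
            issues[d["issue"]]["state"] = "closed"
    return issues


def demo_offline_merge() -> dict:
    """One issue is created on a laptop; a comment and a close arrive via 'usb'."""
    with tempfile.TemporaryDirectory() as tmp:
        laptop, usb = Path(tmp) / "laptop", Path(tmp) / "usb"
        record(laptop, "create_issue", {"issue": "1", "title": "Crash on start"}, 1.0)
        record(usb, "comment", {"issue": "1", "text": "Reproduced offline"}, 2.0)
        record(usb, "close_issue", {"issue": "1"}, 3.0)
        merge(usb, laptop)
        return replay(laptop)


if __name__ == "__main__":
    print(demo_offline_merge())
```

Ordering by a wall-clock timestamp is the simplest thing that works for a demo; a real system would more likely order records by explicit links between them, since clocks on disconnected machines cannot be trusted to agree.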
It gives you some interesting additional features. How many of you are software developers here? Quite a lot. This also allows you to carry your patches inside of this system and it allows you to put this information, the issues, inside of your repository. Every time you clone your project, you receive the issues along with it. It relies only on regular files. It has worked out great so far. It has been used for a few projects. But then I realized there are a lot more things I want to do with it; I like the simplicity of the approach.
I also want to do bookkeeping and invoicing, expense reports, and other pieces of information in the same way where I can capture the files and derive reports at a later time. I want to be able to make sure that the information is recorded there and checked for integrity each time I use the files. I transitioned this from an issue tracker into an information tracker. It can work on any intent, with any data that is associated with that intent, and record an order of those changes happening to those files and your information.
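To make "capture the files, check integrity, derive reports later" concrete, here is a small Python sketch. It is an illustration under assumptions, not how SIT does it: the record fields (`kind`, `category`, `cents`) and the choice to name each file after the SHA-256 of its content are hypothetical. Naming a file after its hash means any later change to the file is detectable the next time it is read.

```python
import hashlib
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def save(store: Path, record: dict) -> Path:
    """Store a record in a file named after the SHA-256 of its bytes."""
    payload = json.dumps(record, sort_keys=True).encode()
    path = store / hashlib.sha256(payload).hexdigest()
    path.write_bytes(payload)
    return path


def load_verified(store: Path) -> list:
    """Read every record, rejecting any file whose content no longer matches its name."""
    records = []
    for path in sorted(store.iterdir()):
        payload = path.read_bytes()
        if hashlib.sha256(payload).hexdigest() != path.name:
            raise ValueError(f"integrity check failed for {path.name}")
        records.append(json.loads(payload))
    return records


def expense_report(records: list) -> dict:
    """Derive totals per category, in cents, from the recorded expenses."""
    totals = defaultdict(int)
    for r in records:
        if r["kind"] == "expense":
            totals[r["category"]] += r["cents"]
    return dict(totals)


def demo_expense_report() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        save(store, {"kind": "expense", "category": "travel", "cents": 45000})
        save(store, {"kind": "expense", "category": "travel", "cents": 1250})
        save(store, {"kind": "expense", "category": "hosting", "cents": 2000})
        return expense_report(load_verified(store))


if __name__ == "__main__":
    print(demo_expense_report())
```

One limitation of this simplified layout: two byte-identical records map to the same file name, so a real record would need something that makes it unique, such as a timestamp or a link to the record before it.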
In a way, you know how.. there are many projects where people want to put something on the blockchain and they want to record the order of that event. Generally I advise them not to do this, because of cost, or the amount of data they want to put out. Instead, what I recommend now is to use something like this issue tracker, where you can record your information in files that are linked together using the same primitives, where the hash of the information is actually linked to the previous data in the system. Only after that could they basically anchor that information into any other blockchain system, like using OpenTimestamps' git commit wrapper, which can timestamp your git commits.
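The "hash linked to the previous data" primitive can be sketched in a few lines of Python. This is a generic hash chain, shown under assumptions: the `append` and `verify` helpers and the entry layout are invented for this example and are not taken from SIT, git, or OpenTimestamps. Each entry's hash covers its own data plus the hash of the entry before it, so the order is fixed and changing any earlier entry breaks verification.

```python
import hashlib
import json


def _digest(prev, data) -> str:
    """Hash the entry's data together with the previous entry's hash."""
    body = json.dumps({"prev": prev, "data": data}, sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest()


def append(chain: list, data: dict) -> dict:
    """Append an entry that commits to everything recorded before it."""
    prev = chain[-1]["hash"] if chain else None
    entry = {"prev": prev, "data": data, "hash": _digest(prev, data)}
    chain.append(entry)
    return entry


def verify(chain: list) -> bool:
    """Check that every link points at its predecessor and every hash matches."""
    prev = None
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(entry["prev"], entry["data"]):
            return False
        prev = entry["hash"]
    return True


if __name__ == "__main__":
    chain = []
    append(chain, {"invoice": 1, "cents": 120000})
    append(chain, {"invoice": 2, "cents": 80000})
    print(verify(chain))             # the untouched chain verifies
    chain[0]["data"]["cents"] = 1    # rewrite history
    print(verify(chain))             # the tampered chain does not
```

With this structure, the hash of the latest entry stands in for the whole history, so that single value is all that would need to be timestamped or anchored elsewhere, rather than the data itself.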
The way I defined this application is that basically it allows decentralization, and most importantly, it allows spontaneously connected parties to collaborate and share pieces of information they want to work on together.
The other important aspect of developing applications like this is that there are many conventions that need to be reconsidered. There's a notion of logging into systems, which establishes an institutional relationship: you have an account with the vendor. When we collaborate with people, we don't authenticate or identify ourselves. In a digital world, that's like PGP or other types of signatures identifying that this came from me. But if your application is running on your computer, you don't need to log in to anything other than your computer. You don't log in to someone's system by saying this is my account or my password or any other way to access it; this allows us to build systems that operate on computing power that is available at the edges of the network, where we already have a huge amount of computing power available. This is very different from the legacy systems we inherited from the 60s, when computers were big and expensive. But now they're not. This allows us to rethink how we build applications altogether.
The other question is obviously- is it worth the effort? It's complex and it requires changing the way we design applications and how we think about them. Right now it's hard to see why it is so useful. But if you start collecting the pieces of information we hear from different news sources, like data breaches, and the advances on the other end, with even space travel, it is worthwhile investing time in this. We might not be able to reap substantial benefits in 5-10 years, but in 10-30 years we will be able to control our information and control the software that processes it, and that's worth dedicating effort to this kind of project.