more-than-just-dna

Luis Ceze

Andrew Hessel

LC: I am a computer architect. I make computers more efficient and faster, and I enable new applications. I've been fascinated by biology. For the past 4 years I've been exploring ways to use molecular biology to build better computers, such as DNA data storage and information processing.

AH: Are you a biohacker?

LC: I don't know.

AH: I think we're all biohackers here. I am Andrew Hessel. I've had a bizarre path through science. I started in computers and quickly realized I don't care about computers. They are powerful tools. I decided I would start to hack biology and I switched my program into cell and molecular biology and genetics. I am a bottom-up guy. I want to understand how it works from the bottom. If I was top-down then I think I would have become a physician. I spent 7 years at Amgen, a major biopharmaceutical company. I realized how broken drug development is. I left Amgen and then went into the synthetic biology world and I had no aim without an institution and I was just watching this community appear and start to grow and flourish because it just seemed like watching the semiconductor industry boot up in the 1950s. It was incredibly interesting. I just finished a 6 year stint at Autodesk trying to get them to do CAD tools for biology. Recently I've been involved in a few projects including gp-write, which is the synthetic biology successor to the human genome project. We want to write a human genome and other large genomes. I have a DNA synthesizer tech project that we're looking for funding on and we have a small seed stage startup called Humane Genomics where we're making synthetic viruses to fight a type of dog cancer called osteosarcoma. I identify as a biohacker but I look digitally and I try to share everything I learn and make sure that all the resources are available for everyone etc.

LC: I find it interesting to think there's an incredible intersection between electronics and biology. It's designing systems to have biological components in them. Let me give you an example. CMOS is very fast. We are at 7nm now. You might have heard that announcement. We can do self-assembly of atomically precise structures.

JZ: Why do we want it smaller?

LC: Smaller is faster.

JZ: What's the theoretical limit?

LC: Some people say 3nm might work. Nobody is sure.

AH: Biology is ground up, built from molecular materials.

LC: Molecular assembly is any engineer's dream: you specify what you want to build and the structure comes out. We did a calculation once that looked at, if you were going to build a protein to do the work a transistor does, so, any ... and you project out the most efficient transistor, a protein is simply 10,000 times more energy efficient at doing the same job as a transistor. We need this to make AI more scary.

JZ: So will our cell phones have proteins in them?

AH: Did Drew Endy make it out here yesterday?

JZ: Yes.

AH: I asked Drew years ago about what he was working on. And he said he was trying to make cell phones. It's aspirational but I don't see any reason why you couldn't use synthetic biology to make a cell phone in the future. Imagine dropping out a virus and getting a cell phone in your head.

JZ: Or a shake-and-bake cell phone in your fridge.

LC: So you are working on gp-write and I'm working on DNA data storage to try to drive DNA synthesis costs down to zero and throughput to infinity. It's interesting to see how those two intersect. My naive view is that when you're talking about DNA data storage what you want to do is...

JZ: What is DNA data storage?

LC: Okay, most of you have heard of it. Okay. The idea is to use DNA molecules as a storage medium. DNA is very general. It can store information. The idea is to develop a way to map digital data to sequences and manufacture those...
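As a rough illustration of "mapping digital data to sequences", here is a minimal sketch that assumes the simplest possible scheme: two bits per base, with the arbitrary assignment A=00, C=01, G=10, T=11. The function names `encode` and `decode` are hypothetical. This is not the scheme LC's group uses; practical encodings also have to handle addressing, error correction, and sequences that are hard to synthesize or read.

```python
# Toy mapping between bytes and DNA bases (assumption: 2 bits per base,
# A=00, C=01, G=10, T=11). Illustrative only.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(seq: str) -> bytes:
    """Inverse of encode: every four bases become one byte."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

In this toy scheme the "manufacture" step LC mentions would be synthesizing the resulting string as a physical DNA strand, and reading the data back would be sequencing it and running `decode`.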

JZ: How many terabytes are stored in your body?

LC: It depends on what you mean.

JZ: If a human was a hard drive, how much? What is an exabyte?

LC: It's 10¹⁸ bytes. Kilobyte, megabyte, gigabyte, terabyte, petabyte, exabyte, zettabyte, yottabyte, hellabyte, ((and then brontobytes)). For this to work, DNA synthesis needs to be super cheap. It's different for genome synthesis vs DNA data storage.

JZ: Andrew, when are you going to synthesize a human genome?

AH: A little bit of backstory... I've been trying to write viral genomes for the past few years. It's been a little frustrating. Viruses have the smallest genomes. They are not really alive. They are kind of like USB sticks. They need a cell to actually operate. Viral genomes start at around 3000 bits of code and the largest virus ever is like 1.8 million nucleotides. Most of them are under 200k bits of code. I wanted to build synthetic viruses from scratch since I read the first paper in 2002. In 2007, I found a few groups that would synthesize a virus provided you didn't seem too scary. It took until 2014 when I synthesized the first virus personally, using resources that were available to me at Autodesk at the time. That was PhiX174. It was just over 5000 bits of code. That took several weeks to get synthesized.

JZ: When you mean bits you mean bases?

AH: Bases are bits, not bytes. Then we moved on to adenovirus which is 34,000 bits. That took 8 weeks to synthesize and assemble. I realized this is so incredibly frustrating and slow that I started arguing it was time for a new genome project. Essentially focus on synthesis tech so that we can lower the cost of doing it and set the bar high and get a tide that raises all ships. It's been a fascinating ride. It turns out that scientists don't work on core technologies for the most part. That's engineering. When we pulled scientists together to advance the gp-write project, they said look, we have to create big pilot projects that will be synthesis-heavy to create a pull for the tech to be created. When we get funded it's based on novel discoveries, not engineering novel equipment. It's been fascinating watching the dynamics there. The ultimate goal of gp-write is to develop the tools and technologies and standards and ethics around engineering large genomes and lowering the cost of doing this by at least a factor of 1000x. It will get a lot cheaper over time, but it will have to start somewhere. I want to create cheap virus genomes for gene therapy.
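To put the genome sizes AH quotes into storage terms, here is a small back-of-the-envelope sketch. It assumes each base is one of four letters and therefore carries at most log2(4) = 2 bits, so AH's "bits of code" are counted here as bases. The base counts are the rounded figures from the conversation, not exact genome lengths, and `capacity_bytes` is a hypothetical helper name.

```python
import math

# Assumption: four possible bases, so at most 2 bits of information per base.
BITS_PER_BASE = math.log2(4)


def capacity_bytes(n_bases: int) -> float:
    """Upper bound on the bytes a sequence of n_bases can represent."""
    return n_bases * BITS_PER_BASE / 8


# Rounded sizes as quoted in the discussion above.
quoted_sizes = {
    "smallest viral genomes": 3_000,
    "PhiX174": 5_000,
    "adenovirus": 34_000,
    "largest virus": 1_800_000,
}

if __name__ == "__main__":
    for name, n_bases in quoted_sizes.items():
        print(f"{name}: {n_bases} bases, about {capacity_bytes(n_bases):.0f} bytes")
```

By this estimate the adenovirus genome AH describes is roughly 8.5 kilobytes of raw sequence, which gives a sense of how small these genomes are in digital terms compared with how long they took to synthesize.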

LC: What is the size of these synthetic virus genomes?

AH: Ideally you want it as close as they can be to the design. But biology is a little flexible. There's error correction systems. Today I expect the next generation of DNA synthesizers...

JZ: How far away are we from synthesizing a human genome?

AH: You can't do that yet. You can't do that... it's cost. Making oligos is really cheap today and you can do it at scale.

LC: Not at scale.

AH: It's the assembly process of stringing oligos together into longer strands of DNA that is slow and costly and requires a lot of error checking. Viral genomes, we're getting faster. You can turn around a viral genome in 48 hours if you have the right equipment, but that's with a lot of invested capital. Doing it outsourced, it will take a few weeks.

LC: What about making it fully automated with an assembly process?

AH: I believe that... we make DNA chemically right now, which is not the way biology does it. We're starting to see the first few papers on enzymatic DNA synthesis methods. We haven't demonstrated enough control of the enzymatic process to get it to the bit level. I believe the intersection of electronic circuits and biological machinery is the most exciting area to do biohacking today. Electronics are cheap and accessible to everyone, to the point that even foundries are available to people today. The biology- if you're not trying to tweak and modify it, it's going to be interesting to start to do that integration. You can attach enzymes to chips, you're going to start seeing the ability to, and then we've been getting these cyborg chips like the Ion Torrent and nanopore.

LC: The fascinating thing to think about is designing large-scale systems with a dry component and a wet component. Suppose we make DNA data storage. You're going to have all your data and selfies and cat pics stored for a long time in a data center. But now you need a data center that has fluidics in it. One of the technologies I find interesting is digital microfluidics. The idea is electrowetting, moving droplets on a surface in an arbitrary way, and then you can treat them as carriers of biological information. You can scale that up, and what would the data center footprint look like? Could you make a Walmart sized building full of tiny picoliter droplets? That's the scale that we need to think about for doing computers with molecular components. It's the scale you need to think about for doing global scale population sequencing and sample handling/manipulation. What if you do this for 10 million or 10 billion people-- sample manipulation on this scale is just fascinating to think about.

JZ: I'm skeptical. Our cell phones heat up because of the components... you talk about putting something biological in there, like putting a DNA data storage...

LC: There's a lot of work on error correction codes in computer science.

AH: It would lower cancer rates.

LC: For sure.

JZ: Andrew, what would you add to the human genome?

AH: I want to see a 24th chromosome. I call this c24 tech. Rather than tweak and modify the 23 chromosomes, let's build chromosome 24. Let's make any chromosome-- make it easy to add to cells, particularly if you're doing IVF, and just keep adding code to that. Add any new features there and have enough switches and logic so that you can turn that on and off as you need. I think c24 is going to be big moving forward. In general, in the short term, I want to see the cost of this technology drop by 5-6 orders of magnitude.

JZ: What will drive down that?

AH: I think data storage will drive it down in the short term. Data storage and computing is already a $10-15 billion market just with magnetic tape. Being able to store high value data long-term will be the immediate driver in pushing the tech forward. You don't need long DNA, but it wouldn't hurt if you had it. I believe viruses and nanoparticles and metabolic engineering will be the big driver on the biological side. You don't have to go crazy. The human genome- we have 23 chromosomes and the smallest is 50 million bp. That's pretty tiny. Chromosome 1, the largest, is only 5x larger. Once we start getting megabase sized synthesis and assembly at reasonable price, it won't take long to complete large genomes and start moving through all the microbes and all the viruses and then plants and starting to build large genomes from scratch.

JZ: Building modular systems into the cell to modify it... instead of modifying the other stuff... That's interesting.

AH: CRISPR and off-target... you don't want to muck around in the genome. You can complement anything. There's no reason to have 23 chromosomes. There was a paper recently that took 16 chromosomes in yeast and it works fine to reduce it to one chromosome. Why we have 23 chromosomes is sort of arbitrary. Adding another chromosome is no big deal. Remember when you bought a PC and it came pre-loaded with all the demo stuff of the software available? I think that will be what it will be for babies in the future. If you want c24 it's free no cost, and then you get it.

LC: So to get night vision, you have to give your credit card number and get the right promoter?

JZ: It's interesting. Luis, I know you do computer science and computer security stuff. What do you think about hacking the human? Are people going to start hacking humans?

LC: Can I answer a different question? I think a lot about biosecurity but from a different angle. We were working on some form of malware that exploits DNA sequencers by using DNA as a backdoor to computer systems to carry out a cyberattack. It's interesting. This should be looked at. A lot of these tools have had no penetration testing at all. Also, imagine wetlabs and automated equipment... combine in new ways that might be harmful. Also, we have a lot of privacy issues in DNA synthesis and DNA sequencing today. There's some cross-talk and this could leak information. There's also the psychological terrorism of making someone think they have a genetic disease and they don't.

JZ: Oh wow that's really interesting. I am going to make Andrew talk about hacking c24.

AH: I want to speak on something that has been rising to the forefront of my consciousness for the last year. I did an interview with David Asprey on the Bulletproof podcast a week and a half ago. He asked me what's the most important biohacking thing coming down the pipeline. I shared with him that I wrote a piece for Neolife a few weeks ago about DNA sequencing. As we know, it cost $3 billion to get the first genome and we've had a $1000 genome now for about 18 months. It's dropping to free. It's just dropping. The price is falling and the value is growing exponentially. You get this economic flip where suddenly companies are going to start offering people free genome sequencing because the company will know that it can get more value out of your genome than the cost of sequencing. It's sequencing the complete genome- not exome or whatever. It's just a customer acquisition cost.

JZ: How much should we be asking for?

AH: Well we should be figuring out how to get more value from our genome. What value will we get back? Will we get legal protections, cheap drugs developed using our genomes, will we get royalties? It's the most interesting shift I've seen in genomics since the human genome was sequenced. Now we have essentially a marketplace and business model that could potentially see all of us being sequenced over the next 20 years.

JZ: Isn't this scary because of Facebook and privacy? Sure I'll share all my information. Will people be taken advantage of?

AH: That's the top-down model for Ancestry and 23andMe. But I mean the blockchain model. You will have full control over your genome, it will be a bottom-up market and you will get your fair share. You will get to choose who you want to share it with. It's a core change in the tech in genomics because it should drive sequencing down by another order of magnitude or more and that allows us to sequence the world around us and this translates into synthetic biology and biohacking because then we know how to go and hack with synthesis. It's going to get interesting.

LC: It's interesting to see the growth of these crowdsourced genetic databases. The growth of people sharing their genetic information-- it was 20,000 people a few years ago, and it's 13 million today.

Q: For gp-write's recoding project... where is that?

AH: At gp-write, the goal is to reduce the cost of full genome synthesis by at least a factor of 1000x. It's organized bottom-up. The first one was Congress funding it. Today there's a thousand people on the mailing list. There's 100 institutions. There's something like $500 million committed to the various institutions that are a part of this federation now. It's pretty impressive. I was up in Canada helping them get a seed started. The first core project that they have latched on to is making an ultra-safe human cell line. To do that, you remove some of the codons in the genetic code of the human, genome wide, and some of the transcription machinery behind that, so that essentially if a virus infects that cell, it can't fully transcribe its genome. It essentially makes the cells completely resistant to all viruses. You can add other features as well to improve its ability to do apoptosis for cancer. It's a really powerful-- this new cell that will be engineered, it doesn't require full genome synthesis, just genome-wide modification. It becomes a valuable asset for doing research. If you have ever worked with mammalian cells, they are prone to viral infections, and a viral infection in your lab or production facility can shut down your entire project and ruin your production lines. It's a valuable seed project to bring the community together. It was announced in May this year that this would be the first major project coming out of gp-write, but it's really just a first stepping stone.

JZ: So Luis...

LC: We can store pictures in DNA... diffusion can be extremely efficient for... so here's what we're doing. We get in all these images that people are submitting, and we're going to store them in DNA, but we will extract features from them in the same way that computer vision does feature extraction. Then we demonstrated that we can use hybridization as a measure of how similar images are. And then we are able to say, you can extract only the images that have ... bicycles in them... by using a piece of DNA that is a query that encodes the bicycle, putting it inside a test tube, shaking it, and the DNA comes attached and they map to photographs and things like that. The reason we're doing this is that if this really works at a large scale, it could be way way way more efficient than how we do approximate search today.
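The retrieval idea LC describes can be mimicked in software with a toy model. The sketch below assumes each stored image carries a short DNA "feature tag", that a query strand is the reverse complement of the feature being searched for, and that hybridization strength can be approximated by the fraction of positions that base-pair. All names, tags, and the 0.8 threshold are made up for illustration; real hybridization depends on thermodynamics, and the actual feature encoding used by LC's group is not described here.

```python
# Toy model of similarity search by hybridization (illustrative assumptions only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Strand that would pair perfectly with seq (antiparallel)."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridization_score(query: str, target: str) -> float:
    """Fraction of target positions that base-pair with the query strand."""
    if len(query) != len(target):
        raise ValueError("toy model assumes equal-length strands")
    # The query pairs with the target wherever its reverse complement
    # matches the target base for base.
    probe = reverse_complement(query)
    matches = sum(a == b for a, b in zip(probe, target))
    return matches / len(target)


def search(query: str, tags: dict, threshold: float) -> list:
    """Names of stored items whose tag binds the query strongly enough."""
    return [name for name, tag in tags.items()
            if hybridization_score(query, tag) >= threshold]


if __name__ == "__main__":
    # Hypothetical feature tags: two similar "bicycle" images and one cat image.
    tags = {
        "bike_photo_1": "ACGTACGTAC",
        "bike_photo_2": "ACGTACGTAA",
        "cat_photo": "TTGGCCAATT",
    }
    bicycle_query = reverse_complement("ACGTACGTAC")
    print(search(bicycle_query, tags, threshold=0.8))
    # ['bike_photo_1', 'bike_photo_2']
```

The point of the physical version is that the comparison loop disappears: every query molecule in the tube tests itself against every stored tag at once, which is where the efficiency LC mentions would come from.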

AH: So you're making bio-tags? I love that.

JZ: You were talking about energy-efficiency.

LC: Yes, we were talking about AI and superintelligent machines. Some people say it will be dangerous. I'm not so scared. I know that we can't afford the energy yet. Computers aren't energy efficient enough to make an AI system that is scary compared to human brains. We need a huge efficiency improvement in computers to build an intelligent system, in my opinion. I'm excited about molecular biology because we see a path to making computers far more efficient than they are today.

JZ: So maybe we can create a computer the size of the universe?

LC: I wouldn't want to think about the paperclips.

JZ: Any final questions?

fenn: Could you elaborate on... your sequencers are millions of dollars... and you talk about these blockchain sequencer startups. I don't see how to do that.

AH: Sequencing isn't expensive. To do a full human genome, you can go to Veritas and it's under $1k. That includes interpretation too, CLIA, it's all done to medical grade. If you just want to get human genomes, 30x coverage, non-CLIA, high-throughput, it's probably down to about $300.

fenn: But it's high-throughput? You need thousands of samples to get those prices.

AH: The tech is not stopping. It's continuing pretty fast. Right now, the idea of going and getting 100 million genomes into a database is fanciful because there's not enough throughput. Illumina basically owns this market. They are-- there's just not enough sequencers out there. As you get a business model that is kind of exothermic, every person that you're putting into your network actually increases the value of the network in a causative way and it should drive the picks and shovels development of sequencing technology. When I realized this period was going to occur, I don't know if you have tracked this, Illumina stock is up 60% over the past year. I bought Illumina stock. PacBio stock has also doubled. We're going to see across the board in bioinformatics and analysis and sequencing tech, these companies just-- selling everything they are producing.

LC: I think it's clear now that the cost of DNA sequencing itself... is basically zero. All of the value is in analyzing the DNA and data. How do we do protein sequencing? What is the Illumina of protein sequencing?

AH: I didn't realize that. I've been stuck in the world of DNA for too long. We're not running out of biology to hack.

JZ: Luis, are you getting bored of DNA?

LC: No, not at all.

JZ: Four bases... you want the 20 amino acids. I see it in your eyes.

LC: Maybe for storing data.

JZ: Thank you guys so much.