This is a wish list for protein folding and engineering. It contains some speculation and brainstorming, and shouldn't be considered completely viable for now.

# Wishlist

* Given a 3D shape (of some nanostructure), produce a protein's amino acid sequence that will consistently fold into that shape. (done as of 2023?)

* Control over protein functional properties, such as catalytic domains and sites, as well as the ability to design and control specific conformational changes.

* DNA data storage: faster polymerases

* Proteins that make molecular display techniques easier (simplifying lab bench protocols) -- like mRNA display and ribosome display; easier molecular display would be very valuable for projects using directed evolution techniques.

* Better protein-based **nanopores** for DNA sequencing, amino acid sequencing, and protein sensing.

* Human-controlled DNA polymerase synthesis activity (choose each nucleotide), or an instrumented ribosome to control protein production regardless of mRNA content

* Molecular **protein lego**: connect multiple legos together to build large-scale protein structures. This is generally useful for modeling and nanostructures. Binding could use DNA addresses or other high-affinity, ligand-specific techniques, giving a stable toolbox of known protein structures and shapes for building up larger structures from small parts.

* Protein **mechanical logic**: protein structures that have internal logic and state, based on mechanical motion or other catalytic reactions and interactions.

* Generalized, **fully-programmable molecular nanotechnology**: programmable nanomachines and nanofactories that can produce other nanostructures to exact specifications, without uncertainty regarding protein folding.
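
To make the DNA data storage and "choose each nucleotide" items above a little more concrete, here is a minimal sketch of what the input to a human-controlled polymerase might look like: arbitrary bytes turned into a per-nucleotide write list. The 2-bits-per-base mapping (`A`=00, `C`=01, `G`=10, `T`=11) and the function names are assumptions made for illustration, not an established encoding; a practical scheme would presumably also need error correction and constraints such as avoiding long homopolymer runs.

```python
# Hypothetical mapping: 2 bits per base, index in this string = bit pair value.
# This is an illustrative assumption, not a published DNA storage code.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Decode a sequence produced by bytes_to_dna back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Calling `bytes_to_dna(b"hi")` gives the nucleotide string to synthesize, and `dna_to_bytes` recovers the original bytes after sequencing, assuming an error-free read.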

# TODO

* What were those long-tube protein molecular-chemistry factories called? (non-ribosomal peptide synthetases or NRPS). They are apparently natural, and they have multiple points of interest inside the tube that modify a molecule as it progresses along the protein.

# Other interesting targets

* gene editing proteins (see [[gene-editing]])
* enzymes for DNA synthesis
* molecular recording (like in vivo DNA-based recording devices, for debugging or otherwise, lineage tracing techniques, "of toasters and molecular ticker tapes")
* protein binding affinity ([protein-protein interaction](https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction))
* catalytic activity, enhancement of catalysis or reduction of catalysis
* synthetic metabolisms
* biosensors

# Structural protein design with machine learning

Well, it's probably time to update this page... lots of recent <a href="https://twitter.com/kanzure/status/1606364590740107265">progress</a> in machine learning for protein design.

* <a href="https://www.nature.com/articles/s41586-021-03819-2">AlphaFold2: Highly accurate protein structure prediction with AlphaFold</a>
* <a href="https://www.science.org/doi/10.1126/science.abj8754">RoseTTAFold: Accurate prediction of protein structures and interactions using a three-track neural network</a>
* <a href="https://www.biorxiv.org/content/10.1101/2022.12.09.519842v2">RFdiffusion: Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models</a>
* <a href="https://stephanheijl.com/rfdiffusion.html">A new protein design era with protein diffusion</a>
* <a href="https://www.biorxiv.org/content/10.1101/2022.12.21.521526v1">A high-level programming language for generative protein design</a>
* <a href="https://www.biorxiv.org/content/10.1101/2022.12.15.519894v1">Codon language embeddings provide strong signals for protein engineering</a>
* <a href="https://github.com/aqlaboratory/openfold">openfold</a> (<a href="https://www.biorxiv.org/content/10.1101/2022.11.20.517210v2">ref</a>)
* <a href="https://www.biorxiv.org/content/10.1101/2022.12.10.519862v4">De novo design of high-affinity protein binders to bioactive helical peptides</a>
* <a href="https://www.biorxiv.org/content/10.1101/2022.12.01.518682v1.full">Illuminating protein space with a programmable generative model</a>

# References

See <https://diyhpl.us/~bryan/papers2/bio/protein-engineering/>

* <https://en.wikipedia.org/wiki/Protein_design>
* <https://en.wikipedia.org/wiki/Protein_engineering>
* <https://en.wikipedia.org/wiki/Protein_folding>
* <https://en.wikipedia.org/wiki/Protein_structure_prediction_software>