This is a wish list for protein folding and engineering. It contains some speculation and brainstorming, and not everything here should be considered viable for now.
# Wishlist
* Given a 3D shape (of some nanostructure), produce an amino acid sequence that will consistently fold into that shape. (done as of 2023?)
* Control over protein functional properties, such as catalytic domains and sites, as well as designing and controlling specific conformational changes.
* DNA data storage: faster polymerases
* Proteins that make molecular display techniques easier (simplifying lab bench protocols) -- like mRNA display and ribosome display; easier molecular display would be very valuable for projects using directed evolution techniques.
* Better protein-based **nanopores** for DNA sequencing, amino acid sequencing, and protein sensing.
* Human-controlled DNA polymerase synthesis activity (choose each nucleotide), or an instrumented ribosome to control protein production regardless of mRNA content
* Molecular **protein lego**: connect multiple bricks together to build large-scale protein structures. This is generally useful for modeling and nanostructures. Bricks could bind by DNA addresses or other high-affinity, ligand-specific techniques, giving a stable toolbox of known protein structures and shapes for building up larger structures from small parts.
* Protein **mechanical logic**: protein structures that have internal logic and state, based on mechanical motion, catalytic reactions, or other interactions.
* Generalized, **fully-programmable molecular nanotechnology**: programmable nanomachines and nanofactories that can produce other nanostructures to exact specifications, without uncertainty regarding protein folding.
# TODO
* What were those long-tube protein molecular-chemistry factories called? (non-ribosomal peptide synthetases or NRPS). They are apparently natural, and they have multiple points of interest inside the tube that modify a molecule as it progresses along the protein.
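The "molecule modified as it progresses along the protein" idea can be pictured as an assembly line of stations. Below is a toy sketch only, not a model of any real synthetase: the `Module` class, the `epimerize` helper, and the three-station line are all hypothetical names and data made up for illustration. It assumes each station adds one monomer and may optionally modify it before passing the chain on.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Module:
    """One hypothetical assembly-line station (illustrative, not a real enzyme)."""
    substrate: str                                   # monomer this station adds
    tailor: Optional[Callable[[str], str]] = None    # optional modification step


def epimerize(monomer: str) -> str:
    """Hypothetical tailoring step: mark the monomer as the D-form."""
    return "D-" + monomer


def run_assembly_line(modules: List[Module]) -> List[str]:
    """Pass a growing chain through each station in order."""
    chain: List[str] = []
    for module in modules:
        monomer = module.substrate
        if module.tailor is not None:
            monomer = module.tailor(monomer)  # modify before extending the chain
        chain.append(monomer)
    return chain


# Made-up three-station line, for illustration only.
line = [Module("Phe", epimerize), Module("Pro"), Module("Val")]
product = run_assembly_line(line)
print("-".join(product))
```

The point of the sketch is that the product is determined by the order and contents of the stations, which is what makes this kind of enzyme interesting as a reprogrammable factory: swapping or reordering entries in `line` changes the output chain.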
# Other interesting targets
* gene editing proteins (see [[gene-editing]])
* enzymes for DNA synthesis
* molecular recording (like in vivo DNA-based recording devices, for debugging or otherwise, lineage tracing techniques, "of toasters and molecular ticker tapes")
* protein binding affinity stuff ([protein-protein interaction](https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction))
* catalytic activity, enhancement of catalysis or reduction of catalysis
* synthetic metabolisms
* biosensors
# Structural protein design with machine learning
Well, it's probably time to update this page... lots of recent progress in machine learning for protein design.
* AlphaFold2: Highly accurate protein structure prediction with AlphaFold
* RoseTTAFold: Accurate prediction of protein structures and interactions using a three-track neural network
* RFdiffusion: Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models
* A new protein design era with protein diffusion
* A high-level programming language for generative protein design
* Codon language embeddings provide strong signals for protein engineering
* openfold (ref)
* De novo design of high-affinity protein binders to bioactive helical peptides
* Illuminating protein space with a programmable generative model