This is a wish list for protein folding and engineering. It contains some speculation and brainstorming, and shouldn't be considered completely viable for now.

# Wishlist

* Given a 3D shape (of some nanostructure), produce a protein's amino acid sequence that will consistently fold into that shape. (done as of 2023?)
* Control over protein functional properties, such as catalytic domains and sites, as well as the ability to design specific conformational changes and to control them.
* DNA data storage: faster polymerases
* Proteins that make molecular display techniques (such as mRNA display and ribosome display) easier by simplifying lab bench protocols; easier molecular display would be very valuable for projects using directed evolution techniques.
* Better protein-based **nanopores** for DNA sequencing, amino acid sequencing, and protein sensing.
* Human-controlled DNA polymerase synthesis activity (choose each nucleotide), or an instrumented ribosome to control protein production regardless of mRNA content.
* Molecular **protein lego**: connect multiple legos together to build large-scale protein structures. This is generally useful for modeling and nanostructures. Binding could be by DNA addresses or other high-affinity, ligand-specific techniques, giving a stable toolbox of known protein structures and shapes for building up larger structures from small parts.
* Protein **mechanical logic**: protein structures that have internal logic and state, based on mechanical motion or other catalytic reactions and interactions.
* Generalized, **fully-programmable molecular nanotechnology**: programmable nanomachines and nanofactories that can produce other nanostructures to exact specifications, without uncertainty regarding protein folding.

# TODO

* What were those long-tube protein molecular-chemistry factories called? (non-ribosomal peptide synthetases or NRPS).
They are apparently natural, and they have multiple points of interest inside the tube that modify a molecule as it progresses along the protein.

# Other interesting targets

* gene editing proteins (see [[gene-editing]])
* enzymes for DNA synthesis
* molecular recording (like in vivo DNA-based recording devices, for debugging or otherwise, lineage tracing techniques, "of toasters and molecular ticker tapes")
* protein binding affinity stuff ([protein-protein interaction](https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction))
* catalytic activity, enhancement of catalysis or reduction of catalysis
* synthetic metabolisms
* biosensors

# Structural protein design with machine learning

Well, it's probably time to update this page... lots of recent progress in machine learning for protein design.

* AlphaFold2: Highly accurate protein structure prediction with AlphaFold
* RoseTTAFold: Accurate prediction of protein structures and interactions using a three-track neural network
* RFdiffusion: Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models
* A new protein design era with protein diffusion
* A high-level programming language for generative protein design
* Codon language embeddings provide strong signals for protein engineering
* openfold (ref)
* De novo design of high-affinity protein binders to bioactive helical peptides
* Illuminating protein space with a programmable generative model