Unified rational protein engineering with sequence-only deep representation learning
Abstract
Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabelled amino acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily, and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach reaches near state-of-the-art or superior performance in predicting the stability of natural and de novo designed proteins, as well as the quantitative function of molecularly diverse mutants. UniRep further enables two orders of magnitude cost savings in a protein engineering task. We conclude that UniRep is a versatile protein summary that can be applied across protein engineering informatics.