RITA: a Study on Scaling Up Generative Protein Sequence Models
arXiv (Cornell University), 2022
Abstract
In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database. Such generative models hold the promise of greatly accelerating protein design. We conduct the first systematic study of how capabilities evolve with model size for autoregressive transformers in the protein domain: we evaluate RITA models in next amino acid prediction, zero-shot fitness, and enzyme function prediction, showing benefits from increased scale. We release the RITA models openly, to the benefit of the research community.
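The abstract names next amino acid prediction and zero-shot fitness as evaluation tasks but does not spell out how they are scored. As a rough illustration only, the sketch below assumes the usual likelihood-based approach for autoregressive models: perplexity for next amino acid prediction, and the log-likelihood difference between a mutant and its wild type as a zero-shot fitness proxy. The bigram model, the `START` symbol, and all function names are hypothetical stand-ins, not RITA's architecture or API.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
START = "^"  # hypothetical start-of-sequence symbol


def train_bigram(sequences, alpha=1.0):
    """Fit P(next residue | previous residue) with add-alpha smoothing.

    This is a toy stand-in for an autoregressive transformer: it only
    conditions on one previous residue instead of the full prefix.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        prev = START
        for aa in seq:
            counts[prev][aa] += 1.0
            prev = aa

    def prob(prev, aa):
        total = sum(counts[prev].values()) + alpha * len(AMINO_ACIDS)
        return (counts[prev][aa] + alpha) / total

    return prob


def log_likelihood(prob, seq):
    """Sum of log P(residue | previous residue), read left to right."""
    prev, total = START, 0.0
    for aa in seq:
        total += math.log(prob(prev, aa))
        prev = aa
    return total


def perplexity(prob, seq):
    """exp of the average negative log-likelihood per residue."""
    return math.exp(-log_likelihood(prob, seq) / len(seq))


def zero_shot_score(prob, wild_type, mutant):
    """Assumed fitness proxy: log-likelihood of mutant minus wild type."""
    return log_likelihood(prob, mutant) - log_likelihood(prob, wild_type)
```

A model that guesses uniformly over the 20 residues has perplexity 20, so lower values indicate better next amino acid prediction; under the assumed proxy, a negative zero-shot score means the model finds the mutant less plausible than the wild type.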