Repertoire-Based Diagnostics Using Statistical Biophysics
Abstract
A fundamental challenge in immunology is diagnostic classification based on repertoire sequence. We used the principle of maximum entropy (MaxEnt) to build compact representations of antibody (IgH) and T-cell receptor (TCRβ) CDR3 repertoires based on the statistical biophysical patterns latent in the frequency and ordering of repertoires' constituent amino acids. This approach results in substantial advantages in quality, dimensionality, and training speed compared to MaxEnt models based solely on the standard 20-letter amino-acid alphabet. Descriptor-based models learn patterns that pure amino-acid-based models cannot. We demonstrate the utility of descriptor models by successfully classifying influenza vaccination status (AUC = 0.97, p = 4×10⁻³), requiring only 31 samples from 14 individuals. Descriptor-based MaxEnt modeling is a powerful new method for dissecting, encoding, and classifying complex repertoires.
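To make the descriptor idea concrete, here is a minimal sketch of a single-constraint MaxEnt fit. It is not the authors' model: the abstract does not say which descriptors were used, so the Kyte-Doolittle hydropathy scale is assumed purely for illustration, and the sequences and function names (`toy_cdr3`, `maxent_distribution`, `mean_descriptor`) are hypothetical. The sketch finds the maximum-entropy distribution over the 20 amino acids whose mean descriptor value matches the mean observed in a set of CDR3 sequences. That distribution has the form p(a) ∝ exp(λ·h(a)), with a single fitted parameter λ rather than one parameter per amino acid.

```python
import math

# Kyte-Doolittle hydropathy scale, used here only as an example descriptor
# (assumption: the abstract does not name the descriptors actually used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def boltzmann(lam, descriptor=KD):
    """Return p(a) proportional to exp(lam * descriptor[a])."""
    shift = max(lam * v for v in descriptor.values())  # numerical stability
    weights = {a: math.exp(lam * v - shift) for a, v in descriptor.items()}
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}


def expected_value(p, descriptor=KD):
    """Mean descriptor value under distribution p."""
    return sum(p[a] * descriptor[a] for a in descriptor)


def maxent_distribution(target_mean, descriptor=KD, lo=-10.0, hi=10.0):
    """MaxEnt distribution whose mean descriptor equals target_mean.

    The model mean increases monotonically with lam, so bisection suffices.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if expected_value(boltzmann(mid, descriptor), descriptor) < target_mean:
            lo = mid
        else:
            hi = mid
    return boltzmann((lo + hi) / 2, descriptor)


def mean_descriptor(seqs, descriptor=KD):
    """Average descriptor value over every residue in every sequence."""
    values = [descriptor[a] for s in seqs for a in s]
    return sum(values) / len(values)


# Hypothetical toy CDR3 sequences, not data from the paper.
toy_cdr3 = ["CASSLGQGYEQYF", "CARDYYGSSYFDYW"]
target = mean_descriptor(toy_cdr3)
p = maxent_distribution(target)
```

A repertoire-level model in the spirit of the abstract would presumably constrain many descriptor statistics at once, including ones that depend on residue ordering. This one-parameter case only shows how a descriptor constraint yields a more compact model than a free 20-letter frequency table.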