GEESE: Metabolically driven latent space learning for gene expression data
Abstract
Gene expression microarrays provide a characterisation of the transcriptional activity of a particular biological sample. Their high dimensionality hampers the process of pattern recognition and extraction. Several approaches have been proposed for gleaning information about the hidden structure of the data. Among these approaches, deep generative models provide a powerful way of approximating the manifold on which the data reside. Here we develop GEESE, a deep-learning-based framework that provides novel insight into manifold learning for gene expression data, employing a metabolic model to constrain the learned representation. We evaluated the proposed framework, showing its ability to capture biologically relevant features and to encode those features in a much simpler latent space. We showed how using a metabolic model to drive the autoencoder learning process helps in achieving better generalisation to unseen data. GEESE provides a novel perspective on the problem of unsupervised learning for biological data. Availability: Source code of GEESE is available at https://bitbucket.org/mbarsacchi/geese/ .