Reference-free resolution of long-read metagenomic data
Abstract
Background: Read binning is a key step in the accurate analysis of metagenomics data. Typically, it is performed by comparing metagenomic reads to known microbial sequences. However, microbial communities usually contain mixtures of hundreds to thousands of unknown bacteria, which restricts the accuracy and completeness of alignment-based approaches. Reference-free deconvolution of environmental sequencing data could benefit the field of metagenomics by contributing to the estimation of metagenome complexity, improving metagenome assembly, and enabling the investigation of new bacterial species that are not visible using standard laboratory or alignment-based bioinformatics techniques.

Results: Here, we apply an alignment-free method that leverages k-mer frequencies to classify reads within a single long-read metagenomic dataset. In addition to a series of simulated metagenomic datasets, we generated sequencing data from a bioreactor microbiome using the PacBio RSII single-molecule real-time sequencing platform. We show that distances obtained by comparing k-mer profiles can reveal relationships between reads within a single metagenome, leading to clustering by species.

Conclusions: In this study, we demonstrated that substructures within a single metagenome can be detected using only the information derived from the sequencing reads. These results establish a principle that might expand the toolkit for the detection and investigation of previously unknown microorganisms.
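The idea of comparing k-mer profiles between reads can be illustrated with a minimal sketch. The abstract does not state the k-mer length or the distance measure, so k = 4 and Euclidean distance are assumptions made here for illustration only, and the function names `kmer_profile` and `profile_distance` are hypothetical. The sketch shows one way such profiles and distances could be computed; it is not the method used in the study.

```python
from itertools import product
from math import sqrt


def kmer_profile(read, k=4):
    """Return normalized k-mer frequencies for one read.

    The profile has 4**k entries in a fixed lexicographic k-mer order,
    so profiles of different reads are directly comparable.
    k=4 is an assumed value, not one taken from the abstract.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    read = read.upper()
    total = 0
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def profile_distance(p, q):
    """Euclidean distance between two profiles (an assumed choice of metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Toy sequences standing in for long reads; not real data.
    reads = {
        "read_1": "ATATTTAATAATTATA" * 5,
        "read_2": "TTTAATAATTATAATA" * 5,
        "read_3": "GCGCGGCCGCGGGCCG" * 5,
    }
    profiles = {name: kmer_profile(seq) for name, seq in reads.items()}
    names = sorted(profiles)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(a, b, round(profile_distance(profiles[a], profiles[b]), 4))
```

In this toy setting, reads with similar composition yield a small distance and reads with dissimilar composition yield a large one. A pairwise distance matrix of this kind could then be passed to a clustering or dimensionality-reduction step of the reader's choice to look for groups of reads.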