Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering
Citations Over Time: Top 10% of 2015 papers
Abstract
Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect only non-overlapping communities. The communities they detect may also be unstable and difficult to replicate, because these methods are sensitive to noise and parameter settings. These shortcomings limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy, which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging.
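To make the combination of local and global information concrete, the following is a minimal Python sketch of one possible label-update rule and one possible stability measure. It is an illustration under stated assumptions, not the SpeakEasy implementation: the abstract does not specify the scoring rule or how stability is computed, so the "observed minus expected" score, the co-clustering fraction over repeated runs, and the names `pick_label` and `co_clustering_fraction` are all hypothetical.

```python
from collections import Counter


def pick_label(node, adjacency, labels):
    """Pick the neighbour label most over-represented around `node`.

    ASSUMPTION: "observed minus expected" scoring is an illustrative
    choice, not a rule taken from the abstract.
    adjacency: dict mapping node -> set of neighbouring nodes.
    labels:    dict mapping node -> current community label.
    """
    neighbours = adjacency[node]
    if not neighbours:
        return labels[node]  # an isolated node keeps its label

    n_total = len(labels)
    global_freq = Counter(labels.values())               # top-down view
    local_freq = Counter(labels[u] for u in neighbours)  # bottom-up view

    def surplus(label):
        # Count we would expect among the neighbours if labels were
        # spread around the node at their network-wide frequency.
        expected = len(neighbours) * global_freq[label] / n_total
        return local_freq[label] - expected

    # sorted() makes ties resolve deterministically (smallest label wins).
    return max(sorted(local_freq), key=surplus)


def co_clustering_fraction(partitions, a, b):
    """Fraction of runs in which nodes a and b share a community.

    ASSUMPTION: one simple way to express stability across repeated runs.
    """
    together = sum(1 for p in partitions if p[a] == p[b])
    return together / len(partitions)


# Toy network: label "A" is common (10 nodes), label "B" is rare (2 nodes).
labels = {f"a{i}": "A" for i in range(1, 11)}
labels.update({"b1": "B", "b2": "B", "x": "X"})
adjacency = {v: set() for v in labels}
for u in ("a1", "a2", "b1", "b2"):
    adjacency["x"].add(u)
    adjacency[u].add("x")

chosen = pick_label("x", adjacency, labels)

# Three hypothetical clustering runs over nodes u, v, w.
runs = [
    {"u": 0, "v": 0, "w": 1},
    {"u": 1, "v": 1, "w": 1},
    {"u": 0, "v": 0, "w": 1},
]
uv = co_clustering_fraction(runs, "u", "v")
uw = co_clustering_fraction(runs, "u", "w")
```

In the toy network, node `x` has two neighbours labelled "A" and two labelled "B", so a purely local vote is tied. Because "B" is rare across the whole network, its local presence is more surprising, and the global correction selects "B". Pairs of nodes that co-cluster in every run (here `u` and `v`) can be treated as a stable core, whereas a pair that co-clusters in only some runs (here `u` and `w`) involves a node that is a candidate for membership in more than one community.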