Robust Population Structure Inference and Correction in the Presence of Known or Cryptic Relatedness
Abstract
Population structure inference with genetic data has been motivated by a variety of applications in population genetics and genetic association studies. Several approaches have been proposed for the identification of genetic ancestry differences in samples where study participants are assumed to be unrelated, including principal components analysis (PCA), multi-dimensional scaling (MDS), and model-based methods for proportional ancestry estimation. Many genetic studies, however, include individuals with some degree of relatedness, and existing methods for inferring genetic ancestry fail in related samples. We present a method, PC-AiR, for robust population structure inference in the presence of known or cryptic relatedness. PC-AiR uses genome-screen data and an efficient algorithm to identify a diverse subset of unrelated individuals that is representative of all ancestries in the sample. The method performs PCA directly on this ancestry-representative subset and then predicts components of variation for all remaining individuals based on genetic similarities. In simulation studies and in applications to real data from Phase III of the HapMap Project, we demonstrate that PC-AiR provides a substantial improvement over existing approaches for population structure inference in related samples. We also demonstrate significant efficiency gains: a single axis of variation from PC-AiR provides better prediction of ancestry in a variety of structure settings than ten (or more) components of variation from widely used PCA and MDS approaches. Finally, we illustrate that PC-AiR can provide improved population stratification correction over existing methods in genetic association studies with population structure and relatedness.
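The two-stage idea described in the abstract (select an unrelated subset, run PCA on it, then predict components for everyone else) can be illustrated with a minimal Python sketch. This is not the PC-AiR implementation: the function names, the greedy pruning rule, the kinship threshold of 0.025, and the use of SNP loadings to project the remaining individuals are all assumptions made for illustration, and the abstract does not specify the actual selection algorithm or prediction formula.

```python
import numpy as np

def greedy_unrelated_subset(kinship, threshold=0.025):
    """Hypothetical helper: drop the individual with the most relatives
    until no remaining pair has kinship above `threshold` (assumed value)."""
    K = np.asarray(kinship, dtype=float)
    related = K > threshold
    np.fill_diagonal(related, False)  # an individual is not its own relative
    keep = np.ones(len(K), dtype=bool)
    while True:
        # count relatives among individuals that are still retained
        counts = (related & keep[None, :] & keep[:, None]).sum(axis=1)
        if counts.max() == 0:
            break
        keep[np.argmax(counts)] = False
    return np.flatnonzero(keep)

def pca_with_projection(genotypes, unrelated_idx, n_components=2):
    """PCA on the unrelated subset, then projection of all individuals.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    Returns an (n_individuals, n_components) array of scores.
    """
    G = np.asarray(genotypes, dtype=float)
    # allele frequencies are estimated from the unrelated subset only
    p = G[unrelated_idx].mean(axis=0) / 2.0
    poly = (p > 0) & (p < 1)  # keep SNPs that vary in the subset
    Z = (G[:, poly] - 2 * p[poly]) / np.sqrt(2 * p[poly] * (1 - p[poly]))
    # SVD of the standardized reference genotypes gives the SNP loadings
    _, _, Vt = np.linalg.svd(Z[unrelated_idx], full_matrices=False)
    loadings = Vt[:n_components].T
    # reference scores equal U * S; other individuals are projected
    return Z @ loadings
```

In this sketch, relatives never influence the axes of variation because both the allele frequencies and the loadings come from the unrelated subset alone; the related individuals only receive predicted coordinates.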