Estimating sampling biases in citizen science datasets
Abstract
The rise of citizen science (also called community science) has led to vast quantities of species observation data collected by members of the public. Citizen science data tend to be unevenly distributed across space and time, but the treatment of sampling bias varies between studies, and interactions between different biases are often overlooked. We present a method for conceptualizing and estimating spatial and temporal sampling biases, and interactions between them. We use this method to estimate sampling biases in an example ornithological citizen science dataset from eBird in Brisbane City, Australia. We then explore the effects of these sampling biases on subsequent model inference of population trends, using both a simulation study and an application of the same trend models to the Brisbane eBird dataset. We find varying levels of sampling bias in the Brisbane eBird dataset across temporal and spatial scales, and evidence for interactions between biases. Several of the sampling biases we identified differ from those described in the literature for other datasets, with protected areas being undersampled in the city, and only limited seasonal sampling bias. We demonstrate variable performance of trend models under different sampling bias scenarios, with more complex biases being associated with typically poorer trend estimates. Sampling biases are important to consider when analysing ecological datasets, and analysts can use this method to ensure that any biologically relevant sampling biases are detected and given due consideration during analysis. With appropriate model specification, the effects of sampling biases can be reduced to yield reliable information about biodiversity.
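As a rough illustration of why interactions between spatial and temporal sampling biases matter for trend inference, the following minimal sketch works with expected reporting rates in two hypothetical habitat strata. It is not the method or the trend models used in the study. All names and numbers (`DETECTION`, `AREA_SHARE`, `park_effort_share`, the two strata) are assumptions chosen for illustration. The species has no true trend, but the share of checklists submitted from parks drifts upward over time, so a pooled reporting rate shows a spurious increase, while weighting strata by area rather than by effort does not.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical values for illustration only; none come from the study.
DETECTION = {"park": 0.6, "urban": 0.2}   # per-checklist detection probability, constant over time
AREA_SHARE = {"park": 0.3, "urban": 0.7}  # fixed share of each stratum in the study area
YEARS = list(range(10))


def park_effort_share(year):
    """Share of checklists from parks, drifting upward by 0.05 per year (assumed)."""
    return 0.2 + 0.05 * year


def naive_rate(year):
    """Expected reporting rate when all checklists are pooled, ignoring where they came from."""
    p = park_effort_share(year)
    return p * DETECTION["park"] + (1 - p) * DETECTION["urban"]


def stratified_rate(year):
    """Expected reporting rate with strata weighted by area instead of by effort.

    Detection is constant within each stratum here, so the result does not depend on year.
    """
    return sum(AREA_SHARE[h] * DETECTION[h] for h in DETECTION)


naive_slope = ols_slope(YEARS, [naive_rate(y) for y in YEARS])
stratified_slope = ols_slope(YEARS, [stratified_rate(y) for y in YEARS])
print(f"naive slope: {naive_slope:.3f}, stratified slope: {abs(stratified_slope):.3f}")
```

In this toy setting the pooled series rises by about 0.02 per year purely because effort shifts towards the stratum with higher detection, whereas the area-weighted series is flat. Real datasets would also involve sampling noise and imperfectly known strata, which this sketch leaves out.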