Observational Data Patterns for Time Series Data Quality Assessment
Top 11% of 2014 papers
Abstract
Observational data are fundamental for scientific research in almost any domain. Recent advances in sensor and data management technologies are enabling unprecedented amounts of observational data to be collected and analyzed. However, an essential part of using observational data is not currently as scalable as data collection and analysis methods: data quality assurance and control. While specialized tools for very narrow domains do exist, general methods are harder to create. This paper explores the identification of data issues that lead to the creation of data tests and tools to perform data quality control activities. Developing this identification step in a systematic manner allows for better and more general quality control tools. As our case study, we use carbon, water, and energy fluxes as well as micro-meteorological data collected at field sites that are part of FLUXNET, a network of over 400 ecosystem-level monitoring stations. We are performing quality control on these data in preparation for the release of a new global data set of fluxes. The experience from this work led to the creation of a catalog of issues identified in the data. This paper presents this catalog and its generalization into a set of patterns of data quality issues that can be detected in observational data.