What Does Anonymization Mean? DataSHIELD and the Need for Consensus on Anonymization Terminology
Abstract
Anonymization is a recognized process by which identifiers can be removed from identifiable data to protect an individual's confidentiality, and it is used as standard practice when sharing data in biomedical research. However, a plethora of terms, such as coding, pseudonymization, unlinked, and deidentified, have been and continue to be used, leading to confusion and uncertainty. This article shows that this is a historic problem and argues that continuing uncertainty about the levels of protection given to data risks damaging initiatives designed to assist researchers conducting cross-national studies and sharing data internationally. DataSHIELD and the creation of a legal template are used as examples of initiatives that rely on anonymization, but where inconsistency in terminology could hinder progress. More broadly, this article argues that relying on vague notions of the anonymization process creates a real possibility of damage to the public's trust in research and the institutions that carry it out. Research participants who lack a clear understanding of the research process compensate by trusting those carrying out the research, and that trust may be damaged if the level of protection given to their data does not match their expectations. One step toward ensuring understanding between parties would be consistent international use of clearly defined terminology, so that all those involved are clear on the level of identifiability of any particular set of data and, therefore, how those data can be accessed and shared.