Value, but high costs in post-deposition data curation
Abstract
Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third-party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena.