Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics
Scientific Data, 2018, Vol. 5(1), pp. 180205–180205
Robert Forkel, Johann‐Mattis List, Simon J. Greenhill, Christoph Rzymski, Sebastian Bank, Michael Cysouw, Harald Hammarström, Martin Haspelmath, Gereon A. Kaiping, Russell D. Gray
Abstract
The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices.