Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis
Abstract
Recent studies have shown that pre-trained contextual word embeddings, which assign the same word different vectors in different contexts, improve performance in many tasks. But while contextual embeddings can also be trained at the character level, the effectiveness of such embeddings has not been studied. We derive character-level contextual embeddings from Flair (Akbik et al., 2018), and apply them to a time normalization task, yielding major performance improvements over the previous state-of-the-art: 51% error reduction in news and 33% in clinical notes. We analyze the sources of these improvements, and find that pre-trained contextual character embeddings are more robust to term variations, infrequent terms, and cross-domain changes. We also quantify the size of context that pre-trained contextual character embeddings take advantage of, and show that such embeddings capture features like part-of-speech and capitalization.
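The character-level contextual embeddings described above can be obtained with the open-source flair library (Akbik et al., 2018). The sketch below is a minimal illustration, not the paper's exact pipeline: it assumes access to the character language model underlying `FlairEmbeddings` through the library-internal `.lm` attribute, so the exact calls may vary across flair versions. It reads out one contextual vector per character of the input string.

```python
import torch
from flair.embeddings import FlairEmbeddings

# Load the pre-trained forward character language model.
# FlairEmbeddings normally pools these states into word embeddings;
# here we read the per-character hidden states directly from the
# underlying LM (.lm is a library internal, so treat this as an
# assumption that may change across flair versions).
lm = FlairEmbeddings("news-forward").lm

text = "The patient was seen on January 3rd, 2019."

# Map each character to its index in the LM's character vocabulary.
char_ids = torch.tensor(
    [lm.dictionary.get_idx_for_item(c) for c in text]
).unsqueeze(1)  # shape: (sequence_length, batch_size=1)

with torch.no_grad():
    hidden = lm.init_hidden(1)
    _, rnn_output, _ = lm.forward(char_ids, hidden)

# rnn_output holds one contextual embedding per character of the input.
print(rnn_output.shape)  # (len(text), 1, hidden_size)
```

The forward model alone sees only left context; Flair also ships backward character LMs (e.g. "news-backward"), which can be run analogously over the reversed string and concatenated with the forward states to give each character both left and right context.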