Grammatical Error Detection with Self Attention by Pairwise Training
Abstract
Automatic grammatical error detection systems help language learners identify whether the texts they write contain errors. Research has explored a variety of models for this task, and the proposed approaches have achieved better results than rule-based methods. It is known that artificially generated incorrect texts can further improve the performance of grammatical error correction, and that pairwise training is essential for many recommendation algorithms. In this paper, we combine these two techniques to solve the error detection task, using pre-trained word embeddings from BERT. To our knowledge, this is the first work to adopt pairwise training with pairs of samples for grammatical error detection; all previous work trained models pointwise on batches of samples. Pairwise training helps the model capture the differences within a pair of samples, which is intuitively useful for distinguishing errors. Extensive experiments were carried out to prove the effectiveness of the pairwise training mechanism. The experimental results show that the proposed method achieves state-of-the-art performance on four standard benchmarks. With the help of data augmentation and filtering, the F<sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">0.5</sub> score can be further improved. The overall improvement across the four test sets is around 2.5%, which demonstrates the generality of pairwise training for datasets from different domains.
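The pairwise training idea described above can be sketched with a minimal margin-based loss. This is an illustration only, not the paper's actual model: `error_score` is a hypothetical stand-in for a BERT-based scorer (here just a toy lexicon lookup), and the hinge margin is an assumed choice.

```python
# Sketch of pairwise training for error detection: each training example
# is a pair (correct sentence, incorrect sentence), and a margin loss
# pushes the incorrect member to score higher than the correct one,
# so the model learns from the difference within the pair.

def error_score(sentence):
    # Hypothetical stand-in for a learned (e.g. BERT-based) scorer:
    # here we simply count tokens from a toy error lexicon.
    toy_errors = {"writed", "goed"}
    return float(sum(tok in toy_errors for tok in sentence.split()))

def pairwise_hinge_loss(correct, incorrect, margin=1.0):
    # Zero loss once the incorrect sentence out-scores the correct one
    # by at least `margin`; otherwise the remaining gap is penalized.
    gap = error_score(incorrect) - error_score(correct)
    return max(0.0, margin - gap)

pair = ("She wrote the essay.", "She writed the essay.")
loss = pairwise_hinge_loss(*pair)
```

In contrast, pointwise training would score each sentence against an independent label; the pairwise loss instead ties the two scores together, which is the difference-capturing behavior the abstract refers to.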