Biomedical event extraction via Long Short Term Memory networks along dynamic extended tree
Abstract
Extracting knowledge from unstructured text is one of the most important goals of Natural Language Processing, especially in the biomedical event extraction domain. In this paper, we describe a system for extracting biomedical events between bacteria and biotopes from biomedical literature, using the corpus of the Bacteria Biotope task of the BioNLP'16 Shared Task. The current mainstream methods for event extraction are based on shallow machine learning. However, these methods rely mainly on domain experience and require enormous manual effort to select features. Therefore, we propose DETBLSTM, a novel Long Short Term Memory (LSTM) network framework for event extraction. In our framework, a dynamic extended tree, which exploits syntactic information, is used as the input instead of the original sentence. Furthermore, POS and distance embeddings are added to enrich the input, so that complex feature extraction can be skipped. Finally, we construct a bidirectional LSTM model to extract biomedical events and achieve an F-score of 57.14% on the test set. Our model obtains a better F-score than all official submissions to BioNLP-ST 2016, 1.34 percentage points higher than the best system.
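As a rough illustration of the enriched input described above, the following minimal sketch builds one vector per token by concatenating a word embedding, a POS embedding, and two distance embeddings. Everything specific here is an assumption made for illustration and is not taken from the paper: the embedding sizes, the random initialisation, the toy sentence, the definition of distance as a clipped token offset to each of the two entity mentions, and all function names. A plain token list also stands in for the dynamic extended tree, whose construction the abstract does not detail. The resulting sequence is the kind of input a bidirectional LSTM would consume.

```python
import random

# All sizes and the distance definition are illustrative assumptions;
# the abstract does not specify them.
WORD_DIM, POS_DIM, DIST_DIM = 8, 4, 2
MAX_DIST = 5  # relative offsets are clipped to [-MAX_DIST, MAX_DIST]


def make_table(keys, dim, rng):
    """Create a randomly initialised embedding table (key -> vector)."""
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in keys}


def clipped_offset(i, entity_index):
    """Relative token offset to an entity mention, clipped to a fixed range."""
    return max(-MAX_DIST, min(MAX_DIST, i - entity_index))


def build_inputs(tokens, pos_tags, bacteria_idx, biotope_idx, tables):
    """Concatenate word, POS, and two distance embeddings for every token."""
    word_t, pos_t, dist_t = tables
    rows = []
    for i, (tok, tag) in enumerate(zip(tokens, pos_tags)):
        vec = (
            word_t.get(tok.lower(), word_t["<unk>"])
            + pos_t[tag]
            + dist_t[clipped_offset(i, bacteria_idx)]
            + dist_t[clipped_offset(i, biotope_idx)]
        )
        rows.append(vec)
    return rows


rng = random.Random(0)
# Made-up toy sentence with hand-assigned POS tags (hypothetical data).
tokens = ["Vibrio", "lives", "in", "seawater"]
pos_tags = ["NNP", "VBZ", "IN", "NN"]
tables = (
    make_table([t.lower() for t in tokens] + ["<unk>"], WORD_DIM, rng),
    make_table(sorted(set(pos_tags)), POS_DIM, rng),
    make_table(range(-MAX_DIST, MAX_DIST + 1), DIST_DIM, rng),
)
inputs = build_inputs(tokens, pos_tags, bacteria_idx=0, biotope_idx=3, tables=tables)
print(len(inputs), len(inputs[0]))  # one vector per token, 8 + 4 + 2 + 2 = 16 dims
```

In a trained model these tables would be learned parameters (with word vectors possibly pre-trained), and the per-token vectors would be read in both directions by the LSTM before classification.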