IMPROVED SUMMARIZATION OF CHINESE SPOKEN DOCUMENTS BY PROBABILISTIC LATENT SEMANTIC ANALYSIS (PLSA) WITH FURTHER ANALYSIS AND INTEGRATED SCORING
2006, pp. 26–29
Abstract
In a previous paper [1], two new scoring measures obtained from probabilistic latent semantic analysis (PLSA), topic significance (TS) and topic entropy (TE), were shown to outperform the very successful baseline significance score (SS) in selecting the important sentences for summarization of spoken documents. In this paper, extensive experiments using ROUGE scores are analyzed in detail with respect to different parameters at different summarization ratios. It was also found that integrating the two scoring measures offered further improvements, and that special consideration of the structure of the Chinese language was also helpful when summarizing Chinese spoken documents.
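The abstract does not say how the two measures are integrated, so the following is only an illustrative sketch of one plausible scheme: each per-sentence score is min-max normalized, the two are linearly interpolated, and the top-ranked sentences are kept until a length budget set by the summarization ratio is used up. The function names, the `weight` parameter, the linear interpolation, and the assumption that both scores are oriented so that higher means more important are all hypothetical and are not taken from the paper.

```python
from typing import List


def min_max(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_sentences(
    score_a: List[float],
    score_b: List[float],
    lengths: List[int],
    ratio: float,
    weight: float = 0.5,
) -> List[int]:
    """Pick sentence indices for an extractive summary.

    score_a, score_b: two per-sentence importance scores (hypothetical
        stand-ins for two measures such as TS and TE), assumed here to be
        oriented so that higher means more important.
    lengths: sentence lengths, e.g. in characters or words.
    ratio: summarization ratio, the fraction of total length to keep.
    weight: interpolation weight on score_a (an assumption, not from the paper).
    """
    a = min_max(score_a)
    b = min_max(score_b)
    # Assumed integration: linear interpolation of the normalized scores.
    combined = [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]

    budget = ratio * sum(lengths)
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)

    chosen, used = [], 0
    for i in ranked:
        # Greedily keep the best-scoring sentences that still fit the budget.
        if used + lengths[i] <= budget:
            chosen.append(i)
            used += lengths[i]
    # Return the selection in original document order.
    return sorted(chosen)
```

Varying `ratio` and `weight` in a sketch like this corresponds loosely to examining results at different summarization ratios and parameter settings; the actual integration used in the paper may differ.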