Automatic topic identification using webpage clustering
2002, pp. 195–202
Citations Over Time: Top 10% of 2002 papers
Abstract
Grouping Web pages into distinct topics is one way of organizing the large amount of information retrieved from the Web. In this paper, we report that an unsupervised clustering method, using a similarity metric that incorporates textual information, hyperlink structure, and co-citation relations, can automatically and effectively identify relevant topics, as shown in experiments on several retrieved sets of Web pages. The clustering method is a state-of-the-art spectral graph partitioning method based on the normalized cut criterion, which was first developed for image segmentation.
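
The abstract names two ingredients: a page-to-page similarity built from text, hyperlinks, and co-citation, and a normalized cut partition of the resulting graph. The sketch below illustrates that general idea and is not the paper's implementation. The weighted-sum combination, the `alpha` parameter, and the function names `combined_similarity` and `normalized_cut_bipartition` are assumptions made for illustration; the abstract does not give the paper's exact formula. The partition step uses the standard spectral relaxation of the normalized cut, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), and splits pages by the sign of the second eigenvector.

```python
import numpy as np


def combined_similarity(text_sim, links, alpha=0.5):
    """Hypothetical similarity: text similarity plus link and co-citation terms.

    text_sim: (n, n) symmetric matrix of textual similarities.
    links:    (n, n) 0/1 matrix, links[i, j] = 1 if page i links to page j.
    alpha:    assumed weight between direct links and co-citation.
    """
    A = np.asarray(links, dtype=float)
    # Treat a hyperlink in either direction as an undirected tie.
    link_sym = ((A + A.T) > 0).astype(float)
    # cocite[i, j] = number of pages that link to both i and j.
    cocite = A.T @ A
    np.fill_diagonal(cocite, 0.0)
    if cocite.max() > 0:
        cocite = cocite / cocite.max()
    W = np.asarray(text_sim, dtype=float) + alpha * link_sym + (1 - alpha) * cocite
    W = (W + W.T) / 2.0          # enforce symmetry
    np.fill_diagonal(W, 0.0)     # no self-similarity edges
    return W


def normalized_cut_bipartition(W):
    """Split a weighted graph in two using the relaxed normalized cut.

    Solves (D - W) y = lambda * D y through the symmetric matrix
    D^{-1/2} W D^{-1/2} and thresholds the second eigenvector at zero.
    """
    d = W.sum(axis=1)
    if np.any(d <= 0):
        raise ValueError("every page needs a positive total similarity")
    d_inv_sqrt = 1.0 / np.sqrt(d)
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # Second-largest eigenvector of M is the second-smallest of the
    # normalized Laplacian; rescale back to the generalized eigenvector.
    y = d_inv_sqrt * vecs[:, -2]
    return y >= 0                # boolean cluster label per page
```

Applying the bipartition recursively to each side would yield more than two topics; a stopping rule for that recursion is not specified in the abstract and is left out here.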
Related Papers
- Analysis and improvement of HITS algorithm for detecting Web communities (2003), 25 cited
- An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity (2016), 12 cited
- A survey: hyperlink analysis in webpage ranking algorithms (2014), 2 cited
- The Hyperlink and Route Choice in Webpages (2006)