ES‐Plag: Efficient and sensitive source code plagiarism detection tool for academic environment
Abstract
Source code plagiarism detection using Running‐Karp‐Rabin Greedy‐String‐Tiling (RKRGST) is common practice in academic environments. However, this approach is time‐inefficient (due to RKRGST's cubic time complexity) and insensitive to token subsequence rearrangement. This paper proposes ES‐Plag, a plagiarism detection tool that uses cosine‐based filtering and a penalty mechanism to address these issues. Cosine‐based filtering mitigates the time inefficiency by excluding non‐potential pairs from RKRGST comparison, while the penalty mechanism mitigates the insensitivity by reducing the number of matched tokens by the number of matched subsequences prior to similarity normalization. In addition to these issue‐solving features, ES‐Plag also offers project‐based input, a colorized adjacency similarity matrix, matched token highlighting, and various similarity algorithms (e.g., Cosine Similarity and Local Alignment). Three findings can be deduced from our evaluation. First, cosine‐based filtering boosts time efficiency with a trade‐off in effectiveness. Second, the penalty mechanism enhances sensitivity, although its improvement in effectiveness is quite limited. Third, ES‐Plag's features are beneficial for examiners.
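The two mechanisms described in the abstract can be illustrated with a minimal Python sketch. This is not ES‐Plag's actual implementation: the function names, the 0.5 filtering threshold, the use of raw token frequencies as vector components, and the average normalization 2M / (|A| + |B|) are all assumptions made for illustration. The RKRGST comparison itself is not shown; the sketch only assumes it yields the lengths of the matched subsequences.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the token-frequency vectors of two programs."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = sqrt(sum(c * c for c in freq_a.values()))
    norm_b = sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def potential_pairs(submissions, threshold=0.5):
    """Keep only pairs whose cosine similarity reaches the threshold.

    `submissions` maps a (hypothetical) submission name to its token list.
    The threshold value is an assumption, not taken from the paper.
    Only the returned pairs would be passed on to the costly RKRGST step.
    """
    names = sorted(submissions)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if cosine_similarity(submissions[a], submissions[b]) >= threshold
    ]


def penalized_similarity(matched_lengths, len_a, len_b):
    """Apply the penalty, then normalize.

    `matched_lengths` holds the length of each matched subsequence.
    The matched-token count is reduced by the number of matched
    subsequences before normalization, so a match split into many
    fragments scores lower than one contiguous match.
    Assumption: average normalization 2 * M / (|A| + |B|).
    """
    matched_tokens = sum(matched_lengths)
    penalized = max(matched_tokens - len(matched_lengths), 0)
    total = len_a + len_b
    return 2 * penalized / total if total else 0.0
```

Under these assumptions, two 10‐token programs that match as one contiguous 10‐token subsequence score higher than the same programs matching as two 5‐token subsequences, which is how the penalty makes the score sensitive to rearrangement.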