LHDiff: A Language-Independent Hybrid Approach for Tracking Source Code Lines
Abstract
Tracking source code lines between two different versions of a file is a fundamental step for solving a number of important problems in software maintenance, such as locating bug-introducing changes, tracking code fragments or defects across versions, merging file versions, and software evolution analysis. Although a number of such approaches are available in the literature, their performance is sensitive to the kind and degree of source code changes. There is also a marked lack of study on the effect of change types on source location tracking techniques. In this paper, we propose a language-independent technique, LHDiff, for tracking source code lines across versions that leverages the simhash technique together with heuristics to improve accuracy. We evaluate our approach against state-of-the-art techniques using benchmarks containing different degrees of changes, where files are selected from real-world applications. We further compare LHDiff with other techniques using a mutation-based analysis to understand how different types of changes affect their performance. The results reveal that our technique is more effective than language-independent approaches and no worse than some language-dependent techniques. In our study, LHDiff even shows better performance than a state-of-the-art language-dependent approach. We also discuss limitations of different line tracking techniques, including ours, and propose future research directions.
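The abstract names simhash but does not describe how LHDiff applies it. As a rough illustration of the general simhash idea only (not LHDiff's implementation), the sketch below fingerprints a source line and compares two fingerprints by Hamming distance. The whitespace tokenization, the use of MD5 as the per-token hash, the 64-bit width, and the names `simhash` and `hamming_distance` are all assumptions made for this example.

```python
import hashlib

BITS = 64  # fingerprint width; an assumption for this sketch


def simhash(text: str) -> int:
    """Return a 64-bit simhash fingerprint of whitespace-separated tokens."""
    weights = [0] * BITS
    for token in text.split():
        # Hash each token to 64 bits (first 8 bytes of its MD5 digest).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when its votes sum to a positive value.
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


old_line = "int total = count + offset ;"
new_line = "int total = count + offset + 1 ;"
print(hamming_distance(simhash(old_line), simhash(new_line)))
```

Lines that share most of their tokens tend to produce fingerprints with a small Hamming distance, which is what makes the fingerprint useful as a cheap similarity filter before a more expensive comparison. How candidate lines are then ranked and matched is left to the paper itself.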