An empirical study of code clone genealogies
Citations over time: top 1% of 2005 papers
Abstract
It has been broadly assumed that code clones are inherently bad and that eliminating clones by refactoring would solve the problems of code clones. To investigate the validity of this assumption, we developed a formal definition of clone evolution and built a clone genealogy tool that automatically extracts the history of code clones from a source code repository. Using our tool, we extracted clone genealogy information for two Java open source projects and analyzed their evolution. Our study contradicts some conventional wisdom about clones. In particular, refactoring may not always improve software with respect to clones, for two reasons. First, many code clones exist in the system for only a short time; extensive refactoring of such short-lived clones may not be worthwhile if they are likely to diverge from one another very soon. Second, many clones, especially long-lived clones that have changed consistently with other elements in the same group, are not easily refactorable due to programming language limitations. These insights show that refactoring will not help in dealing with some types of clones, and they open up opportunities for complementary clone maintenance tools that target these other classes of clones.