Rebasing in Code Review Considered Harmful: A Large-Scale Empirical Investigation
Citations Over Time: Top 10% of 2019 papers
Abstract
Code review is widely acknowledged as a key quality assurance process in both open-source and industrial software development. Because code review is asynchronous, the system's codebase tends to incorporate external commits while a source code change is under review, which creates the need for rebasing operations. External commits may modify files currently under review, causing rework for developers and fatigue for reviewers. Since source code changes observed during code review may be due to external commits, rebasing operations may pose a severe threat to empirical studies that employ code review data. Yet, to the best of our knowledge, no empirical study characterises and investigates rebasing in real-world software systems. Hence, this paper reports an empirical investigation aimed at understanding how frequently rebasing operations occur and what side effects they have on the reviewing process. To this end, we perform an in-depth, large-scale empirical investigation of the code review data of 11 software systems, covering 28,808 code reviews and 99,121 revisions. Our observations indicate that developers need to perform rebasing operations in an average of 75.35% of code reviews. In addition, our data suggest that an average of 34.21% of rebasing operations tend to tamper with the reviewing process. Finally, we propose a methodology to handle rebasing in empirical studies that employ code review data, and we show how an empirical study that does not account for rebasing operations may report skewed, biased and inaccurate observations.
Related Papers
- Automatically Recommending Peer Reviewers in Modern Code Review (2015), 172 citations
- Automatic Code Review by Learning the Revision of Source Code (2019), 45 citations
- Recognizing lines of code violating company-specific coding guidelines using machine learning (2019), 21 citations
- Recommending peer reviewers in modern code review (2020), 6 citations
- Chapter 8 Recognizing Lines of Code Violating Company-Specific Coding Guidelines Using Machine Learning (2019), 3 citations