Table of Links
ABSTRACT
1 INTRODUCTION
2 BACKGROUND
2.1 Code Review As Communication Network
2.2 Code Review Networks
2.3 Measuring Information Diffusion in Code Review
3 RESEARCH DESIGN
3.1 Hypotheses
3.2 Measurement Model
3.3 Measuring System
4 LIMITATIONS
ACKNOWLEDGMENTS AND REFERENCES

4 LIMITATIONS

In general, the chain of evidence of our study depends on two main factors: (1) the measurement model, measuring system, and actual measurement, and (2) the thoroughness of our discussion for qualitatively rejecting the hypotheses and, thereby, falsifying the theory of code review as a communication network.

Although we will not be able to provide the complete raw data and can offer only a prototypical extraction pipeline for Backstage, we believe that our thorough description of the measurement model, the measuring system, and the actual measurement at Spotify provides a solid foundation for this line of research. Our replication package will contain the necessary, anonymized data to reproduce and replicate our study beyond the context of Spotify. However, as with every data-driven study, missing, incomplete, faulty, or unreliable data may significantly affect the validity of our study. To mitigate these risks, we conducted a pilot study in October 2023. Although we have not encountered such threats to validity, we cannot exclude data-related limitations. Therefore, this section will also cover the limitations that arise from excluded or missing data once our data collection is completed.

However, we believe the two most critical limitations of our study lie in the nature of a qualitative falsification of theories. First, although traditional statistical hypothesis tests also have their limitations and, ultimately, also represent an implicit and qualitative discussion, a discussion remains more prone to bias, most importantly because there are no clear criteria for rejecting the hypotheses upfront. Such clear rejection and falsification criteria are neither possible nor meaningful upfront for this research; any thresholds, values, or estimates would be arbitrary. However, we believe that a comprehensive discussion makes potential bias explicit and allows other researchers to conclude differently. Additionally, we will publish our measuring system and all intermediate anonymized data to enable other researchers to replicate our work.

Second, even if our data and a thorough discussion suggest falsifying our theory by rejecting one of the hypotheses, our modelling approach may not capture the (relevant) information diffusion in code review. Although we have strong indications that the explicit referencing of code reviews is an active and explicit form of information diffusion triggered by human assessment, we are not aware of empirical evidence that supports this assumption.

Although already discussed in Section 3, we emphasize again that our findings on the extent of information diffusion will not be generalizable. We do not believe that this is a major limitation of our research design, since our argumentation is based on contradiction (reductio ad absurdum). This section will also include a detailed discussion of limitations originating in incomplete or missing data, should they become visible after data collection and analysis.
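The modelling assumption under scrutiny above, that explicit references between code reviews trace information diffusion, can be made concrete with a small sketch. The following is a minimal illustration, not our measuring system: it assumes a plain directed graph rather than the time-varying hypergraphs of our prior modelling work, and the review identifiers, edge list, and use of the networkx library are hypothetical choices for demonstration only.

```python
# Minimal sketch (not the study's measuring system): code reviews as
# nodes, explicit references between reviews as directed edges, and
# graph reachability as a simple proxy for information diffusion.
# All review identifiers and edges below are hypothetical examples.
import networkx as nx

# Each pair (a, b) reads: review a explicitly references review b,
# so participants of a can pick up information recorded in b.
explicit_references = [
    ("review-63", "review-57"),
    ("review-57", "review-42"),
    ("review-57", "review-31"),
    ("review-42", "review-17"),
]

reference_graph = nx.DiGraph(explicit_references)

# Reviews reachable from review-63 along reference chains: an upper
# bound on where information visible in review-63 may have come from.
reachable = nx.descendants(reference_graph, "review-63")
print(sorted(reachable))
# ['review-17', 'review-31', 'review-42', 'review-57']
```

Reachability in such a graph only bounds what could have diffused; whether the referenced information was actually received and used remains a matter of human assessment, which is precisely the assumption this limitation concerns.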
ACKNOWLEDGMENTS

We thank Spotify for supporting this research and the anonymous reviewers for their valuable and extensive feedback. This work was supported by the KKS Foundation through the SERT Project (Research Profile Grant 2018/010) at Blekinge Institute of Technology.
Authors: Michael Dorner, Daniel Mendez, Ehsan Zabardast, Nicole Valdez, Marcin Floryan

This paper is available on arxiv under CC BY 4.0 DEED (Attribution 4.0 International) license.