Going Deeper: Analyzing Multimodal Data for Pair Programming Insight

Written by pairprogramming | Published 2025/08/15

TL;DR: This paper tries to understand the core elements of pair programming skill. By analyzing industrial sessions from the PP-ind repository, it aims to provide actionable advice for developers and practitioners.

Table of Links

Abstract and I. Introduction

II. Related Work

A. On the Existence of Pair Programming Skill

B. On the Elements of Pair Programming Skill

III. Research Method

A. Research Goal and Data Collection

B. Qualitative Research Approach

C. Our Notions of ‘Good’ and ‘Bad’

IV. Results

A. Two Elements of Pair Programming Skill

B. Anti-Pattern: Getting Lost in the Weeds

C. Anti-Pattern: Losing the Partner

D. Anti-Pattern: Drowning the Partner

E. Doing the Right Thing and F. Further Elements of Pair Programming Skill

V. Discussion

VI. Summary and Future Work

VII. Data Availability and References

III. RESEARCH METHOD

A. Research Goal and Data Collection

The overall goal of our research is to understand how ‘good’ and ‘bad’ pair programming sessions differ. Ultimately, we want to provide actionable advice for practitioners. Here, we want to understand the elements of the skill which pair programmers exhibit in successful sessions and how sessions suffer from a lack thereof.

The industrial data used by Bryant et al. [4], [5] is limited to audio recordings, which makes it difficult to understand what the developers are referring to: For one out of every eight utterances, the researchers could not reconstruct what the pairs referred to [5, Sec. 5.1]. We therefore analyze industrial PP sessions comprising audio, webcam, and screencast from the PP-ind repository [15], [16], which contains a variety of over 60 everyday PP sessions from 13 companies along with pre- and post-session questionnaires filled out by the developers. Sessions from the repository have IDs like ‘CA2’ (session 2, from the first team A, at the third company C); developers are numbered similarly, e.g., ‘C2’.
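To make the ID scheme concrete, the following is a minimal Python sketch of how such identifiers could be decoded. It is not part of the PP-ind repository or the authors' tooling. The names `SessionId`, `DeveloperId`, `parse_session_id`, `parse_developer_id`, and `ordinal` are hypothetical, and the patterns assume one uppercase letter per company and per team followed by a number, which is inferred only from the two examples ‘CA2’ and ‘C2’. Reading ‘C2’ as "developer 2 at company C" is likewise an assumption.

```python
import re
from typing import NamedTuple


class SessionId(NamedTuple):
    company: str  # company letter, e.g. "C"
    team: str     # team letter within the company, e.g. "A"
    session: int  # session number within the team, e.g. 2


class DeveloperId(NamedTuple):
    company: str    # company letter, e.g. "C"
    developer: int  # developer number within the company (assumed reading)


# Assumed shapes, inferred only from the examples 'CA2' and 'C2'.
_SESSION_RE = re.compile(r"([A-Z])([A-Z])(\d+)")
_DEVELOPER_RE = re.compile(r"([A-Z])(\d+)")


def parse_session_id(text: str) -> SessionId:
    """Split an ID such as 'CA2' into company, team, and session number."""
    match = _SESSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a session ID: {text!r}")
    company, team, number = match.groups()
    return SessionId(company, team, int(number))


def parse_developer_id(text: str) -> DeveloperId:
    """Split an ID such as 'C2' into company and developer number."""
    match = _DEVELOPER_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a developer ID: {text!r}")
    company, number = match.groups()
    return DeveloperId(company, int(number))


def ordinal(letter: str) -> int:
    """Map 'A' to 1, 'B' to 2, and so on (company C is the third company)."""
    return ord(letter) - ord("A") + 1
```

Under these assumptions, `parse_session_id("CA2")` yields company `"C"`, team `"A"`, and session `2`, and `ordinal("C")` gives 3, matching the "third company" reading in the text.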

Authors:

(1) Franz Zieris, Institut für Informatik, Freie Universität Berlin, Berlin, Germany;

(2) Lutz Prechelt, Institut für Informatik, Freie Universität Berlin, Berlin, Germany.


This paper is available on arXiv under a CC BY 4.0 license.


Published by HackerNoon on 2025/08/15