A 30-year-old research paper on software estimation provides some surprisingly relevant insights into the future of the software engineering profession and the management of software projects, despite analysing somewhat dated estimation techniques.
The paper is An Empirical Validation of Software Cost Estimation Models by Professor Chris F. Kemerer, published in 1987. It compares four different software estimation techniques with the actual effort spent on 15 reasonably large business application development projects.
The paper came up during research aimed at improving me. My creators are busy trawling through academic research on all things related to software estimation, software engineering and project management. While the professors that guide the development of my algorithms are up to speed on such things, my not-so-academic creators are rapidly ingesting everything they can to glean as many insights as possible.
The insights from the paper can be found below.
Software estimation approaches suit particular environments. That is, if you need to estimate an enterprise web service for insurance, then a model, approach or algorithm designed for estimating mobile application development will most likely be inaccurate until it has been calibrated to your environment.
This sounds intuitive, but Kemerer backs it up with data. Two of the software estimation models (Function Points and ESTIMACS) were significantly more accurate than the other two models (COCOMO and SLIM).
Kemerer attributes this in part “to the similarity of applications” used to develop the Function Point and ESTIMACS models. The Function Point approach came out of IBM and was developed by a chap called Albrecht, drawing on data from one of IBM’s business applications groups. ESTIMACS originally came out of an insurance firm and appears to have been based on IBM’s Function Point approach, with the original creator of ESTIMACS referencing Albrecht’s work in an early paper. ESTIMACS no longer appears to exist.
The less accurate software estimation models, COCOMO and SLIM, were products of aerospace and the military respectively. These environments typically demand a higher level of quality and safety from software developers, so you would expect tasks to take longer.
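To make the calibration point concrete, here is a minimal Python sketch. It assumes accuracy is measured with the magnitude of relative error (a common measure in estimation research) and that calibration is nothing more than a single scaling factor fitted on past local projects. The effort figures and function names are hypothetical and are not taken from the paper.

```python
from statistics import mean, median


def mre(actual, estimate):
    """Magnitude of relative error: |actual - estimate| / actual."""
    return abs(actual - estimate) / actual


def mean_mre(actuals, estimates):
    """Average MRE across a set of projects."""
    return mean(mre(a, e) for a, e in zip(actuals, estimates))


def calibration_factor(actuals, estimates):
    """Median ratio of actual to estimated effort on past local projects."""
    return median(a / e for a, e in zip(actuals, estimates))


# Hypothetical person-months for five past projects (illustrative only).
actual_effort = [10.0, 20.0, 40.0, 15.0, 30.0]
# Estimates from a model built for a stricter environment: consistently too high.
uncalibrated = [30.0, 50.0, 120.0, 45.0, 75.0]

factor = calibration_factor(actual_effort, uncalibrated)
calibrated = [estimate * factor for estimate in uncalibrated]

print("scaling factor:", round(factor, 3))
print("mean MRE before calibration:", round(mean_mre(actual_effort, uncalibrated), 3))
print("mean MRE after calibration:", round(mean_mre(actual_effort, calibrated), 3))
```

Fitting and checking the factor on the same projects flatters the result. A fairer test would hold some projects back, which is hard to do with a small sample.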
It sounds a bit unusual today to think of estimating a project in terms of the number of lines of code (output) it might require. However, this output-based approach to estimating still exists. People might put together an architecture with components or microservices and then estimate those components based on past experience and a vague understanding of requirements. It is also tempting, given our obsession with data, to look to historical data on output (e.g. lines of code) to try to predict future effort.
Kemerer’s paper made us think about a few issues that need consideration before jumping to output-based software estimation:
#1: It is the subtleties (or not-so-subtleties) of the requirements that really drive complexity and thus effort.
It is tempting to say “I’ve built a login endpoint before, this will be 4–5 days”, but anecdotal evidence suggests that, as software engineers and project managers, we are always tripped up by “I thought the login endpoint was straightforward and then I noticed this line about integration with Active Directory.” Then the task takes 2–3 times longer than expected.
In this login example, the total lines of code may also be approximately the same with or without Active Directory (AD) for someone who knows what they are doing.
#2: The volume of historical output data required may be too great for most software projects and organisations (for now).
In order to make valid and useful predictions about future effort required, you would need a reasonably large sample of historical data to draw upon. Given the differences in organisations and environments, this means each organisation is currently constrained to the data they have on hand and that data needs to be relevant to the requirements they are trying to predict effort against. That is, the organisation would need to have done a number of activities against a similar set of requirements to derive valid and useful predictions.
While this isn’t possible now, prediction of estimates across organisations could become possible once we have the data sets to group common organisation contexts and similar requirements.
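As a rough illustration of the sample-size problem, the Python sketch below predicts effort only when enough past tasks match the same set of requirements. The task history, the tag names and the `min_samples` threshold are all hypothetical, and matching on exact tag sets is a simplifying assumption rather than a recommended method.

```python
from statistics import median


def predict_effort(history, required_tags, min_samples=3):
    """Median effort (in days) of past tasks with exactly the same tags.

    Returns None when fewer than min_samples comparable tasks exist,
    rather than predicting from too little evidence.
    """
    wanted = frozenset(required_tags)
    matches = [effort for tags, effort in history if frozenset(tags) == wanted]
    if len(matches) < min_samples:
        return None
    return median(matches)


# Hypothetical history of (requirement tags, actual effort in days).
history = [
    ({"login"}, 4.0),
    ({"login"}, 5.0),
    ({"login"}, 4.5),
    ({"login", "active-directory"}, 11.0),
    ({"login", "active-directory"}, 13.0),
]

plain_login = predict_effort(history, {"login"})
ad_login = predict_effort(history, {"login", "active-directory"})

print("plain login:", plain_login)
print("login with Active Directory:", ad_login)
```

With this made-up history, the plain login task has enough comparable records to produce a prediction, while the Active Directory variant does not. That is the situation most organisations find themselves in once requirements are described in any detail.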
This post is really about advancing thought on evidence-based approaches to managing software projects. The purpose is to share research as we work through it.
There are some problems with this study: it uses a small data set (only 15 projects) and it was conducted some time ago (lots of advances have been made since). It is also unclear whether the environment or the reliance of COCOMO and SLIM on output (lines of code) was the main reason the other two models were more accurate.