Bayesian analysis of empirical software engineering cost models
Abstract
To date many software engineering cost models have been developed to predict the cost, schedule, and quality of the software under development. But, the rapidly changing nature of software development has made it extremely difficult to develop empirical models that continue to yield high prediction accuracies. Software development costs continue to increase and practitioners continually express their concerns over their inability to accurately predict the costs involved. Thus, one of the most important objectives of the software engineering community has been to develop useful models that constructively explain the software development life-cycle and accurately predict the cost of developing a software product. To that end, many parametric software estimation models have evolved in the last two decades [25], [17], [26], [15], [28], [1], [2], [33], [7], [10], [22], [23]. Almost all of the above mentioned parametric models have been empirically calibrated to actual data from completed software projects. The most commonly used technique for empirical calibration has been the popular classical multiple regression approach. As discussed in this paper, the multiple regression approach imposes a few assumptions frequently violated by software engineering datasets. The source data is also generally imprecise in reporting size, effort, and cost-driver ratings, particularly across different organizations. This results in the development of inaccurate empirical models that don't perform very well when used for prediction. This paper illustrates the problems faced by the multiple regression approach during the calibration of one of the popular software engineering cost models, COCOMO II. It describes the use of a pragmatic 10 percent weighted average approach that was used for the first publicly available calibrated version [6]. It then moves on to show how a more sophisticated Bayesian approach can be used to alleviate some of the problems faced by multiple regression. It compares and contrasts the two empirical approaches, and concludes that the Bayesian approach was better and more robust than the multiple regression approach. Bayesian analysis is a well-defined and rigorous process of inductive reasoning that has been used in many scientific disciplines (the reader can refer to [11], [35], [3] for a broader understanding of the Bayesian Analysis approach). A distinctive feature of the Bayesian approach is that it permits the investigator to use both sample (data) and prior (expert-judgment) information in a logically consistent manner in making inferences. This is done by using Bayes' theorem to produce a 'postdata' or posterior distribution for the model parameters. Using Bayes' theorem, prior (or initial) values are transformed to postdata views. This transformation can be viewed as a learning process. The posterior distribution is determined by the variances of the prior and sample information. If the variance of the prior information is smaller than the variance of the sampling information, then a higher weight is assigned to the prior information. On the other hand, if the variance of the sample information is smaller than the variance of the prior information, then a higher weight is assigned to the sample information causing the posterior estimate to be closer to the sample information. The Bayesian approach discussed in this paper enables stronger solutions to one of the biggest problems faced by the software engineering community: the challenge of making good decisions using data that is usually scarce and incomplete. We note that the predictive performance of the Bayesian approach (i.e., within 30 percent of the actuals 75 percent of the time) is significantly better than that of the previous multiple regression approach (i.e., within 30 percent of the actuals only 52 percent of the time) on our latest sample of 161 project datapoints. Index Terms - Bayesian analysis, multiple regression, software estimation, software engineering cost models, model calibration, prediction accuracy, empirical modeling, COCOMO, measurement, metrics, project management. © 1999 IEEE.