Which process metrics can significantly improve defect prediction models? An empirical study

Authors / Contributors:
[Lech Madeyski, Marian Jureczko]
Place, publisher, year:
2015
Contained in:
Software Quality Journal, 23/3(2015-09-01), 393-422
Format:
Article (online)
ID: 605495718
LEADER caa a22 4500
001 605495718
003 CHVBK
005 20210128100532.0
007 cr unu---uuuuu
008 210128e20150901xx s 000 0 eng
024 7 0 |a 10.1007/s11219-014-9241-7  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s11219-014-9241-7 
245 0 0 |a Which process metrics can significantly improve defect prediction models? An empirical study  |h [Electronic data]  |c [Lech Madeyski, Marian Jureczko] 
520 3 |a The knowledge about the software metrics which serve as defect indicators is vital for the efficient allocation of resources for quality assurance. It is the process metrics, although sometimes difficult to collect, which have recently become popular with regard to defect prediction. However, in order to identify rightly the process metrics which are actually worth collecting, we need the evidence validating their ability to improve the product metric-based defect prediction models. This paper presents an empirical evaluation in which several process metrics were investigated in order to identify the ones which significantly improve the defect prediction models based on product metrics. Data from a wide range of software projects (both, industrial and open source) were collected. The predictions of the models that use only product metrics (simple models) were compared with the predictions of the models which used product metrics, as well as one of the process metrics under scrutiny (advanced models). To decide whether the improvements were significant or not, statistical tests were performed and effect sizes were calculated. The advanced defect prediction models trained on a data set containing product metrics and additionally Number of Distinct Committers (NDC) were significantly better than the simple models without NDC, while the effect size was medium and the probability of superiority (PS) of the advanced models over simple ones was high ($p=.016$, $r=-.29$, $\mathrm{PS}=.76$), which is a substantial finding useful in defect prediction. A similar result with slightly smaller PS was achieved by the advanced models trained on a data set containing product metrics and additionally all of the investigated process metrics ($p=.038$, $r=-.29$, $\mathrm{PS}=.68$). 
The advanced models trained on a data set containing product metrics and additionally Number of Modified Lines (NML) were significantly better than the simple models without NML, but the effect size was small ($p=.038$, $r=.06$). Hence, it is reasonable to recommend the NDC process metric in building the defect prediction models. 
540 |a The Author(s), 2014 
690 7 |a Software metrics  |2 nationallicence 
690 7 |a Product metrics  |2 nationallicence 
690 7 |a Process metrics  |2 nationallicence 
690 7 |a Defect prediction models  |2 nationallicence 
690 7 |a Software defect prediction  |2 nationallicence 
700 1 |a Madeyski  |D Lech  |u Wroclaw University of Technology, Wyb.Wyspianskiego 27, 50370, Wrocław, Poland  |4 aut 
700 1 |a Jureczko  |D Marian  |u Wroclaw University of Technology, Wyb.Wyspianskiego 27, 50370, Wrocław, Poland  |4 aut 
773 0 |t Software Quality Journal  |d Springer US; http://www.springer-ny.com  |g 23/3(2015-09-01), 393-422  |x 0963-9314  |q 23:3<393  |1 2015  |2 23  |o 11219 
856 4 0 |u https://doi.org/10.1007/s11219-014-9241-7  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s11219-014-9241-7  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Madeyski  |D Lech  |u Wroclaw University of Technology, Wyb.Wyspianskiego 27, 50370, Wrocław, Poland  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Jureczko  |D Marian  |u Wroclaw University of Technology, Wyb.Wyspianskiego 27, 50370, Wrocław, Poland  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Software Quality Journal  |d Springer US; http://www.springer-ny.com  |g 23/3(2015-09-01), 393-422  |x 0963-9314  |q 23:3<393  |1 2015  |2 23  |o 11219