Searching for rules to find defective modules in unbalanced data sets

Rodríguez D.; Riquelme J.C.; Ruiz R.; AGUILAR RUIZ, SALVADOR

Date Issued

25 September 2009

Access level

open access

Resource Type

conference paper

Author(s)

Rodríguez D.

Riquelme J.C.

Ruiz R.

AGUILAR RUIZ, SALVADOR

University of Seville

Abstract

The characterisation of defective modules in software engineering remains a challenge. In this work, we use data mining techniques to search for rules that indicate modules with a high probability of being defective. Using data sets from the PROMISE repository, we first applied feature selection (attribute selection) to work only with those attributes from the data sets capable of predicting defective modules. With the reduced data set, a genetic algorithm is used to search for rules characterising modules with a high probability of being defective. This algorithm overcomes the problem of unbalanced data sets, where the number of non-defective samples in the data set greatly outnumbers the defective ones. © 2009 IEEE.
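
The idea of searching for a rule with a genetic algorithm on unbalanced data can be illustrated with a minimal sketch. It is not the authors' algorithm: the rule form (two threshold conditions), the F1-based fitness, the selection and mutation settings, and the toy `MODULES` table are all assumptions made up for illustration, and the feature selection step is omitted. The fitness is computed on the defective class only, so a rule cannot score well by simply predicting the majority class.

```python
import random

# Toy, invented module metrics: (lines_of_code, cyclomatic_complexity, defective).
# Only 3 of the 12 modules are defective, so the classes are unbalanced.
MODULES = [
    (20, 2, 0), (35, 3, 0), (40, 4, 0), (55, 5, 0), (60, 3, 0), (80, 6, 0),
    (90, 7, 0), (120, 8, 0), (150, 9, 0),
    (200, 15, 1), (260, 18, 1), (310, 22, 1),
]


def fitness(rule, modules):
    """F1 score, on the defective class, of 'loc >= a AND complexity >= b'."""
    min_loc, min_cc = rule
    covered = [m for m in modules if m[0] >= min_loc and m[1] >= min_cc]
    true_pos = sum(m[2] for m in covered)
    total_pos = sum(m[2] for m in modules)
    if not covered or true_pos == 0:
        return 0.0
    precision = true_pos / len(covered)
    recall = true_pos / total_pos
    return 2 * precision * recall / (precision + recall)


def evolve(modules, generations=40, pop_size=20, seed=0):
    """Hypothetical GA: truncation selection, crossover, small mutations."""
    rng = random.Random(seed)
    max_loc = max(m[0] for m in modules)
    max_cc = max(m[1] for m in modules)
    population = [
        (rng.randint(0, max_loc), rng.randint(0, max_cc))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, modules), reverse=True)
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [a[0], b[1]]  # crossover: one threshold from each parent
            if rng.random() < 0.3:  # mutation: nudge one threshold
                i = rng.randrange(2)
                child[i] = max(0, child[i] + rng.randint(-20, 20))
            children.append(tuple(child))
        population = parents + children
    return max(population, key=lambda r: fitness(r, modules))


best = evolve(MODULES)
print("rule: loc >= %d AND complexity >= %d" % best)
print("F1 on defective class:", fitness(best, MODULES))
```

On this toy table, the trivial rule that covers every module has perfect recall but low precision, while a rule such as `loc >= 200` isolates the three defective modules exactly. A plain accuracy-based fitness would instead reward a rule that flags nothing, which is the failure mode that unbalanced data sets invite.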

Start page

89

End page

92

Language

English

OECD Knowledge area

Information sciences; Systems and communications engineering

DOI

10.1109/SSBSE.2009.23

Scopus EID

2-s2.0-70349269518

Resource of which it is part

Proceedings - 1st International Symposium on Search Based Software Engineering, SSBSE 2009

ISBN of the container

9780769536750

Conference

1st International Symposium on Search Based Software Engineering, SSBSE 2009

Sources of information: Directorio de Producción Científica; Scopus
