Manuscript received January 10, 2023; revised January 25; accepted February 9, 2023; published June 21, 2024.
Abstract—Software testing is the most significant task in software development and consumes the most time, cost, and effort. To reduce this expenditure, software defect prediction (SDP) supports the software quality assurance (SQA) process by predicting faulty or defective components. Researchers have proposed numerous methods for predicting defective components, but these methods produce biased results when applied to imbalanced datasets. An imbalanced dataset has a nonuniform class distribution, with very few instances of one class compared with the other. Using imbalanced datasets leads to inaccurate predictions for the minority class, which is usually considered more important than the majority class. Thus, handling imbalanced data and hyperparameter optimization (HPO) efficiently is important for developing a capable defect prediction model. This paper proposes an SDP model that uses different machine learning classifiers with the Tree-structured Parzen Estimator (TPE) as the hyperparameter optimizer to improve defect prediction accuracy, and the Synthetic Minority Oversampling Technique (SMOTE) to address the class imbalance issue. The proposed method was evaluated on eighteen software defect datasets from the PROMISE repository. Experimental results showed that the proposed technique achieved higher accuracy than the same classifiers used with default parameters.
Keywords—Software defect prediction (SDP), Tree-structured Parzen Estimator Method (TPE), Synthetic Minority Oversampling Technique (SMOTE), Hyperparameter optimization (HPO)
Cite: Faiza Khan, Sultan Almari, Muhammad Haseeb Khan, and Summrina Kanwal, "Software Defect Prediction Based on Tree-structured Parzen Estimator Using Machine Learning Classifiers," International Journal of Machine Learning, vol. 14, no. 2, pp. 59-64, 2024.
Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
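As an illustration of the class-imbalance step named in the abstract, the following is a minimal, standard-library-only sketch of SMOTE-style oversampling. It is not the authors' implementation: the helper names `smote` and `balance`, the toy feature values, the choice of `k`, and the random selection of base samples are all assumptions made for this example. Each synthetic point is placed on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; base samples are drawn at random.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(0, (len(y) - len(minority)) - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed) if n_new else []
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (assumed values): two made-up module metrics per row,
# label 1 = defective (minority), label 0 = clean (majority).
X = [
    [120.0, 4.0], [95.0, 3.0], [60.0, 2.0], [200.0, 5.0],
    [150.0, 4.0], [80.0, 2.0], [110.0, 3.0], [70.0, 2.0],
    [900.0, 25.0], [750.0, 19.0], [820.0, 22.0],
]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = balance(X, y, minority_label=1, k=2, seed=42)
print(len(X), "->", len(X_bal), "samples after oversampling")
```

In a pipeline like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that synthetic points do not leak into the data used to evaluate the tuned classifiers.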