Date of Award

3-2025

Document Type

Thesis

Degree Name

Master of Science in Operations Research

Department

Department of Operational Sciences

First Advisor

Lance E. Champagne, PhD

Abstract

Class imbalance poses significant challenges in machine learning classification. This study evaluates the performance of seven models (ANN, k-Means, kNN, LDA, LR, SVM, XGBoost) across four imbalance levels (10%, 5%, 1%, 0.5%) and investigates the effectiveness of three sampling techniques (undersampling, SMOTE, SMOTE-ENN). ANOVA results confirm that model choice is the most critical factor, with XGBoost and SVM demonstrating superior robustness. SMOTE improves recall but reduces precision, while undersampling generally degrades overall performance. Imbalance level, although a significant factor, does not play a critical role in model effectiveness.

AFIT Designator

AFIT-ENS-MS-25-M-166

Comments

An embargo was observed for posting this thesis.

This work is marked Distribution A, Approved for Public Release. PA case number 88ABW-2025-0306.
