Date of Award
Master of Science
Department of Operational Sciences
Raymond R. Hill, PhD
The Air Force must modernize, but the distribution of funds for technology remains as tight as ever. To this end, the Air Force Audit Agency is looking to use machine learning techniques to enhance its capabilities. This research explores Logistic Regression and Random Forest modeling to streamline data collection and cost classification. The final Logistic Regression model identified 4 significant attributes out of the 36 given and was 85% accurate in predicting whether a purchase amount was over or under $10,000. To expand beyond binary classification, a six-category classification Random Forest model was developed. It identified 6 significant attributes and was 34% accurate in predicting which of 6 amount categories a purchase fell into. Due to the class imbalance of the given data, it was necessary to use a class weighting and over-sampling technique to enhance the Random Forest model. The final class-balanced model identified the same 6 significant attributes but was 78% accurate in predicting which of 6 amount categories a purchase fell into. No model was able to predict whether a purchase should be classified as an information technology purchase or not.
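The abstract does not say which class weighting or over-sampling method was used, so the following is only a minimal sketch of the general idea, assuming inverse-frequency weights (the same heuristic as scikit-learn's "balanced" option) and plain random over-sampling with replacement. The function names and the toy six-bin label counts are hypothetical and are not taken from the thesis data.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: six purchase-amount bins with heavy imbalance.
labels = [0] * 50 + [1] * 25 + [2] * 12 + [3] * 7 + [4] * 4 + [5] * 2
rows = [[i] for i in range(len(labels))]  # placeholder feature rows

weights = balanced_class_weights(labels)
bal_rows, bal_labels = random_oversample(rows, labels)
```

In practice, over-sampling is applied to the training split only, so that duplicated rows do not leak into the data used to measure accuracy.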
DTIC Accession Number
Batt, Jacob P., "Training LOGIC and Random Forest Models to Predict IT Spending" (2022). Theses and Dissertations. 5337.