Date of Award


Document Type


Degree Name

Master of Science in Logistics and Supply Chain Management


Department of Operational Sciences

First Advisor

Bradley C. Boehmke, PhD


The United States Air Force can dramatically reduce resource consumption through strategic sourcing initiatives that leverage sensibly bounded pockets of spend via category management. However, category creation is a particularly daunting task due to the sheer magnitude of purchasing data in large organizations. Text mining is one way to identify categories. Specifically, term frequency analysis, term frequency-inverse document frequency analysis, and topic modeling can identify category membership, unique characteristics of categories, and thematic natures of the categories. This thesis developed an empirical, generalizable, reproducible methodology to analyze historical contract text descriptions to uncover the data’s hidden structure. A sample case was transformed into a practical hierarchy, which was internally and externally validated. As a foundational methodology, the impacts of token selection, domain expertise, and unique contracting language were identified as considerations for future research.

AFIT Designator


DTIC Accession Number