Date of Award
3-23-2018
Document Type
Thesis
Degree Name
Master of Science in Operations Research
Department
Department of Operational Sciences
First Advisor
Christopher M. Smith, PhD
Abstract
Unstructured data in the digital universe is growing rapidly and shows no sign of slowing. As the volume of digital data generated and stored on the World Wide Web accelerates, the risk of information overload is much greater now than it has been in the past. As a preemptive analytic measure, organizations across many industries have begun implementing text mining techniques to analyze such large sources of unstructured data. Utilizing various text mining techniques such as n-gram analysis, document and term frequency analysis, correlation analysis, and topic modeling methodologies, this research seeks to develop a tool that allows analysts to maneuver effectively and efficiently through large corpora of potentially unknown textual data. Additionally, this research explores two notional data exploration scenarios through a large corpus of text data, each exhibiting unique navigation methods analysts may elect to take. The research concludes with the validation of inferential results obtained through each corpus’s exploration scenario.
AFIT Designator
AFIT-ENS-MS-18-M-163
DTIC Accession Number
AD1056425
Recommended Citation
Smith, Jeffrey R. Jr., "The Application of Text Mining and Data Visualization Techniques to Textual Corpus Exploration" (2018). Theses and Dissertations. 1863.
https://scholar.afit.edu/etd/1863