Author

Craig N. Berg

Date of Award

12-1997

Document Type

Thesis

Degree Name

Master of Science

Abstract

We have become overwhelmed with electronic information, and the situation seems unlikely to improve. It is increasingly common for people to work with information on a daily basis. We spend more and more time looking for information, and searches take longer because more information is available. This thesis examines how to provide faster access to the information we want to find. Today's requirements are closely tied to searching for information using queries. At the heart of the query process is the removal of search terms that have little or no significance to the search being performed. Words with little searching power, called stopwords, are compiled in a stoplist. Stoplists are usually constructed from commonly occurring words in the English language. This approach is acceptable for systems handling broad categories of information. We build a stoplist for a specific area of interest based on a specific body of linguistic data, or corpus. A stoplist developed from an Air Force corpus is tested to see whether it is more effective than a stoplist created from a general-use corpus.
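As a rough illustration of the idea in the abstract, the sketch below builds a stoplist from a small corpus and strips stopwords from a query. It is not the thesis's method: ranking words by raw frequency, the function names `build_stoplist` and `filter_query`, the `size` cutoff, and the three sample sentences are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_stoplist(documents, size):
    """Return the `size` most frequent words across the documents.

    Assumption: raw term frequency is the selection criterion. A real
    stoplist study may use other measures.
    """
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return {word for word, _ in counts.most_common(size)}


def filter_query(query, stoplist):
    """Drop query terms that appear in the stoplist, keeping term order."""
    return [term for term in tokenize(query) if term not in stoplist]


# Hypothetical domain corpus (invented sentences, not thesis data).
corpus = [
    "The aircraft maintenance squadron inspected the aircraft.",
    "The squadron filed the aircraft inspection report.",
    "The report covered engine maintenance for the squadron.",
]

stoplist = build_stoplist(corpus, size=3)
terms = filter_query("the aircraft engine report", stoplist)
print(sorted(stoplist))  # ['aircraft', 'squadron', 'the']
print(terms)             # ['engine', 'report']
```

In this toy corpus, "aircraft" and "squadron" occur as often as ordinary function words, so a corpus-derived stoplist removes them, while a general English stoplist would keep them as search terms. That difference is the kind of effect a domain-specific stoplist is meant to capture.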

AFIT Designator

AFIT-GIR-LAL-97D-02

DTIC Accession Number

ADA334570

Comments

Thesis presented to the Faculty of the Graduate School of Logistics and Acquisition Management.
