Date of Award
3-14-2014
Document Type
Thesis
Degree Name
Master of Science
Department
Department of Electrical and Computer Engineering
First Advisor
Kennard R. Laviers, PhD
Abstract
Image retrieval remains one of the most heavily researched areas in Computer Vision. Image retrieval methods have been used in autonomous vehicle localization research, in object recognition applications, and commercially in projects such as Google Glass. Current image retrieval methods scale poorly to datasets that can easily reach billions of images. To process these growing datasets, we distribute the computation required for image retrieval across a cluster of machines using Apache Hadoop. While there are many techniques for image retrieval, we focus on systems that use Hierarchical K-Means Trees. Successful image retrieval systems based on Hierarchical K-Means Trees have been built in two ways: by using the tree as a Visual Vocabulary to build an Inverted File Index and implementing a Bag of Words retrieval approach, or by building the tree as a Full Representation of every image in the database and implementing a K-Nearest Neighbor voting scheme for retrieval. The two approaches involve different levels of approximation, and each has strengths and weaknesses that must be weighed against the needs of the application. Both approaches are implemented with MapReduce for the first time and compared in terms of image retrieval precision, index creation run-time, and image retrieval throughput. Experiments with up to 2 million images running on 20 virtual machines are presented.
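The Visual Vocabulary approach described in the abstract can be illustrated with a minimal, single-process sketch. It is not the thesis implementation: the tree centroids, the 2-D descriptors, the image names, and the function names (`quantize`, `map_phase`, `reduce_phase`, `query`) are all hypothetical, the tree is hard-coded rather than trained with k-means, the map and reduce steps are plain Python functions rather than Hadoop jobs, and scoring uses raw vote counts where a real Bag of Words system would typically apply a weighting such as TF-IDF.

```python
from collections import Counter, defaultdict

# Hypothetical 2-level tree, branching factor 2, over toy 2-D descriptors.
# Each node is (centroid, children); a leaf has an empty children list.
TREE = [
    ((0.0, 0.0), [((0.0, 0.0), []), ((0.0, 1.0), [])]),
    ((5.0, 5.0), [((5.0, 5.0), []), ((5.0, 6.0), [])]),
]

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(descriptor, nodes=TREE, path=()):
    """Descend the tree greedily; the path to the leaf is the visual word."""
    idx = min(range(len(nodes)), key=lambda i: sq_dist(descriptor, nodes[i][0]))
    _, children = nodes[idx]
    path = path + (idx,)
    return quantize(descriptor, children, path) if children else path

def map_phase(image_id, descriptors):
    """Mapper: emit one (visual_word, image_id) pair per descriptor."""
    for d in descriptors:
        yield quantize(d), image_id

def reduce_phase(pairs):
    """Reducer: group pairs by visual word into posting lists with counts."""
    index = defaultdict(Counter)
    for word, image_id in pairs:
        index[word][image_id] += 1
    return index

def query(index, descriptors):
    """Rank database images by raw votes from shared visual words."""
    votes = Counter()
    for d in descriptors:
        for image_id, count in index.get(quantize(d), {}).items():
            votes[image_id] += count
    return votes.most_common()

# Toy database: image id -> list of local descriptors (assumed values).
database = {
    "img_a": [(0.1, 0.1), (0.2, 0.9)],
    "img_b": [(5.1, 5.2), (4.9, 6.1)],
}
pairs = [p for image_id, descs in database.items()
         for p in map_phase(image_id, descs)]
index = reduce_phase(pairs)
ranking = query(index, [(0.0, 0.2), (0.1, 1.1)])
print(ranking)
```

In this sketch only the leaf identifiers are stored in the index, which is the approximation the Visual Vocabulary approach makes; the Full Representation approach would instead keep every database descriptor in the tree and vote over nearest neighbors.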
AFIT Designator
AFIT-ENG-14-M-56
DTIC Accession Number
ADA602439
Recommended Citation
Murphy, William E., "Large Scale Hierarchical K-Means Based Image Retrieval With MapReduce" (2014). Theses and Dissertations. 616.
https://scholar.afit.edu/etd/616