Date of Award
3-1994
Document Type
Thesis
Degree Name
Master of Science
Department
Department of Electrical and Computer Engineering
First Advisor
Steven K. Rogers, PhD
Abstract
This thesis compares three clustering techniques applied to cepstral features for speaker identification, using identification rate as the measure. LBG vector quantization, as developed by Linde, Buzo and Gray, provides the benchmark performance for comparison with fuzzy clustering (based on the unsupervised fuzzy partition-optimal number of classes, UFP-ONC, algorithm by Gath and Geva) and an artificial neural network, the multilayer perceptron. Cepstral features from the TIMIT, King and AFIT93 speaker databases are used to build speaker-identification classifiers with each of the clustering algorithms. The experiment evaluates speaker identification performance using 20-dimensional cepstral features extracted directly from the databases. The databases come from different recording environments: TIMIT is studio quality, AFIT93 was recorded in an office environment, and King consists of recorded telephone conversations. The results indicate the merit of the clustering techniques across the range of typical recording environments. This thesis demonstrates the application of fuzzy clustering to speaker identification and shows that the UFP-ONC algorithm can achieve identification rates equal to those of the LBG vector quantization system. LBG vector quantization provides the best overall performance of the three clustering techniques.
AFIT Designator
AFIT-GE-ENG-94M-05
DTIC Accession Number
ADA278676
Recommended Citation
Prescott, Douglas N., "Clustering Techniques in Speaker Recognition" (1994). Theses and Dissertations. 6714.
https://scholar.afit.edu/etd/6714
Comments
The author's Vita page is omitted.