Date of Award

3-1994

Document Type

Thesis

Degree Name

Master of Science

Department

Department of Electrical and Computer Engineering

First Advisor

Steven K. Rogers, PhD

Abstract

This thesis compares, on the basis of identification rate, three clustering techniques applied to cepstral features for speaker identification. LBG vector quantization, as developed by Linde, Buzo, and Gray, provides the benchmark performance for comparison with fuzzy clustering (based on the unsupervised fuzzy partition-optimal number of classes, UFP-ONC, algorithm by Gath and Geva) and an artificial neural network, the multilayer perceptron. Cepstral features from the TIMIT, King, and AFIT93 corpus speaker databases are used to produce speaker-identification classifiers with each of the clustering algorithms. The experiment evaluates speaker identification performance using 20-dimensional cepstral features extracted directly from the databases. The speaker databases come from different recording environments: TIMIT is studio quality, AFIT93 was recorded in an office environment, and King consists of recorded telephone conversations. The results provide an indication of merit for the clustering techniques across this range of typical recording environments. This thesis demonstrates the application of fuzzy clustering to speaker identification. It is shown that the UFP-ONC algorithm can achieve identification rates equal to those of the LBG vector quantization system. LBG vector quantization provides the best overall performance of the three clustering techniques.

AFIT Designator

AFIT-GE-ENG-94M-05

DTIC Accession Number

ADA278676

Comments

The author's Vita page is omitted.
