Date of Award
3-1993
Document Type
Thesis
Degree Name
Master of Science
Department
Department of Electrical and Computer Engineering
First Advisor
Steven K. Rogers, PhD
Second Advisor
Dennis Ruck, PhD
Abstract
The purpose of this study was to test the influence of phase on the quality of speech reproduced by a speaker-dependent compression system. The tests consisted of compressing frequency-domain speech vectors using the Karhunen-Loeve Transform, with and without phase, then making subjective judgements as to the reproduced quality. Error metrics were then tested for their suitability as predictors of reproduced quality. The compression software transformed each speech vector into a vector of complex Fourier coefficients (only half of the coefficients are needed because the transform is Hermitian). Phase was preserved by using the real frequency components to form one vector and the corresponding imaginary components to form a second vector of real numbers; the two vectors were then compressed separately. The expanded vectors were recombined and the speech was reconstructed by inverse Fourier transformation. Compression ratios of 8:1 could be achieved without any perceivable difference between the original speech and reconstructed speech by minimizing the MSE of each vector of the pair. The 8:1 compression ratio corresponded to a covariance matrix condition number of 200. Recommendations are made for further study into voice characterization and an optimal transform for speech.
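The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis software. The function names (`klt_basis`, `compress`, `expand`, `roundtrip`), the frame layout (one speech vector per row), and the parameter `k` (number of retained eigenvectors) are all assumptions made for illustration, and the basis is simply taken from the eigenvectors of the sample covariance matrix, which may differ from how the thesis trained its transform.

```python
import numpy as np

def klt_basis(vectors, k):
    """Return the mean and the k leading eigenvectors of the sample covariance.

    `vectors` holds one real-valued vector per row (assumed layout).
    """
    mean = vectors.mean(axis=0)
    cov = np.cov(vectors, rowvar=False)          # columns are variables
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    return mean, eigvecs[:, order]

def compress(vectors, mean, basis):
    """Project mean-removed vectors onto the retained eigenvectors."""
    return (vectors - mean) @ basis

def expand(coeffs, mean, basis):
    """Map KLT coefficients back to the original vector space."""
    return coeffs @ basis.T + mean

def roundtrip(frames, k):
    """Compress and reconstruct speech frames, preserving phase.

    The real and imaginary parts of the half spectrum are compressed
    separately, then recombined before the inverse transform.
    """
    # Real input gives a Hermitian spectrum, so rfft keeps only half of it.
    spectra = np.fft.rfft(frames, axis=1)
    re, im = spectra.real, spectra.imag

    re_mean, re_basis = klt_basis(re, k)
    im_mean, im_basis = klt_basis(im, k)

    re_hat = expand(compress(re, re_mean, re_basis), re_mean, re_basis)
    im_hat = expand(compress(im, im_mean, im_basis), im_mean, im_basis)

    # Recombine into complex coefficients and return to the time domain.
    return np.fft.irfft(re_hat + 1j * im_hat, n=frames.shape[1], axis=1)
```

In this sketch, retaining every eigenvector reproduces the input up to floating-point error, while a smaller `k` trades reconstruction error for fewer stored coefficients per vector. The sketch does not attempt to reproduce the 8:1 ratio or the condition number reported above.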
AFIT Designator
AFIT-GEO-ENG-93M-02
DTIC Accession Number
ADA262613
Recommended Citation
Dryley, Donald W., "Frequency Domain Speech Compression Using the Karhunen-Loeve Transform" (1993). Theses and Dissertations. 7177.
https://scholar.afit.edu/etd/7177