
USAF Speech Subterfuge (fwd)



----- Forwarded Message:

Date: 25 Oct 1996 10:48:12
To: Recipients of conference <[email protected]>
From: [email protected] (James Salsman)
Subject: speech subterfuge


I have some indirect evidence that patent-related 
activities of the U.S. Air Force may have intentionally 
obscured the mathematical definition of "cepstrum" from 
the mid-1970s, with literally tremendous implications 
for the computer speech processing industry (perhaps 
billions of dollars in real economic damage by now), and 
also harming current war reduction projects such as 
automatic language translation systems.  The correct 
definition was published in 1963 by Bogert, Healy, and 
Tukey (Tukey also coined the term "bit").  For those who 
care, that cepstrum is:

  FFT( ln( | FFT( sample .* window ) | ) )

And for speech processing, the definition is:

  FFT( ln( melScale( | FFT( sample .* window ) | ) ) )

(The "melody scale" attenuates the frequencies that are 
attenuated by the human ear.  N.B.: Both the resulting 
cepstral magnitude and phase are significant, i.e., the 
result is a vector of complex numbers.  Furthermore, only 
the first few elements of the cepstral vector are 
necessary for the formant envelope, while the excitation 
(i.e., the harmonics) is encoded as a peak towards the 
end of the vector.)
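
The two formulas above can be sketched in Python with NumPy. This is 
only an illustration under stated assumptions, not a reference 
implementation: the function names, the Hamming window default, the 
small eps added before the logarithm, the choice of 20 filters, and 
the triangular filter bank built from the common 2595*log10(1 + f/700) 
mel formula are all my own choices and are not specified in the 
message. The mel variant also works on the one-sided spectrum (rfft), 
which is a further assumption.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel formula (an assumption; the message does not give one).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Hypothetical helper: triangular filters spaced evenly on the mel axis.
    edges_hz = mel_to_hz(
        np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bin_hz = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, bin_hz.size))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def cepstrum(sample, window=None, eps=1e-12):
    # FFT( ln( | FFT( sample .* window ) | ) )
    sample = np.asarray(sample, dtype=float)
    if window is None:
        window = np.hamming(sample.size)  # assumed default window
    magnitude = np.abs(np.fft.fft(sample * window))
    return np.fft.fft(np.log(magnitude + eps))  # eps avoids log(0)

def mel_cepstrum(sample, sample_rate, n_filters=20, window=None, eps=1e-12):
    # FFT( ln( melScale( | FFT( sample .* window ) | ) ) )
    sample = np.asarray(sample, dtype=float)
    if window is None:
        window = np.hamming(sample.size)
    magnitude = np.abs(np.fft.rfft(sample * window))  # one-sided spectrum
    bank = mel_filterbank(n_filters, sample.size, sample_rate)
    return np.fft.fft(np.log(bank @ magnitude + eps))

# Usage on a synthetic frame (random noise standing in for speech).
frame = np.random.default_rng(0).standard_normal(512)
plain = cepstrum(frame)             # complex vector, one value per sample
warped = mel_cepstrum(frame, 16000) # complex vector, one value per filter
```

Both functions return complex vectors, so magnitude and phase can be 
read off with np.abs and np.angle.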

The error has been to use the inverse Fourier transform 
instead of the second (outside) FFT, which seems to be
why researchers have been experimenting with the 
(slightly) better Discrete Cosine Transform, from video 
signal processing.
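
For readers who want to compare the two variants numerically, here 
is a small NumPy sketch. The random real vector is a hypothetical 
stand-in for ln(|FFT(sample .* window)|), and the variable names are 
my own. Under NumPy's conventions (ifft carries the 1/N factor), the 
inverse transform of a real vector is the complex conjugate of the 
forward transform divided by the vector length; other packages may 
normalize differently.

```python
import numpy as np

# Hypothetical stand-in for a real-valued log-magnitude spectrum.
log_spectrum = np.random.default_rng(0).standard_normal(256)
n = log_spectrum.size

forward = np.fft.fft(log_spectrum)   # the "second (outside) FFT"
inverse = np.fft.ifft(log_spectrum)  # the inverse-transform variant

# For real input, NumPy's ifft(x) equals conj(fft(x)) / N.
related_by_conjugate = np.allclose(inverse, np.conj(forward) / n)

# Magnitudes differ by the factor N; phases have opposite sign.
magnitudes_match = np.allclose(np.abs(inverse) * n, np.abs(forward))
```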

Sincerely,
:James Salsman

----- End Forward