Description: We propose methods to track natural variations
in the characteristics of the vocal-tract system from speech
signals. We are especially interested in the cases where these
characteristics vary over time, as happens in dynamic sounds
such as consonant-vowel transitions. We show that the selection
of appropriate analysis segments is crucial in these methods, and
we propose a selection based on estimated instants of significant
excitation. These instants are obtained by a method based on
the average group-delay property of minimum-phase signals.
In voiced speech, they correspond to the instants of glottal
closure. The vocal-tract system is characterized by its formant
parameters, which are extracted from the analysis segments.
Because the segments always occupy the same relative position
within each pitch period, the formants extracted from voiced speech are
consistent across successive pitch periods. We demonstrate the
results of the analysis for several difficult cases of speech signals.
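
The idea of anchoring analysis segments to the excitation instants can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the excitation instants are treated as already known (the group-delay estimation step is not shown), the speech is replaced by a synthetic two-resonance all-pole signal, and the formants are obtained with covariance-method linear prediction, which the description above does not name. The helper names (resonator_poly, all_pole_filter, covariance_lpc, formants_from_lpc) are hypothetical.

```python
import numpy as np


def resonator_poly(freqs_hz, bws_hz, fs):
    """Denominator A(z) of an all-pole filter with the given resonances."""
    a = np.array([1.0])
    for f, bw in zip(freqs_hz, bws_hz):
        r = np.exp(-np.pi * bw / fs)          # pole radius from bandwidth
        theta = 2.0 * np.pi * f / fs          # pole angle from frequency
        a = np.convolve(a, [1.0, -2.0 * r * np.cos(theta), r * r])
    return a


def all_pole_filter(excitation, a):
    """Pass the excitation through 1/A(z) using a direct recursion."""
    p = len(a) - 1
    y = np.zeros(len(excitation) + p)         # p leading zeros: initial state
    for n in range(len(excitation)):
        past = y[n:n + p][::-1]               # y[n-1], ..., y[n-p]
        y[n + p] = excitation[n] - np.dot(a[1:], past)
    return y[p:]


def covariance_lpc(signal, start, length, order):
    """Least-squares linear predictor fitted on signal[start:start+length].

    Requires start >= order so that all past samples exist.
    """
    rows = np.arange(start, start + length)
    past = np.column_stack([signal[rows - k] for k in range(1, order + 1)])
    coeffs, *_ = np.linalg.lstsq(past, -signal[rows], rcond=None)
    return np.concatenate(([1.0], coeffs))


def formants_from_lpc(a, fs):
    """Resonance frequencies (Hz) from the upper-half-plane roots of A(z)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))


# Synthetic stand-in for voiced speech (all values are assumptions).
fs = 8000                                     # sampling rate in Hz
true_formants = [700.0, 1200.0]               # resonance frequencies in Hz
a_true = resonator_poly(true_formants, [80.0, 100.0], fs)

epochs = np.arange(100, 800, 80)              # assumed-known excitation instants
excitation = np.zeros(900)
excitation[epochs] = 1.0                      # one impulse per pitch period
speech = all_pole_filter(excitation, a_true)

# Analyse a short segment at the same offset after every excitation instant.
segment_length = 40                           # shorter than the pitch period
order = 2 * len(true_formants)
tracks = np.array([
    formants_from_lpc(
        covariance_lpc(speech, e + 1, segment_length, order), fs)
    for e in epochs
])

for e, row in zip(epochs, tracks):
    print(e, np.round(row, 2))
```

Because every segment starts one sample after an excitation instant and ends before the next one, no excitation falls inside the fitted region, so on this idealized signal each row of tracks should come out close to the synthetic resonances. Real speech would additionally need the instants to be estimated and would not be exactly all-pole.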
File list:
extraction.pdf