Soundhack phase vocoder

1/31/2024

A short-time Fourier transform (STFT) signal processor is an analysis/synthesis method that begins by windowing a signal into short segments. The FFT is applied to each segment separately, and the resulting spectral snapshot can be manipulated in a variety of ways. After the desired spectral changes, resynthesis is handled by the inverse FFT, which returns each segment to the time domain. The modified segments are then summed. For the special case where no spectral manipulations are made (as shown), the output of the STFT is identical to the input.

The frequency resolution of the FFT is set by the window size:

(1) resolution in Hz = (sampling rate)/(window size)

In typical use, the support of the window (the region over which it is nonzero) is a few hundred to a few thousand samples. Using a medium window of size 2048 and a sampling rate of 44.1 kHz, the resolution in frequency is about 21.5 Hz. This may be adequate to specify high frequencies (where 21.5 Hz is a small percentage of the frequency in question), but it is far too coarse at the low end. A low note on the piano may have a fundamental near 80 Hz, so the resolution of this FFT is only good to within about 25%! For comparison, the distance between consecutive notes on the piano is a constant 6%.

Is there a way to improve the frequency resolution of the STFT without overly harming the time resolution? Fortunately, the answer is "yes." The phase vocoder (PV) makes improved frequency estimates by using phase information that the STFT ignores.

To see how the analysis portion of the PV can use phase information to make improved frequency estimates, suppose there is a sinusoid of unknown frequency f but with known phases: at time t1 the sinusoid has phase theta1 and at time t2 it has phase theta2. The sinusoid may have a frequency that moves it directly from theta1 to theta2 in time t2-t1. Or it may begin at theta1, move completely around the circle, and end at theta2 after one full revolution. Or it may revolve twice, or n times. In other words, the frequency multiplied by the change in time must equal the change in angle, that is, 2 pi f (t2-t1) = theta2 - theta1 plus some integer multiple of 2 pi. The frequency must therefore be one of the values

(2) fn = (theta2 - theta1 + 2 pi n)/(2 pi (t2-t1))

for some integer n. Without more information, it is not possible to uniquely determine f, though it is constrained to one of the above values.

The phase vocoder exploits equation (2) by locating a common peak in the magnitude spectrum of two different frames. It then chooses the fn that is closest to the frequency of that peak. This is shown diagrammatically in the accompanying figure, where the signal is assumed to be a single sinusoid that spans the time interval over which the calculations are made. The output of the windowing is a collection of short sinusoidal bursts. The FFT is applied to each burst, resulting in magnitude and phase spectra. For the case of a pure sinusoidal input, the magnitude spectra of successive frames are the same (as shown). But the phase spectra differ, and these provide the needed values of theta1 (the phase corresponding to the peak of the first magnitude spectrum) and theta2 (the phase corresponding to the peak of the second magnitude spectrum).
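
The frequency refinement of equation (2) can be tried out numerically. The following is only a minimal sketch under stated assumptions, not SoundHack's actual implementation: the 80 Hz test tone, the Hann window, the hop of 512 samples between frames, and the helper names fft, hann and refine_frequency are all choices made for illustration. It uses the sampling rate (44.1 kHz) and window size (2048) from the text, finds the common peak bin in two frames, and picks the fn nearest to that bin's frequency.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * math.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

def hann(N):
    """Periodic Hann window (an assumed choice of window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def refine_frequency(theta1, theta2, dt, f_peak):
    """Equation (2): return the fn closest to the peak frequency f_peak."""
    dtheta = theta2 - theta1
    # fn = dtheta/(2 pi dt) + n/dt, so the best integer n is the rounded value below.
    n = round(f_peak * dt - dtheta / (2 * math.pi))
    return (dtheta + 2 * math.pi * n) / (2 * math.pi * dt)

sr = 44100        # sampling rate from the text
N = 2048          # "medium" window size from the text
hop = 512         # assumed spacing between the two frames, in samples
f_true = 80.0     # assumed test tone: a low piano fundamental

# A single sinusoid that spans both analysis frames.
signal = [math.cos(2 * math.pi * f_true * t / sr + 0.3) for t in range(N + hop)]
w = hann(N)
spec1 = fft([s * wn for s, wn in zip(signal[:N], w)])
spec2 = fft([s * wn for s, wn in zip(signal[hop:hop + N], w)])

# Equation (1): spacing of the FFT bins in Hz.
resolution = sr / N

# Locate the peak of the first magnitude spectrum; for a steady sinusoid
# the second frame peaks in the same bin, so its phase is read there too.
k = max(range(1, N // 2), key=lambda i: abs(spec1[i]))
theta1 = cmath.phase(spec1[k])
theta2 = cmath.phase(spec2[k])

f_bin = k * resolution                    # coarse estimate from the bin alone
f_est = refine_frequency(theta1, theta2, hop / sr, f_bin)

print(f"resolution  : {resolution:.2f} Hz")
print(f"bin estimate: {f_bin:.2f} Hz")
print(f"PV estimate : {f_est:.2f} Hz")
```

With these assumed values the peak falls in bin 4, so the bin-only estimate is off by several Hz, while the phase-based estimate lands much closer to 80 Hz. The choice of n is unambiguous only while the true frequency lies within 1/(2 (t2-t1)) of the peak frequency, which is one reason the two frames are taken close together.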
