The Frequency scale parameter controls the format of the vertical axis of the spectrogram. By default, the spectrogram is displayed with "Hertz" scaling, in which frequency runs linearly along the vertical axis. The other option is "Interval", which gives equal musical intervals the same height at any position on the vertical scale.
Because the sound was created from the example ideal spectrogram on the "Interval" frequency scale, the spectrogram on the right looks more like the ideal spectrogram in this case:
"Hertz" scaling is useful when you want to view harmonics as equally spaced lines. "Interval" scaling is useful when you want to view intervals as equally spaced lines.
The difference can perhaps be seen more clearly with a harmonically pitched note as shown in the following figure:
The harmonics are labeled from 1, the fundamental, up to 6, the 6th harmonic. If the fundamental is 100 Hz, for example, then the 6th harmonic is at 600 Hz. In the "Hertz" vertical scaling, the harmonics are evenly spaced at 100 Hz, 200 Hz, 300 Hz, 400 Hz, 500 Hz, and 600 Hz. In the "Interval" vertical scaling, all interval classes retain their size regardless of their vertical position. For example, two octaves are marked in the figure above. In the "Interval" scaling, the two octaves are the same size. In the "Hertz" scaling, the lower octave is 1/2 the size of the octave indicated above it.
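The spacing described above can be checked numerically. The following is a minimal sketch, not the plugin's own code. It assumes that the "Interval" scale corresponds to a logarithmic (base 2) frequency axis measured in octaves, and that the two marked octaves run from harmonic 1 to 2 and from harmonic 2 to 4. The function names are hypothetical.

```python
import math


def hertz_position(freq_hz):
    """Position on a linear ("Hertz") axis: simply the frequency in Hz."""
    return freq_hz


def interval_position(freq_hz, reference_hz):
    """Position on a logarithmic ("Interval") axis, in octaves above reference_hz.

    Assumption: the Interval scale is modeled here as log2(f / reference).
    """
    return math.log2(freq_hz / reference_hz)


fundamental_hz = 100.0
harmonics = [n * fundamental_hz for n in range(1, 7)]  # 100 Hz ... 600 Hz

for n, f in enumerate(harmonics, start=1):
    octaves = interval_position(f, fundamental_hz)
    print(f"harmonic {n}: {hertz_position(f):6.1f} Hz, {octaves:.3f} octaves")

# Size of the two octaves (harmonics 1-2 and 2-4) on each scale.
lower_hz = harmonics[1] - harmonics[0]
upper_hz = harmonics[3] - harmonics[1]
lower_oct = (interval_position(harmonics[1], fundamental_hz)
             - interval_position(harmonics[0], fundamental_hz))
upper_oct = (interval_position(harmonics[3], fundamental_hz)
             - interval_position(harmonics[1], fundamental_hz))

print(f"Hertz scale:    lower octave {lower_hz} Hz, upper octave {upper_hz} Hz")
print(f"Interval scale: lower octave {lower_oct} oct, upper octave {upper_oct} oct")
```

On the linear axis the lower octave spans half as many Hz as the upper one, while on the logarithmic axis both span exactly one octave.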
The Step size parameter specifies how many samples to jump between analysis frames in the audio data. A smaller number will give a smoother look to the spectrogram because the spectrogram is updated more often over time. By default, the step size is set to 512 samples. At a 44.1 kHz sampling rate (as used with Compact Disc recordings), that is a time of 512 / 44100 = 11.6 milliseconds. Here is a picture of the center of the spectrogram using a 512 sample step size:
Decreasing the step size from 512 to 256 will make the spectrogram display update twice as often:
Likewise, increasing the step size to 1024 samples (23.2 ms) will make the quantization on the horizontal time axis coarser:
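The step-size arithmetic used in this section can be reproduced with a short calculation. This is only an illustrative sketch; the function names are hypothetical, and the 44.1 kHz default is taken from the example in the text.

```python
def step_duration_ms(step_samples, sample_rate_hz=44100):
    """Time between successive analysis frames, in milliseconds."""
    return 1000.0 * step_samples / sample_rate_hz


def frames_per_second(step_samples, sample_rate_hz=44100):
    """Number of spectrogram columns produced per second of audio."""
    return sample_rate_hz / step_samples


for step in (256, 512, 1024):
    print(f"step {step:4d} samples: "
          f"{step_duration_ms(step):5.1f} ms per frame, "
          f"{frames_per_second(step):6.1f} frames per second")
```

Halving the step size doubles the number of frames per second, which is why the 256-sample display updates twice as often as the 512-sample one.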
The Compress range option applies a compression function to the amplitudes (in decibels) in the spectrogram. When this option is selected, the display strikes a good visual compromise between quiet and loud sounds.
The compressed range works best for music that contains a variety of dynamics throughout. However, if you compare the pictures, the compression setting used in this case saturates the peaks near the bottom of the spectrogram, causing a loss of visual frequency discrimination.
The Window size parameter controls the frequency resolution of the spectrogram. The larger the window size, the more accurately the frequencies can be resolved. However, a larger window also covers a longer stretch of time, so more spectral smearing occurs wherever the sound is changing, which decreases the visual sharpness of the display. The following figure demonstrates the trade-off between frequency accuracy and time resolution:
However, as the window size increases, the harmonic tracks start to smear when they change pitch, and become very sharp when they are at a constant pitch.
If you are analyzing vocal music, violin music, or music for other instruments that use vibrato, be careful to use a smaller window size so that the frequencies in the vibrato do not smear. For piano and other fixed-pitch instruments, a larger window is more suitable since the smearing will not be as noticeable.
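The window-size trade-off can be put in rough numbers. The sketch below assumes a standard FFT-based analysis, in which a window of N samples gives a nominal frequency bin spacing of sample rate / N and spans N / sample rate seconds of audio. The window sizes listed and the function names are hypothetical examples, and the effective resolution also depends on the window function, which is not modeled here.

```python
def bin_spacing_hz(window_samples, sample_rate_hz=44100):
    """Nominal spacing between adjacent frequency bins, in Hz."""
    return sample_rate_hz / window_samples


def window_duration_ms(window_samples, sample_rate_hz=44100):
    """Length of audio covered by one analysis window, in milliseconds."""
    return 1000.0 * window_samples / sample_rate_hz


# Hypothetical window sizes, chosen only to show the trend.
for size in (512, 2048, 8192):
    print(f"window {size:5d} samples: "
          f"bins every {bin_spacing_hz(size):6.2f} Hz, "
          f"window spans {window_duration_ms(size):6.1f} ms")
```

Each fourfold increase in window size makes the frequency bins four times finer but also averages over four times as much audio, which is why vibrato smears at large window sizes while steady pitches look sharper.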
The source code for the plugin was last modified on 20 Jul 2006.