Sound Analyzers
Werner Van Belle


Introduction

BpmDj consists of three distinct parts: the player, the selector and the music analyzer. This section explains the details of the different analyzers. The contents of this section:
  1. Using the BPM Counter
  2. The Beat graph
  3. The spectrum
  4. The echo/delay analysis
  5. Rhythm and composition analysis

1. Using the BPM counter

bpmcount

The BPM counter of bpmplay is accessible via the 'BPM-counter' button of the main-pane. The BPM-counter window supports three tools.

Tapping the beat

If you want a rough estimate of the tempo of a song, press the Tap button on the beat. Make sure that the song is playing at its normal tempo, otherwise your measurement will be wrong. If you want to get into the rhythm before tapping, keep pressing the reset key the whole time and then switch to the Tap button. If you want to tap less often, set the skip box to a value you like. For instance, if the skip box is set to 4, the tapping counter will assume that four beats have passed with every tap. While you tap the beat, the tempo information changes immediately and gives the best estimate of the tempo so far. However, if you want an exact BPM count, you should continue with the automatic BPM counter.
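The arithmetic behind tapping with a skip value can be sketched as follows. This is a minimal illustration, not BpmDj's actual code; the function name `tap_tempo` and its parameters are hypothetical.

```python
def tap_tempo(tap_times, skip=1):
    """Estimate BPM from tap timestamps given in seconds.

    Each tap is assumed to stand for `skip` beats, as with the skip box.
    """
    if len(tap_times) < 2:
        raise ValueError("need at least two taps")
    elapsed = tap_times[-1] - tap_times[0]
    if elapsed <= 0:
        raise ValueError("tap times must be increasing")
    # Number of beats that passed between the first and the last tap.
    beats = (len(tap_times) - 1) * skip
    return 60.0 * beats / elapsed


# Four taps, one every 2 seconds, with the skip box set to 4.
bpm = tap_tempo([0.0, 2.0, 4.0, 6.0], skip=4)
```

Because the estimate uses only the first and last tap, every extra tap lengthens the measured interval and reduces the influence of a single badly timed tap.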

The automatic BPM counter

To use the automatic BPM counter, first specify the upper and lower bounds of the BPM. You can easily narrow these down by tapping the beat as described in the previous section. The automatic counter is started with the Start button. Beware: this might take a while! If you need the tempo of a lot of songs, you should use the file selector BpmDj and measure the tempo of the songs in batch overnight.
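To show why the bounds matter, here is a rough sketch of a bounded tempo search. It scores every candidate beat period between the two bounds by plain autocorrelation of an energy envelope. This is a generic textbook technique chosen for illustration; this section does not describe the algorithm BpmDj itself uses, and `estimate_bpm` and its parameters are hypothetical names.

```python
def estimate_bpm(envelope, rate, bpm_low, bpm_high):
    """Pick the tempo in [bpm_low, bpm_high] whose beat period best
    matches the envelope, scored by plain autocorrelation.

    envelope: list of non-negative energy values
    rate:     envelope samples per second
    """
    # A higher tempo means a shorter beat period (a smaller lag).
    lag_min = max(1, int(round(60.0 * rate / bpm_high)))
    lag_max = int(round(60.0 * rate / bpm_low))
    best_lag, best_score = None, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(envelope) - lag
        if n <= 0:
            break
        score = sum(envelope[i] * envelope[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        raise ValueError("envelope too short for the given bounds")
    return 60.0 * rate / best_lag


# Synthetic envelope: one energy spike every 0.5 s, sampled at 100 Hz.
rate = 100
envelope = [1.0 if i % 50 == 0 else 0.0 for i in range(2000)]
bpm = estimate_bpm(envelope, rate, bpm_low=100, bpm_high=140)
```

Narrow bounds shorten the search and keep it away from half-tempo and double-tempo periods, which score well in any periodicity measure.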

To get an idea of how accurately the BPM should be analyzed, have a look here.
Or, if you are only interested in a BPM counting tool, have a look here.

2. Beat Graphs

BpmDj contains a feature called the beatgraph. It can be used to quickly align the tempo to perfection. Two versions of the beatgraph are provided: a black/white version and a wavelet-based version.

The pattern analyzer can be used to see how the tempo changes over time and how accurate the measured tempo is. To analyze a song, the pattern analyzer lays out the successive measures horizontally, while the content of each measure is drawn vertically. If the period (BPM) is correct, the song should show distinct visual lines. However, if the period is wrong, or the song contains different tempos, or the drummer of the group doesn't actually care about being correct, then the visualization will look different.
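The layout described above amounts to folding the signal into columns of one measure each. The sketch below uses a plain list of energy values and hypothetical helper names; it is not taken from the BpmDj sources.

```python
def fold_into_measures(envelope, period):
    """Cut the envelope into columns of `period` samples each.

    Column k holds measure k; row r is position r within the measure.
    A trailing partial measure is dropped.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    count = len(envelope) // period
    return [envelope[k * period:(k + 1) * period] for k in range(count)]


def peak_rows(columns):
    """Row index of the strongest sample in each column."""
    return [max(range(len(col)), key=col.__getitem__) for col in columns]


# A spike every 8 samples, starting at sample 2.
envelope = [1.0 if i % 8 == 2 else 0.0 for i in range(64)]

right = peak_rows(fold_into_measures(envelope, 8))  # correct period
wrong = peak_rows(fold_into_measures(envelope, 7))  # period too short
```

With the correct period the spike stays on the same row in every column, which is what shows up as a horizontal line. With the wrong period the peak row shifts from column to column, so the line slants.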

The above picture shows the pattern visualization of 'Land of Freedom' (Transwave). As can be seen, the white 'strokes' run horizontally, so the period of this song is correct. If it were not, the white strokes would slant down or up. In that case you can manually modify the tempo by using sliders A and B. Slider A is a fine tempo modification; slider B is a coarse one. Once the music is 'horizontalized', you can press the 'set Tempo' button to make this tempo permanent.

In the two pictures below, we see on the left the song Anniversary Waltz (Status Quo). As can be seen, this song drifts slightly; the drummer clearly didn't think a steady rhythm would be good in this piece of music. The right picture is the pattern visualization of the song XFile (Chakra & Edimis). This song comes from an 'already mixed' CD and, as you can see, the DJ who created the mix had to modify the tempo over time. In both pictures, the vertical red lines show where the song was playing at the moment of the snapshot.

BpmDj also has a more colorful mode in which wavelets are used to visualize the beatgraph. Although much slower, it is a very good way to show the structure in the music. More information on the beatgraphs is presented below.

             

This beat graph is normally drawn in gray. As such it mainly shows energy spikes and nothing more. To improve on this, BpmDj has been extended to color the beatgraph based on the frequency components present within the soundtrack. The frequency analysis is based on a very simple wavelet analysis (Haar).
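For readers unfamiliar with the Haar wavelet, the sketch below shows the idea in its simplest form: each level splits the signal into pairwise averages (the lower frequencies) and pairwise differences (the higher frequencies). It uses the unnormalized averaging variant and assumes a signal length that is a power of two. The function names are hypothetical, and the way BpmDj turns such coefficients into colors is not shown here.

```python
def haar_step(signal):
    """One Haar level: pairwise averages and pairwise half-differences."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2.0 for a, b in pairs]   # low-frequency part
    detail = [(a - b) / 2.0 for a, b in pairs]   # high-frequency part
    return approx, detail


def haar_band_energies(signal):
    """Energy of the detail coefficients per level, highest frequencies first.

    The signal length is assumed to be a power of two.
    """
    energies = []
    current = list(signal)
    while len(current) > 1:
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    return energies


approx, detail = haar_step([4, 2, 6, 8])
energies = haar_band_energies([4, 2, 6, 8])
```

Each entry of `energies` says how much of the signal lives at one frequency scale, which is the kind of information a frequency-based coloring needs.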

The Wavelet Beatgraph

   

Above is the colored beatgraph. On the left is the beatgraph itself; on the right is the same picture with annotations. The red colors are low frequencies, the blue colors are high frequencies. Everything in between shifts from red (low) to blue (high). A beatgraph should be read from TOP to BOTTOM and then from LEFT to RIGHT. In other words, the song starts playing at the top left, goes down, and then continues in the next column until it finishes at the bottom right. See the turquoise arrows.

In this beatgraph (which visualizes the song Summer98) we see the 4 bass drums of every measure (the red boxes). In every beat we see how it starts with multiple frequencies at once and then fades out into only the bass frequencies. This is the red tail. The purple boxes show the position of hi-hats and claps within the music.

The two green strokes show a number of measures in which relatively many high frequencies are present. This can be either because there is some strong lead or because some distortion is present. The yellow stroke shows where there is no bass drum present and the orange stroke shows where the bass drum is doubled (every eighth note instead of every quarter note).

Primary Frequency Bands

A new addition to BpmDj (March 2009) is the beatgraph colored by primary frequency band. This makes it possible to detect areas in the music that sound the same. The coloring differs from the wavelet beatgraphs in that we no longer assign a color to each frequency; instead we find the main frequency and afterward decide which color that particular band should receive.



3. Sound Color = Spectrum Analysis

To obtain the color of a song, select a suitable cue point, go to the BPM-counter window and select 'Fetch Spectrum'. Beware: this takes a while! If you need to do this in batch, please use the file selector BpmDj.

The basis of good sound color detection lies in spectrum analysis. The spectrum of a song describes how strongly each frequency is present within the song. However, weighing different frequencies and making something useful from a spectrum analysis is not as straightforward as one might expect.

The Bark Frequency Scale

The first thing to note is that the human ear detects some frequencies better than others. We hear especially well in the range of the human voice. However, in the higher frequency ranges (above 11500 Hz), humans are unable to distinguish different frequencies. Therefore, to correctly describe the sound color of a song we need to take into account how well the human ear perceives these frequencies. Luckily, such a scale exists and is called the Bark scale. Its bands, in Hz, are:

  • 0-100
  • 100-200
  • 200-300
  • 300-400
  • 400-510
  • 510-630
  • 630-770
  • 770-920
  • 920-1080
  • 1080-1270
  • 1270-1480
  • 1480-1720
  • 1720-2000
  • 2000-2380
  • 2380-2700
  • 2700-3150
  • 3150-3700
  • 3700-4400
  • 4400-5300
  • 5300-6400
  • 6400-7700
  • 7700-9500
  • 9500-12000
  • 12000-15500

The spectrum of a song is measured over 10 seconds at the last used cue position. The strength of the different frequencies is measured and exported as 24 bands of the Bark scale. These 24 bands can later be used in the selector to measure the 'color distance' using an L^2 norm, or to do a PCA in which the 24 bands are reduced to the 3 principal components.
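As a small illustration of the two ideas above, the sketch below maps a frequency to its band using the edges exactly as listed in this section, and compares two 24-band spectra with a Euclidean (L^2) distance. The names `BARK_EDGES`, `bark_band` and `color_distance` are hypothetical, and any scaling or weighting the selector applies before comparing is not covered.

```python
import bisect
import math

# Band edges in Hz, copied from the list in this section (24 bands, 25 edges).
BARK_EDGES = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
              1720, 2000, 2380, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
              9500, 12000, 15500]


def bark_band(frequency_hz):
    """Index (0..23) of the band that contains the frequency."""
    if not 0 <= frequency_hz < BARK_EDGES[-1]:
        raise ValueError("frequency outside the listed bands")
    return bisect.bisect_right(BARK_EDGES, frequency_hz) - 1


def color_distance(bands_a, bands_b):
    """Euclidean (L^2) distance between two band spectra."""
    if len(bands_a) != len(bands_b):
        raise ValueError("spectra must have the same number of bands")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(bands_a, bands_b)))


band_of_a440 = bark_band(440)            # falls in the 400-510 band
silent = [0.0] * 24
other = [3.0, 4.0] + [0.0] * 22
distance = color_distance(silent, other)
```

A small distance means the two songs distribute their energy over the bands in a similar way, which is what the selector treats as a similar sound color.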

4. The Echo/Delay Analysis

The 'echo' characteristic measures the distribution of the energy amplitude throughout the song. It is measured using a renormalization of an SFFT to the Bark psychoacoustic scale. For every band we take the real part of the Fourier transform. This real value determines which bin (in dB) has its value increased. The binned distribution is thereafter autocorrelated and differentiated to highlight the relations between the different energy levels. This step also removes sound color information. Skipping the imaginary part of the Fourier transform implements a partial DCT, which has the nice property that it compacts energy and leads to more articulated information.

In the figure below we see the 24 bark bands (0 being low, 24 being high frequencies). The first bark band shows how the bass tones are present at all kinds of strengths. This is normal since this band also contains the residual energy which could not be captured in any other bin. The first bark band shows that the bass drum is a very articulated piece of information with little echo. The further we go up in frequency, the more articulate the pattern becomes. From bark 12 we see that there is an accent in the hi-hats.




More information on the sound color and echo characteristic
can be found at http://werner.yellowcouch.org/Papers/spectrum05/index.html


5. Rhythm & Composition Analysis

BpmDj has a rhythm analyzer which works on the obtained tempo and spectrum information. Once those are known, the analysis can be performed by clicking on the 'Rhythm' button in the main window. The rhythm analysis contains three different panels. In every panel, the brighter a spot is, the more 'signal' has been measured. All frequency scales are based on the Bark scale explained earlier.

The top bar (A in the picture) shows the mean 'rhythm' of one measure. Horizontally one measure is visualized, while vertically 24 different frequency bands are shown. The blue colors (E in the picture) are the highest frequency bands, the red colors are the lowest frequency bands (H in the picture). In this rhythm we see a very solid 4/4 rhythm with a lot of 'body' (G), the hi-hats are spaced between the beats (E), and a monotone 1/16 bassline is present (H). Every 2 beats we have a 'crash' cymbal (the longer green trail after the E).

The middle bar (B in the picture) shows the presence of a specific frequency within every measure. From left to right we have the entire song, with every vertical slice being one measure. Vertically we again have the different frequency bands. In this particular case we see some up-sweeps over multiple measures. This picture is used to determine the following one.

The lowest bar (C) shows the probability that a specific frequency content will have changed after x measures. In this case we see that the chance that the lower frequencies change after 4 measures is very high (D). This can mean that after 4 measures the beat either stops or starts. After 7 measures (F) the higher frequencies typically change, in order to change again at the 8th measure. This is a typical break because the hi-hat is cut (measure 7) before a break (measure 8).
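One simple way to approximate such a "changed after x measures" figure is to count, per frequency band, how often the energy in measure m + x differs noticeably from the energy in measure m. The sketch below does exactly that. It only illustrates the idea: the threshold, the function name `change_probability` and the data layout are assumptions, and BpmDj's own computation may differ.

```python
def change_probability(measures, lag, threshold):
    """Per band, the fraction of measure pairs (m, m + lag) whose energy
    differs by more than `threshold`.

    measures: list of per-measure band energies, i.e. measures[m][band]
    """
    pairs = len(measures) - lag
    if lag <= 0 or pairs <= 0:
        raise ValueError("lag must be positive and shorter than the song")
    bands = len(measures[0])
    result = []
    for band in range(bands):
        changed = sum(
            1 for m in range(pairs)
            if abs(measures[m + lag][band] - measures[m][band]) > threshold
        )
        result.append(changed / pairs)
    return result


# Two bands over 8 measures: band 0 drops out after 4 measures, band 1 is steady.
measures = [[1.0, 0.5]] * 4 + [[0.0, 0.5]] * 4
after_four = change_probability(measures, lag=4, threshold=0.25)
after_one = change_probability(measures, lag=1, threshold=0.25)
```

In this toy input the first band always differs from itself four measures later, while it rarely differs from the next measure, which mirrors the kind of peak the text describes at 4 measures.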

This analysis also allows you to write out a superimposition of the music to disk. It is written out as a 16-bit little-endian stereo sample.
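For anyone who wants to read or produce data in that layout, the sketch below packs and unpacks 16-bit little-endian stereo frames with Python's `struct` module. It assumes signed samples, interleaved as left then right, and no file header; the text above does not specify these details, and the function names are hypothetical.

```python
import struct


def pack_stereo_16le(frames):
    """Pack (left, right) integer sample pairs as 16-bit little-endian PCM."""
    out = bytearray()
    for left, right in frames:
        # '<hh' = little-endian, two signed 16-bit integers (left, right).
        out += struct.pack("<hh", left, right)
    return bytes(out)


def unpack_stereo_16le(data):
    """Inverse of pack_stereo_16le: bytes back to (left, right) pairs."""
    return list(struct.iter_unpack("<hh", data))


raw = pack_stereo_16le([(100, -200), (32767, -32768)])
frames = unpack_stereo_16le(raw)
```

Each stereo frame takes 4 bytes, so the number of frames in a headerless file is its size in bytes divided by 4.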

Copyright © 2000-2011 - Werner Van Belle - werner@yellowcouch.org - http://bpmdj.yellowcouch.org/