Audio Analysis

Emmanouil Benetos
Introduction
Goal: extract low- and mid-level descriptors from individual audio recordings
To be used for high-level musicological analysis
Using (mostly) existing audio analysis technologies



Low-level audio descriptors
(e.g. raw audio, spectrograms, onsets)
Mid-level audio descriptors
(e.g. pitches, chords, beats)
High-level audio descriptors
(e.g. temperament, instrumentation, chord/note patterns)
The DML Project: Audio Analysis
Audio Descriptors List
1. Spectrogram
2. MFCCs
3. Chroma
4. Onsets
5. Speech/Music Segmentation
6. Chords
7. Beats/Tempo
8. Key
9. Melody
10. Note Transcription
Raw Audio
Sample from CHARM: J.S. Bach, Chorale Prelude "Beloved Jesus", Harriet Cohen (piano), Columbia, 1935
Spectrogram
2 versions:

STFT magnitude spectrogram

Constant-Q Transform magnitude spectrogram
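As an illustration of the first version, here is a minimal sketch of an STFT magnitude spectrogram in plain Python. The frame size, hop size, Hann window and the function name `stft_magnitude` are assumptions chosen for the example; they are not the settings used in the project, and a real tool would use an FFT rather than this naive DFT.

```python
import cmath
import math


def stft_magnitude(signal, frame_size=64, hop_size=32):
    """Return a list of frames, each a list of DFT magnitudes (bins 0..frame_size/2)."""
    # Hann window (assumed here) reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive O(N^2) DFT, kept for clarity only.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames


# Toy input: a sinusoid that completes 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spectrogram = stft_magnitude(tone)
peak_bin = max(range(len(spectrogram[0])), key=lambda k: spectrogram[0][k])
```

Each row of `spectrogram` is one time frame; for the toy sinusoid the energy concentrates in the bin matching its frequency. A Constant-Q Transform differs in that its bins are spaced geometrically (a fixed number per octave) rather than linearly.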
MFCCs

Stands for: Mel-Frequency Cepstral Coefficients

Extracted using QM Vamp Plugin Set
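To unpack the name, the sketch below shows two ingredients of a typical MFCC computation: a mel-scale conversion and a DCT of log band energies. It assumes one common mel formula (the HTK-style variant) and an unnormalised DCT-II; the QM plugin's exact parameters are not stated in these slides, so the function names and constants are illustrative only.

```python
import math


def hz_to_mel(freq_hz):
    # One widely used mel formula (assumed here); other variants exist.
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def cepstral_coefficients(band_energies, n_coeffs):
    """Unnormalised DCT-II of the log band energies (last step of an MFCC pipeline)."""
    n_bands = len(band_energies)
    # Small offset avoids log(0) for empty bands.
    log_e = [math.log(e + 1e-10) for e in band_energies]
    return [
        sum(log_e[m] * math.cos(math.pi * k * (m + 0.5) / n_bands)
            for m in range(n_bands))
        for k in range(n_coeffs)
    ]


# Hypothetical mel-band energies for one frame (a flat spectrum).
flat = [1.0] * 20
coeffs = cepstral_coefficients(flat, 5)
```

With this formula 1000 Hz maps to roughly 1000 mel, and a flat set of band energies yields coefficients close to zero, since there is no spectral shape to describe.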
Chroma

Spectrum projected onto 12 bins (representing semitones of an octave)

Extracted using: QM Chromagram and NNLS Chroma Vamp plugins
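The projection onto 12 bins can be sketched as follows. This is a simplified illustration, not the QM Chromagram or NNLS Chroma algorithm: it assumes A4 = 440 Hz, assigns each spectrum bin to its nearest equal-tempered semitone, and uses the hypothetical function name `chroma_from_spectrum`.

```python
import math


def chroma_from_spectrum(magnitudes, sample_rate, n_fft, ref_a4=440.0):
    """Fold a magnitude spectrum into 12 pitch classes (0 = C, ..., 9 = A)."""
    chroma = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        freq = k * sample_rate / n_fft
        if freq < 27.5:  # skip DC and very low bins
            continue
        # MIDI note number of the nearest semitone (A4 = 69, an assumption).
        midi = round(69 + 12 * math.log2(freq / ref_a4))
        chroma[midi % 12] += mag
    return chroma


# Toy spectrum with energy only at 440 Hz and 880 Hz (both pitch class A).
sample_rate, n_fft = 8000, 800          # 10 Hz per bin
mags = [0.0] * (n_fft // 2 + 1)
mags[44] = 1.0                          # 440 Hz
mags[88] = 0.5                          # 880 Hz
chroma = chroma_from_spectrum(mags, sample_rate, n_fft)
```

Both partials land in the same pitch-class bin, which is the point of chroma: octave information is discarded and only the semitone class is kept.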
Onsets

Onset: the beginning of a musical note or another sound

Extracted using QM Onset Vamp plugin
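For intuition only, here is a very simple energy-based onset detector. It is not the method used by the QM Onset plugin; the frame size, threshold and function name `detect_onsets` are assumptions made for this sketch, and it only reacts to abrupt rises in loudness.

```python
import math


def detect_onsets(signal, sample_rate, frame_size=256, threshold=0.1):
    """Return onset times (s) where frame energy rises by more than threshold."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        energies.append(sum(x * x for x in frame) / frame_size)
    onsets = []
    for i in range(1, len(energies)):
        # A large positive jump in energy marks the start of a new sound.
        if energies[i] - energies[i - 1] > threshold:
            onsets.append(i * frame_size / sample_rate)
    return onsets


# Toy signal: silence, tone, silence, tone (two note beginnings).
sample_rate = 8000
silence = [0.0] * 2048
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2048)]
onsets = detect_onsets(silence + tone + silence + tone, sample_rate)
```

The toy signal contains two note beginnings, and the detector reports one onset time for each. Soft onsets (bowed strings, voice) generally need spectral rather than purely energy-based features.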
Speech/Music Segmentation

Useful for ethnographic recordings/radio broadcasts

Extracted using BBC Speech/Music Segmentation Vamp Plugin
Chords
Extracted using Chordino Vamp Plugin

Beats

Beat locations labelled with metrical position

Extracted using BeatRoot, Marsyas and Tempotracker Vamp plugins
Tempo

Estimated based on onset/beat information

Extracted using Tempotracker and Tempogram Vamp plugins
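One simple way to turn beat information into a tempo value is to take the median inter-beat interval. The sketch below assumes beat times in seconds, such as a beat tracker might output; the function name `tempo_from_beats` and the sample data are hypothetical, and the plugins named above use their own, more elaborate estimators.

```python
from statistics import median


def tempo_from_beats(beat_times):
    """Estimate tempo in BPM from beat times (seconds) via the median inter-beat interval."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    # The median is robust to an occasional early or late beat.
    return 60.0 / median(intervals)


beats = [0.0, 0.5, 1.0, 1.5, 2.1, 2.5]   # hypothetical beat tracker output
bpm = tempo_from_beats(beats)
```

Using the median rather than the mean keeps a single misplaced beat (the one at 2.1 s here) from shifting the estimate. A single global value like this hides tempo changes, which is why time-varying representations such as a tempogram are also useful.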
Key
Extracted using QM Key Vamp plugin (supports major/minor keys)

Melody

Or more precisely: “Sequence of fundamental frequency (F0) values
corresponding to the perceived pitch of the main melody.”

Extracted using MELODIA Vamp plugin
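To make "a sequence of F0 values" concrete, the sketch below estimates one F0 value for a single frame by picking the autocorrelation peak. This is a textbook monophonic technique swapped in for illustration, not the MELODIA algorithm, which is designed for polyphonic music; the search range and the function name `estimate_f0` are assumptions.

```python
import math


def estimate_f0(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate one F0 value (Hz) for a frame by picking the autocorrelation peak."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Toy frame: a 200 Hz sinusoid, i.e. a period of exactly 40 samples at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1024)]
f0 = estimate_f0(frame, sample_rate)
```

Applying such an estimator frame by frame gives an F0 sequence; a melody extractor additionally has to decide which of several simultaneous pitches belongs to the main melody, and when no melody is present.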
Note Transcription – Semitone Resolution

Multiple-pitch detection (onset/offset/pitch/velocity)

Extracted using Silvet Vamp plugin

Synthesized transcription example:
Note Transcription – High Pitch Resolution

Multiple-pitch detection at 20-cent resolution (five bins per semitone), useful for
tuning/temperament analysis and for analysing non-Western music

Extracted using Silvet Vamp plugin
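A 20-cent grid divides each semitone (100 cents) into five steps. The sketch below shows how a frequency could be mapped onto such a grid; it assumes A4 = 440 Hz as the reference, which may not hold for historical recordings, and the function name `hz_to_cents_bin` is hypothetical rather than part of Silvet.

```python
import math


def hz_to_cents_bin(freq_hz, resolution_cents=20, ref_a4=440.0):
    """Map a frequency to (nearest MIDI note, deviation in cents on the grid)."""
    # Pitch in cents relative to MIDI note 0, with A4 = MIDI 69 (assumed).
    cents = 6900 + 1200 * math.log2(freq_hz / ref_a4)
    # Snap to the nearest multiple of the grid resolution.
    quantised = resolution_cents * round(cents / resolution_cents)
    midi = round(quantised / 100)
    deviation = quantised - 100 * midi
    return midi, deviation


a4 = hz_to_cents_bin(440.0)        # on the equal-tempered grid
flat_a4 = hz_to_cents_bin(435.0)   # roughly 20 cents below A4
```

At semitone resolution both frequencies would be reported as the same note; the finer grid preserves the deviation, which is the information needed for tuning and temperament analysis.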
Thank you!