Visualizing Musical Structure and Rhythm via Self-Similarity
Jonathan Foote and Matthew Cooper
FX Palo Alto Laboratory, Inc. 3400 Hillview Ave., Building 4 Palo Alto, CA 94304 USA
Abstract
This paper presents a novel approach to visualizing the time structure of musical waveforms. The acoustic similarity between any two instants of an audio recording is displayed in a static 2D representation, which makes structural and rhythmic characteristics visible. Unlike practically all prior work, this method characterizes self-similarity rather than specific audio attributes such as pitch or spectral features. Examples are presented for classical and popular music.
1. Introduction
There has been considerable interest in making music visible. Efforts include artistic attempts to realize images elicited by sound, of which the Walt Disney film Fantasia is perhaps the canonical example. Another approach is to quantitatively render the time and/or frequency content of the audio signal, using methods such as the oscillograph and sound spectrograph [1,2]. These are intended primarily for scientific or quantitative analysis, though artists like Mary Ellen Bute have used quantitative methods such as the cathode ray oscilloscope towards artistic ends. Other visualizations are derived from note-based or score-like representations of music, typically MIDI note events [4,5].
Music is generally self-similar. With the possible exception of a few avant-garde compositions, structure and repetition are general features of nearly all music. That is, the coda often resembles the introduction and the second chorus sounds like the first. On a shorter time scale, successive bars are often repetitive, especially in popular music.
This paper presents methods of visualizing music by its acoustic self-similarity across time, rather than by absolute acoustic characteristics. Self-similarity is visualized in a two-dimensional time representation such as Figure 1. The representation presented here is very flexible and can be used with practically any parameterization of audio. Besides audio, similar representations have been used to analyze text, video, hypertext links, and dynamical systems.
Figure 1. Self-similarity of Bach’s Prelude No. 1
2. Similarity Analysis
An audio file is visualized as a square. Time runs from left to right as well as from bottom to top. In the square, the brightness of a point (i,j) is proportional to the audio similarity at instants i and j. Similar regions are bright while dissimilar regions are dark. Thus there is always a bright diagonal line running from bottom left to top right, because audio is always most similar to itself at any particular time. Repetitive similarities, such as repeating notes or motifs, show up as checkerboard patterns: a note repeated twice will give four bright areas at the corners of a square. The two regions at the off-diagonal corners are the "crossterms" resulting from the first note’s similarity to the second. Repeated themes are visible as diagonal lines parallel to, and separated from, the main diagonal by the time difference
Figure 2. Distance matrix calculation
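The square representation described in this section can be illustrated with a minimal sketch. The details below are assumptions made for illustration only and are not taken from the text: each frame is parameterized by its Hann-windowed magnitude spectrum, similarity is measured as the cosine of the angle between feature vectors, and the helper names (frame_features, self_similarity, tone) and the synthetic three-note test signal are hypothetical.

```python
import numpy as np


def frame_features(signal, frame_len=256, hop=256):
    """Cut the signal into frames; return one magnitude spectrum per frame.

    Assumption: magnitude spectra stand in for whatever audio
    parameterization is actually used.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[k * hop:k * hop + frame_len] * window
        for k in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def self_similarity(features, eps=1e-12):
    """Return the matrix S with S[i, j] = cosine similarity of frames i and j.

    Assumption: cosine similarity is one possible similarity measure.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)  # guard against silent frames
    return unit @ unit.T


def tone(freq, n_samples, sample_rate=8000):
    """Synthesize a sine tone (hypothetical test input)."""
    t = np.arange(n_samples) / sample_rate
    return np.sin(2 * np.pi * freq * t)


# Note A, note B, then note A again: the repeat of A should produce
# bright off-diagonal blocks (the "crossterms").
signal = np.concatenate([tone(440, 2048), tone(660, 2048), tone(440, 2048)])
S = self_similarity(frame_features(signal))
```

In this sketch the main diagonal of S is 1 because every frame is identical to itself, frames 0-7 and 16-23 (the two occurrences of note A) are highly similar to each other, and both are dissimilar to frames 8-15 (note B). Displaying S with, for example, Matplotlib's `imshow(S, origin="lower", cmap="gray")` would place the time origin at the bottom left, matching the orientation described above.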